Changes between Version 30 and Version 31 of BluePrint/Importer


Timestamp:
01/22/11 12:08:21
Author:
Pat Tressel
Comment:

--

  • BluePrint/Importer

    v30 v31  
    124124==== Data structure use cases: ====
    125125
    126 (This is only about the source schema, not the CSV representation.)
     126(This is only about the source schema, not the CSV representation.) In general, a
     127normalized relational schema is a directed acyclic graph with a possible exception
     128for self-cycles (a reference from a table to itself). Collections of records and
     129their key references can also form a DAG. (There should not be cycles in the
     130key references even if there are self-references within one table -- it is always
     131possible to avoid cycles among records by using relationship tables that
     132have outlinks to all the participants.)
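
As a rough illustration of the DAG claim above, the sketch below treats a schema as a mapping from each table to the tables it references, drops self-references, and uses Python's standard `graphlib` module to confirm that what remains is acyclic. The table names and the `load_order` helper are hypothetical and are not taken from any actual schema; this is one possible way to check the property, not a description of how the importer works.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical schema: table -> set of tables it references via foreign keys.
schema = {
    "person": {"person", "organisation"},  # self-reference plus one outlink
    "organisation": set(),
    "membership": {"person", "group"},     # relationship table with outlinks
    "group": set(),
}

def load_order(refs):
    """Return tables ordered so that referenced tables come first.

    Self-references are ignored; any other cycle raises CycleError.
    """
    without_self = {t: {r for r in deps if r != t} for t, deps in refs.items()}
    return list(TopologicalSorter(without_self).static_order())

order = load_order(schema)
```

An ordering like this would also give a workable sequence for loading tables, since every referenced table appears before the tables that point to it.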
    127133
    128134- A flat table -- one resource with no components or foreign key references.
     
    150156  files above.
    151157
    152 - A single file with a recursive outer join of all the tables.
    153   For 1-N, the data on the "1-"
     158- A single file with a recursive outer join of all the tables -- that is,
     159  a "flattened" representation of the tables.  For 1-N, the data on the "1-"
    154160  side is repeated in each row along with the separate records of the -N
    155161  side.  For M-N, either side may be replicated across multiple lines in the
     
    172178- Any combination of the above.
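
To make the "flattened" single-file case concrete, here is a minimal sketch of a one-level left outer join for a 1-N relationship, in which the data on the "1-" side is repeated on every row for the -N side. The record contents, field names, and the `flatten` helper are assumptions made up for illustration; a real recursive join would repeat this step for each level of nesting.

```python
# Hypothetical 1-N data: each person may have several contacts.
persons = [
    {"person_id": 1, "name": "Ann"},
    {"person_id": 2, "name": "Bo"},
]
contacts = [
    {"person_id": 1, "contact": "ann@example.org"},
    {"person_id": 1, "contact": "555-0100"},
]

def flatten(parents, children, key, child_field):
    """Left outer join: one row per matching child, else one row with None."""
    rows = []
    for parent in parents:
        matches = [c for c in children if c[key] == parent[key]]
        if not matches:
            # Keep parents that have no children, as an outer join does.
            rows.append({**parent, child_field: None})
        for child in matches:
            # The parent's fields are repeated on every child row.
            rows.append({**parent, child_field: child[child_field]})
    return rows

flat = flatten(persons, contacts, "person_id", "contact")
```

Here the first person appears on two rows, once per contact, while the second person appears once with an empty contact field.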
    173179
    174 === Schema mapping ===
     180=== Specifying the schema mapping ===
     181
     182- If the data uses our formatting, we don't need a schema mapping -- we just need
     183  to be told it's our formatting.
    175184
    176185- If the source has a schema that does not match ours, a means of mapping from the
    177   source's schema to ours will be needed, or will have to be inferred.
     186  source's schema to ours will be needed (or will have to be inferred).
    178187  (It is likely, for an existing major source, that we would write the schema mapping.
    179188  For such a source, if we were receiving updates from them regularly, we would want
     
    181190  on regularly, there may be better means of pulling data than CSV files...)
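
One simple form a schema mapping could take is a lookup from the source's column names to ours, applied to each row as it is read. The sketch below assumes exactly that; the column names, the `mapping` dictionary, and the `apply_mapping` helper are hypothetical, and a real mapping would likely also need to cover type conversion and foreign key references.

```python
import csv
import io

# Hypothetical mapping from a source's column names to our field names.
mapping = {"Full Name": "name", "Org": "organisation"}

def apply_mapping(csv_text, column_map):
    """Rename mapped columns in each row; unmapped columns are dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {target: row[source] for source, target in column_map.items()}
        for row in reader
    ]

source_csv = "Full Name,Org,Notes\nAnn,Red Cross,n/a\n"
rows = apply_mapping(source_csv, mapping)
```

Data that already uses our formatting would correspond to an identity mapping, which is why that case needs no mapping at all.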
    182191
    183 - If the data uses our formatting, we don't need a schema mapping -- we just need
    184   to be told it's our formatting.
    185192
    186193