139 | | Separate files per table with key references to link entries across tables. |
140 | | The keys can either be existing Eden database keys (for updating existing |
141 | | records), or scratch keys (not stored as ids in any other database, only |
142 | | used to associate dependent records for this upload, or external database |
143 | | keys (i.e. actual keys in the source database, which we might want to |
144 | | preserve for future updates.) This can easily represent any valence of |
145 | | relationship. |
| 139 | === Possible CSV formats we might receive === |
147 | | One file with separate sections, equivalent to concatenating the separate |
148 | | files above. |
| 141 | - Separate files per table with key references to link entries across tables. |
| 142 | This can easily represent any valence of relationship, and is much like a |
| 143 | spreadsheet with multiple linked sheets. |
| 144 | The keys might be: |
| 145 | - Existing Eden database keys (for updating existing records). |
| 146 | - The external source's keys (i.e. actual keys in the source database, which |
| 147 | we might want to preserve for future updates.) |
| 148 | - Scratch keys that the source includes to describe the structure |
| 149 | (i.e. not stored as keys in their database, only used to associate related |
| 150 | records for this upload). |
150 | | A single file with an outer join of all the tables. For 1-N, the data on the 1- |
151 | | side is reapeated in each row along with the separate records of the -N |
152 | | side. For M-N, either side may be replicated across multiple lines in the |
153 | | file, as needed. For a deeper hierarchy, the common records are repeated |
154 | | as needed. This is just a standard outer join. If there is a large fanout |
155 | | (1-lots of records) then could "compress' records by including one full copy |
156 | | of a record, then just its key field with non-key fields left empty. This can |
157 | | represent any valence of relationship at the expense of some extra storage. |
158 | | It has the advantage that related pieces are easy to identify, and it's not |
159 | | necessary for them to be in any specific order, except that if the above |
160 | | compression is used and some fields are required to be non-null, then |
161 | | it's simpler if the complete record is available before the partial records. |
| 152 | - One file with separate sections, equivalent to concatenating the separate |
| 153 | files above. |
163 | | A flat file with embedded structure -- that is, cells that contain records, |
164 | | or multiple items or records. A simple example is a cell that contains a list |
165 | | of strings, or a collection of key=value pairs. Or even xml... |
| 155 | - A single file with a recursive outer join of all the tables. |
| 156 | For 1-N, the data on the "1-" |
| 157 | side is repeated in each row along with the separate records of the -N |
| 158 | side. For M-N, either side may be replicated across multiple lines in the |
| 159 | file, as needed. For a deeper hierarchy, the common records are repeated |
| 160 | as needed. This is just a standard outer join, so is easy for the remote |
| 161 | source to produce if they have their data in a relational database. |
| 162 | (If there is a large fanout, i.e. |
| 163 | 1-(lots of records), then the source could "compress" records by including one full copy |
| 164 | of a record, then just its key field with non-key fields left empty. This can |
| 165 | represent any valence of relationship at the expense of some extra storage. |
| 166 | It has the advantage that related pieces are easy to identify, and it's not |
| 167 | necessary for them to be in any specific order, except that if the above |
| 168 | compression is used and some fields are required to be non-null, then |
| 169 | it's simpler if the complete record is available before the partial records. |
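The "compressed" outer-join layout described above could be expanded back into full rows on import. The following is a minimal sketch only, not Eden's importer: the column names (`org_key`, `org_name`, `org_country`, `office_name`), the helper `expand_compressed_rows`, and the convention that an empty cell means "omitted" are all assumptions made for illustration. It also assumes the full copy of each record precedes its key-only rows, which is the simpler case noted above.

```python
import csv
import io


def expand_compressed_rows(rows, key_field, parent_fields):
    """Fill in parent fields left empty on "compressed" outer-join rows.

    Assumption: the full copy of each parent record appears before any
    key-only rows that refer to it. A row counts as a full copy if at
    least one of its non-key parent fields is non-empty.
    """
    full_records = {}  # key value -> parent field values from the full copy
    for row in rows:
        key = row[key_field]
        if any(row[f] for f in parent_fields):
            # Full copy: remember its parent fields for later key-only rows.
            full_records[key] = {f: row[f] for f in parent_fields}
        elif key in full_records:
            # Key-only row: copy the parent fields from the full record.
            row = {**row, **full_records[key]}
        else:
            raise ValueError(
                f"key-only row for {key!r} precedes its full record"
            )
        yield row


# Hypothetical 1-N data: one organisation with three offices.
SAMPLE = """\
org_key,org_name,org_country,office_name
o1,Relief Org,Haiti,Port-au-Prince
o1,,,Jacmel
o1,,,Cap-Haitien
"""

expanded = list(
    expand_compressed_rows(
        csv.DictReader(io.StringIO(SAMPLE)),
        key_field="org_key",
        parent_fields=["org_name", "org_country"],
    )
)
```

Supporting key-only rows that arrive before the full copy would need a second pass (or buffering), which is why having the complete record first is simpler. Note that a record whose non-key fields are all legitimately empty cannot be told apart from a key-only row under this convention.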