Changes between Version 13 and Version 14 of BluePrint/Importer
- Timestamp: 04/07/10 13:25:36
BluePrint/Importer
v13  v14
19   19   * Methods of cleaning data automatically (or through a user-friendly interface), such as removing duplicate values that vary because of typos. For example:
20   20     * If a list of countries contained Indonesia, Spain, India, Indonesiasia, New Zealand, NZ, France, UK, Indonsia, the import may be able to identify which values were duplicates, rather than adding 2 incorrect spellings for Indonesia.
     21     * Also important for catching things like differences in spelling, punctuation or word order. (added in v14)
21   22   Ideally, users will be able to design different templates for importing different types of data. Machine learning algorithms with (multiple?) human verification could try parsing new data formats based on previously used templates.
22   23
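The duplicate detection described above could be sketched roughly as follows. This is only an illustration, not the importer's actual design: the function names, the `known` list of canonical values, the `aliases` table and the 0.8 similarity cutoff are all assumptions made for the example. It uses Python's standard `difflib` module for approximate string matching. Abbreviations such as NZ or UK are too far from the full name for string similarity to catch, so the sketch assumes a small hand-maintained alias table for them.

```python
import difflib
import string


def normalize(value):
    """Lowercase, strip punctuation and sort the words, so that case,
    punctuation and word order do not affect the comparison."""
    cleaned = value.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(cleaned.split()))


def group_duplicates(values, known, aliases=None, cutoff=0.8):
    """Map each imported value onto the closest entry in `known`.

    `aliases` (hypothetical) maps abbreviations to canonical names.
    `cutoff` (assumed 0.8) is the minimum difflib similarity ratio.
    Returns (grouped, unmatched); unmatched values need human review.
    """
    aliases = aliases or {}
    by_norm = {normalize(name): name for name in known}
    grouped, unmatched = {}, []
    for raw in values:
        candidate = normalize(aliases.get(raw, raw))
        match = difflib.get_close_matches(candidate, list(by_norm), n=1, cutoff=cutoff)
        if match:
            grouped.setdefault(by_norm[match[0]], []).append(raw)
        else:
            unmatched.append(raw)
    return grouped, unmatched


# The country list from the example above.
values = ["Indonesia", "Spain", "India", "Indonesiasia", "New Zealand",
          "NZ", "France", "UK", "Indonsia"]
known = ["Indonesia", "Spain", "India", "New Zealand", "France", "United Kingdom"]
aliases = {"NZ": "New Zealand", "UK": "United Kingdom"}

grouped, unmatched = group_duplicates(values, known, aliases)
for canonical, variants in grouped.items():
    print(canonical, "<-", variants)
```

With this cutoff, both misspellings are grouped under Indonesia while India stays separate. A looser cutoff would start merging distinct values (Indonsia and India are fairly similar strings), which is one reason a human verification step, as suggested above, would still be useful.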