Changes between Version 13 and Version 14 of BluePrint/Importer


Timestamp:
04/07/10 13:25:36 (12 years ago)
Author:
Michael Howden
Comment:

--

  • BluePrint/Importer

    v13 v14  
    19 19  * Methods of automatically (or with a user-friendly interface) cleaning data (removing duplicate values with variations due to typos), for example:
    20 20   * If a list of countries contained Indonesia, Spain, India, Indonesiasia, New Zealand, NZ, France, UK, Indonsia, the import may be able to identify which entries were duplicates, rather than adding 2 incorrect spellings of Indonesia.
       21   * Also important for catching things like different spellings, punctuation, or word order.
    21 22  Ideally, users will be able to design different templates for importing different types of data. Machine learning algorithms with (multiple?) human verification could try parsing new data formats based on previously used templates.
    22 23
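
The country-list example could be approached roughly as follows. This is only a minimal sketch, not a proposed design: it uses Python's standard difflib module for approximate string matching, and the function name group_near_duplicates, the aliases table (for abbreviations such as NZ), the 0.8 similarity cutoff, and the rule that the first-seen spelling is treated as canonical are all assumptions made for illustration.

```python
from difflib import get_close_matches


def group_near_duplicates(values, aliases=None, cutoff=0.8):
    """Group imported values under the first-seen spelling they resemble.

    Hypothetical helper: `aliases` maps known abbreviations to full names,
    and `cutoff` is an assumed similarity threshold between 0 and 1.
    """
    aliases = {k.lower(): v for k, v in (aliases or {}).items()}
    canonical = {}  # lowercased canonical spelling -> display spelling
    groups = {}     # display spelling -> raw values folded into it

    for raw in values:
        value = " ".join(raw.split())              # collapse stray whitespace
        value = aliases.get(value.lower(), value)  # expand known abbreviations
        key = value.lower()                        # compare case-insensitively
        match = get_close_matches(key, list(canonical), n=1, cutoff=cutoff)
        if match:
            # Close enough to an existing entry: record it as a variant.
            groups[canonical[match[0]]].append(raw)
        else:
            # Nothing similar seen yet: treat this spelling as canonical.
            canonical[key] = value
            groups[value] = [raw]
    return groups


if __name__ == "__main__":
    countries = ["Indonesia", "Spain", "India", "Indonesiasia", "New Zealand",
                 "NZ", "France", "UK", "Indonsia"]
    for name, variants in group_near_duplicates(
            countries, aliases={"NZ": "New Zealand"}).items():
        print(name, variants)
```

With these assumptions, Indonesiasia and Indonsia are grouped under Indonesia while India stays separate, and NZ is folded into New Zealand only because of the alias table, since string similarity alone does not link an abbreviation to its full name. The groups are returned rather than merged automatically so that a user could review them before the import is committed.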