UpdateDetection

Timestamp:: 09/13/12 10:20:34 (13 years ago)
Author:: Dominic König
Comment:: --

UserGuidelines/Import/UpdateDetection

-              v11
+              v12
 Person records are primarily identified by:
-  - an exact match of first name and last name (if both are present in the import item), or
-  - alternatively, an exact match of the initials (if present in the import item)
+  - an exact match of '''first name''' and '''last name''' (if both are present in the import item), or
+  - alternatively, an exact match of the '''initials''' (if present in the import item)
 If any matching records can be found, they will be ranked by:
 …
 These criteria are weighted by a schema to satisfy a wide range of cases:
-  - first name: match +2, mismatch -2, missing from either record 0 points
-  - last name: match +2, mismatch -2, missing from either record 0 points
-  - date of birth: match +3, mismatch -2, missing from either record 0 points
-  - email address: match +2, mismatch -5, missing from import item -2 if initials present or -3 if no email in the database or otherwise -4 points, missing from the database 0 points
-  - initials: match +4, mismatch -1, missing from either record 0 points
-  - mobile phone number: match +1, mismatch -1, missing from either record 0 points
+  - '''first name''': match +2, mismatch -2, missing from either record 0 points
+  - '''last name''': match +2, mismatch -2, missing from either record 0 points
+  - '''date of birth''': match +3, mismatch -2, missing from either record 0 points
+  - '''email address''': match +2, mismatch -5, missing from import item -2 if initials present or -3 if no email in the database or otherwise -4 points, missing from the database 0 points
+  - '''initials''': match +4, mismatch -1, missing from either record 0 points
+  - '''mobile phone number''': match +1, mismatch -1, missing from either record 0 points
  '''DEVELOPERS note:''' ''The exact schema needed for a deployment depends on the typical quality of the import data, which may vary. The more consistent and detailed the import items are, the more reliably the schema works. It is possible (and may be necessary) to adjust these weights to particular situations, using a set of unit test cases like those in {{{!PersonDeduplicateTests}}} in {{{modules/unit_tests/eden/pr.py}}}. However, this schema should not be expected to reliably detect every possible edge case. Given its purpose, it is much more important to maintain a manageable set of rules for how data sources must indicate updates, and to adapt the data sources to those rules.''
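
The weighting schema above can be sketched as a small scoring function. This is only an illustration of the point values listed on this page, not the actual Eden deduplication code: the function names, the dictionary keys and the use of {{{None}}} for a missing value are all assumptions made for the example. It also assumes plain equality for comparing values, and that for an email address missing from the import item the three cases are checked in the order they are listed (initials present first, then no email in the database, then everything else).

```python
from typing import Mapping, Optional

# (match, mismatch) points for the simple criteria, as listed above.
# The keys are hypothetical field names chosen for this sketch.
SIMPLE_WEIGHTS = {
    "first_name": (2, -2),
    "last_name": (2, -2),
    "date_of_birth": (3, -2),
    "initials": (4, -1),
    "mobile_phone": (1, -1),
}


def email_points(item_email: Optional[str],
                 record_email: Optional[str],
                 item_has_initials: bool) -> int:
    """Points for the email criterion (None means 'missing')."""
    if item_email is None:
        # Missing from the import item: assumed order of checks.
        if item_has_initials:
            return -2
        if record_email is None:
            return -3
        return -4
    if record_email is None:
        # Present in the import item, missing from the database.
        return 0
    return 2 if item_email == record_email else -5


def rank_score(item: Mapping[str, Optional[str]],
               record: Mapping[str, Optional[str]]) -> int:
    """Sum the points of all criteria for one candidate record."""
    score = 0
    for field, (match, mismatch) in SIMPLE_WEIGHTS.items():
        a, b = item.get(field), record.get(field)
        if a is None or b is None:
            continue  # missing from either record: 0 points
        score += match if a == b else mismatch
    score += email_points(item.get("email"),
                          record.get("email"),
                          item.get("initials") is not None)
    return score
```

Candidate records would then be ranked by this score, highest first. Test cases in the style of the unit tests mentioned in the note could call {{{rank_score}}} with representative import items and check that the intended record ranks on top after a weight is changed.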