Changes between Version 6 and Version 7 of UserGuidelines/Import/UpdateDetection


Timestamp:
09/13/12 10:07:56
Author:
Dominic König
Comment:

--

  • UserGuidelines/Import/UpdateDetection

    v6 v7  
    3434If any matching records can be found, they will be ranked by:
    3535
    36   - an exact match of the first name (+2/-2 points)
    37   - an exact match of the last name (+2/-2 points)
    38   - an exact match of the date of birth (+2/-2 points)
     36  - an exact match of the first name
     37  - an exact match of the last name
     38  - an exact match of the date of birth
     39  - an exact match of the email address
     40  - an exact match of the mobile phone number
     41  - an exact match of the initials
    3942
    40   - an exact match of the email address (+1/-1 point)
    41   - an exact match of the mobile phone number (+1/-1 point)
    42   - an exact match of the initials (+1/-1 point)
     43(all case-insensitive)
    4344
    44 If any of the criteria are missing from either the existing record or the import item, the test will not be performed (0 points).
     45These criteria are weighted by a schema to satisfy a wide range of cases:
    4546
    46 If the sum of the points is less than 0, the record will not be regarded as a match.
     47  - first name: match +2, mismatch -2, missing from either record 0 points
     48  - last name: match +2, mismatch -2, missing from either record 0 points
     49  - date of birth: match +3, mismatch -2, missing from either record 0 points
     50  - email address: match +2, mismatch -5; missing from the import item: -2 if initials are present, else -3 if the database record has no email either, else -4 points; present in the import item but missing from the database: 0 points
     51  - initials: match +4, mismatch -1, missing from either record 0 points
     52  - mobile phone number: match +1, mismatch -1, missing from either record 0 points
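
The weighting schema above can be sketched in Python as follows. This is a minimal illustration, not the actual Eden implementation: the function names (`rank`, `email_points`, `score`) and the dictionary keys are hypothetical, all values are assumed to be strings, and "initials present" is assumed to refer to the import item.

```python
def rank(a, b, match, mismatch):
    """Score one criterion; 0 points if missing from either record."""
    if a and b:
        # All comparisons are case-insensitive
        return match if a.lower() == b.lower() else mismatch
    return 0


def email_points(item_email, db_email, item_initials):
    """Email rule, including the penalties for a missing import email."""
    if item_email and db_email:
        return 2 if item_email.lower() == db_email.lower() else -5
    if item_email:
        return 0  # email missing from the database only
    # Email missing from the import item
    if item_initials:  # assumption: initials of the import item
        return -2
    return -3 if not db_email else -4


def score(item, record):
    """Total points for one import item against one database record."""
    total = rank(item.get("first_name"), record.get("first_name"), 2, -2)
    total += rank(item.get("last_name"), record.get("last_name"), 2, -2)
    total += rank(item.get("date_of_birth"), record.get("date_of_birth"), 3, -2)
    total += email_points(item.get("email"), record.get("email"),
                          item.get("initials"))
    total += rank(item.get("initials"), record.get("initials"), 4, -1)
    total += rank(item.get("mobile"), record.get("mobile"), 1, -1)
    return total


# Same names, email only in the database, matching date of birth
record = {"first_name": "John", "last_name": "Doe",
          "date_of_birth": "1980-01-01", "email": "john@example.com"}
item = {"first_name": "john", "last_name": "DOE",
        "date_of_birth": "1980-01-01"}
print(score(item, record))  # 2 + 2 + 3 - 4 = 3
```

A record counts as a match only if its total is greater than 0, so in this sketch the same pair without a date of birth would score 0 and be rejected.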
    4753
    48 The highest ranking match will be used to identify the record to update.
     54  '''DEVELOPERS note:''' the exact schema needed for a deployment depends on the typical quality of the import data, which may vary. The more consistent and detailed the import items are, the more reliably the schema works. It is possible (and may be necessary) to adjust these weights for particular situations, using a set of unit test cases such as those in PersonDeduplicateTests in modules/unit_tests/eden/pr.py.
     55
     56Examples:
     57
     58  Match (=total points > 0):
     59  - same first name and last name, same email in both records (6 points)
     60  - same first name, last name and email address, different initials (5 points)
     61  - same first name and last name, no email in the database, but email in the import item (4 points)
     62  - same first name and last name, email in the database, no email in import item, matching date of birth (3 points)
     63  - same first name and last name, different email addresses, matching DOB (2 points)
     64  - same first name and last name, no email in either record (1 point)
     65
     66  Mismatch:
     67  - same first name and last name, email in the database, but no email in import item
     68  - same first name and last name, different email addresses, no further data
     69
     70The highest ranking match will be used to identify the record to update. Out of multiple matches with the same rank, the oldest record will be used.
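
The selection rule can be sketched in Python as follows. This is a hedged illustration only: `best_match` and the `(record_id, points, created_on)` tuple layout are hypothetical, and "oldest" is assumed to mean the earliest creation timestamp.

```python
from datetime import datetime


def best_match(candidates):
    """Pick the record to update from (record_id, points, created_on) tuples.

    Only candidates with more than 0 points count as matches. The highest
    score wins; among equal scores, the oldest record is chosen.
    """
    matches = [c for c in candidates if c[1] > 0]
    if not matches:
        return None  # no match: the import item would create a new record
    # Sort key: highest points first, then earliest creation time
    return min(matches, key=lambda c: (-c[1], c[2]))[0]


candidates = [
    (1, 4, datetime(2012, 1, 1)),
    (2, 6, datetime(2012, 5, 1)),
    (3, 6, datetime(2011, 3, 1)),  # same rank as record 2, but older
    (4, 0, datetime(2010, 1, 1)),  # 0 points: not a match
]
print(best_match(candidates))  # 3
```

Using a single composite sort key keeps the tie-break explicit and avoids a second pass over the candidates.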
    4971   - ''tbw''