|Version 83 (modified by 13 years ago) ( diff ),|
Table of Contents
Blueprint: VITA Person Entity Model
Version 2.0 - under construction
VITA is the proposed application design for storage, manipulation and communication of person-related information in SahanaPy. It defines a set of data structures, interfaces and interoperability features to be implemented.
- an extensible data model for tagging, tracking and tracing of person entities
- a set of interfaces to request and manipulate person entity data
- an XML-based data format to communicate person entity data
VITA defines four axes (aspects) for the construction of "Personal Data":
- Entity - the entity type the data set relates to
- Distinction - assigned features for operational distinction
- Finding - features to be communicated by the data set
- Evidence - to describe the quality of the data
Any compliant set of personal data implements at least one feature for each axis.
The entity axis determines the entity type to which the data set relates (=ownership), which can be types of:
- Personal Effects
The distinction axis describes any features which are used to distinguish particular person entities.
It is important to understand that distinction features in the VITA context are always assigned features, i.e. not naturally given - they cannot be observed but have to be communicated. The most commonly used distinction feature for person entities are names.
Distinction features may be ambiguous, even within the same domain (e.g. database). However, it is desirable to assign at least one unique feature in order to identify person entities. Such unique features are called Identity (ID).
Identities don't have to be unique across multiple domains. In case they are not, any communications of identities have to enclose a reference to the domain which administrates the particular identity (administrative domain).
Findings are all other features of the person entity which are to be communicated in this data set. These features can be assigned, observed or derived.
In the VITA context, photographs or other images are findings, not evidence.
Evidence is any suitable means to describe the quality of data of the findings.
This always includes at least information on origin and time of the particular finding, but can also include references and links to sources, or information on methods of determination.
The person has red hair.
is incomplete - it contains the "Entity" (=the person) and a "Finding" (=has red hair), but no operational distinction and no evidence information.
Missing person, given name: Michael, family name: M., eye color: blue
is also incomplete - it contains the "Entity" (=person), "Distinction" (=given name and family name) and two "Findings" (=is missing, eye color), but no evidence information.
Note, that a "missing person" is not a person entity in the VITA context, but an entity (=person) plus a finding (=is missing). Implementations must handle this explicitly and consistently - e.g., a "missing person" as entity can never have the status "found", because this is a contradiction.
<pfif:person> <pfif:person_record_id>salesforce.com/a0030000001TRYR</pfif:person_record_id> <pfif:entry_date>2005-09-03T09:21:12.321Z</pfif:entry_date> <pfif:author_name>Bill Mandil</pfif:author_name> <pfif:author_email>firstname.lastname@example.org</pfif:author_email> <pfif:source_name>salesforce.com</pfif:source_name> <pfif:source_date>2005-09-03T09:21:12Z</pfif:source_date> <pfif:source_url>http://www.salesforce.com/person/a0030000001TRYR</pfif:source_url> <pfif:first_name>Katherine</pfif:first_name> <pfif:last_name>Doe</pfif:last_name> <pfif:sex>female</pfif:sex> <pfif:date_of_birth>1971-02</pfif:date_of_birth> <pfif:age>30-45</pfif:age> <pfif:photo_url>http://flickr.com/photo/12345678.jpg</pfif:photo_url> </pfif:person>
is a compliant set - it contains the "Entity" (pfif:person), "Distinction" (pfif:person_record_id, pfif:first_name, pfif:last_name), several "Findings" (pfif:sex, pfif:date_of_birth, pfif:age, pfif:photo_url) and plenty of "Evidence" (pfif:author_name, pfif:source_name, etc.).
All data sets describing one and the same instance of a person entity form a file.
For disambiguation, files are sometimes also called cases.
All data in a file must be accessible from a central point within the application (the "repository"). Especially the following must be implemented:
- status of the file as a whole
- opening/closing the file for write access
- permanent removal of the file as a whole from the repository
- a directory of all available record types in that file
- export of a file as a whole
- import of a file as a whole
- merging of files
File Level Auditing
VITA implementations must be able to log all successful attempts to access and/or manipulate data in a file, even if the application does not require auditing. All logs, regardless of their actual granularity, must be available as a function of the file.
Data which will naturally change over time (e.g. the location of a person) must additionally be version-tracked in such way that they can be reconstructed for any point or interval of time.
Data which only change upon administrative measures (e.g. names) do not need to be version-tracked.
The Data Model
Some (atomic) features in personal data sets are not applicable to all instances of the set (e.g. "Occupation" not applicable for infants), thus undefined values can be ambiguous here. For disambiguation, VITA implementations handle undefined values always as "not available", whereas "not applicable" must be defined explicitly and must not be the default value.
Any person entity can be assigned to roles, which qualify the entity for inclusion into certain processes. One and the same instance of the entity can be assigned to multiple roles at the same time.
Within a role, the entity instance can be assigned to a status, where the status values must be discrete, unambiguous and consistent. There can only be one status per instance and role at a time. VITA implementations should keep a tally of all status transitions of an entity instance in such way that it is possible to reconstruct the role and status of the entity instance for any point or interval of time.
Presence means the fact, that an entity is physically there at a particular location at a particular time. This abstract construction covers even cases of explicit absence, i.e. when the entity is not present at that location at that time.
VITA applications record a presence log for each person entity. Each log entry is established through a recognition event, i.e. when the entity is recognized by a dedicated observer (human or automatic).
Recognition might happen implicitly, e.g. when other data are obtained directly from the entity. In these cases, the time and location of the observer are used for the record. Recognition can also happen explicitly, i.e. in response to observation requests. In such cases, the observation time and location determined by the request are used.
Locations in presence records represent geospatial features, which can basically be of any valid feature class (points, polygons, lines etc.). Where references to locations are used, they must be universally unique and resolvable; otherwise the referred geospatial information have to be enclosed in the database and/or communication.
In contrast to that, time in this regard means one exact time point, to be stored as coordinated universal time (UTC, Gregorian date including century) with a minimum precision of one second. References or relative times must not be used.
Check-In arriving at this location for storage/accommodation Check-Out released from this location after storage/accommodation Reconfirmation re-confirmation of storage/accommodation at this location (on request) Found only temporarily at this location, accidentally found Procedure temporarily at this location for a procedure Transit temporarily at this location between two transfers (specify origin and destination) Transfer released from this location to be transferred to another (specify destination) Missing no longer present at this location, but missing Lost no longer at this location, but destroyed/disposed/deceased here
Names are used to identify persons in human communication. However, names of persons do not have to be unique nor do they have to be exact, and thus are no reliable identity features. The following name fields are mandatory to exist in person records:
- first_name the first names (or the only name) of the person, in romanized script
- middle_name the person's middle name (if customary), in romanized script
- last_name the last name (mostly the family name) of the person, in romanized script
- local_name the full name of the person, in a local language and script
- Any of these mandatory name fields may be empty in a particular record except first_name.
- A name field value starting with a question mark ? indicates that this part of the name is uncertain or unknown.
- 'first', 'middle' and 'last' refer rather to the usual writing order of a person's full name than to the meaning of the name parts.
- There is no need to split the full name into segments if that is not customary in the person's country of origin - in such case first_name should represent the person's full name.
- In case there are multiple local languages, the local_name should be in the language/script that is most likely readable by that person and/or their relatives.
- An application may define additional name fields to represent e.g. titles, nicknames or pseudonyms, however, these additional fields must not be used to repeat or replace any of the mandatory fields.
- SahanaPy Person Management (in progress)