UUID conventions
General
All records in Sahana Eden that are to be shared with other instances or applications must have a universally unique identifier (UUID).
In Sahana Eden, these record UUIDs must be ASCII strings.
Eden automatically generates a UUID for each record according to the specifications in RFC 4122.
All record UUIDs generated by Sahana Eden are strings in URN notation:
urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a
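The notation above can be reproduced with Python's standard `uuid` module. This is a minimal sketch, not Eden's actual generator: the helper name `new_record_uuid` is hypothetical, and the use of random (version 4) UUIDs is an assumption.

```python
import uuid


def new_record_uuid() -> str:
    """Return a random RFC 4122 UUID as an ASCII string in URN notation.

    Assumption: version 4 (random) UUIDs are used; RFC 4122 also
    defines other versions.
    """
    return uuid.uuid4().urn


record_uuid = new_record_uuid()
print(record_uuid)  # a string of the form urn:uuid:xxxxxxxx-xxxx-4xxx-xxxx-xxxxxxxxxxxx
```

The result is a plain ASCII string, so it can be stored in a string field without further conversion.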
If an imported record already has a UUID, Eden retains that UUID as-is (exception: a local domain prefix, see below) without any syntax validation, i.e. no specific UUID schema is required. For consistency, however, we recommend using URNs.
Some data formats (e.g. PFIF) may require a domain prefix for a UUID, followed by a slash:
sahanafoundation.org/urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a
Where such a domain prefix is used and it matches the local domain of the importing Sahana instance, it will be removed during import.
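The prefix rule described above could be sketched as follows. The function name `strip_local_domain` and the exact, case-sensitive comparison are assumptions made for illustration; this is not Eden's import code.

```python
def strip_local_domain(uid: str, local_domain: str) -> str:
    """Remove a leading "<domain>/" prefix if it names the local domain.

    Identifiers without a prefix, or with a foreign domain prefix,
    are returned unchanged and are not validated in any way.
    """
    prefix, sep, rest = uid.partition("/")
    if sep and rest and prefix == local_domain:
        return rest
    return uid


local = "sahanafoundation.org"

# Local prefix: removed during import
print(strip_local_domain(
    "sahanafoundation.org/urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a", local))
# -> urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a

# Foreign prefix: retained as-is
print(strip_local_domain(
    "example.org/urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a", local))
# -> example.org/urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a
```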
URNs instead of UUIDs
Eden has just moved from UUIDs to URNs in order to enhance interoperability in multi-application scenarios like Haiti or Pakistan.
From experience we know that data exchange in the field can involve a variety of applications other than Eden, each implementing its own identifier scheme, as well as data sets that use officially assigned identifiers instead of application-specific IDs (e.g. PAHO IDs for health facilities in Haiti). URNs add support both for multiple identifier schemes and for common namespaces and ID schemas shared across applications (useful e.g. for geolocations or personal data).
In practice, that means:
- there should be a common namespace for Sahana applications, ideally "sahana"
- uuid="eden.sahanafoundation.org/XXXX-YYYY" would become something like uuid="urn:sahana:eden.sahanafoundation.org/XXXX-YYYY"
- Eden can support other namespaces, by making the namespace a configurable attribute of the "uuidstamp" reusable field
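The conversion and the configurable namespace might look roughly like this. It is a sketch under stated assumptions: `to_urn` and `urn_namespace` are hypothetical helpers, not part of the "uuidstamp" field, and "paho" is used only as an illustrative namespace name.

```python
from typing import Optional

DEFAULT_NAMESPACE = "sahana"  # common namespace proposed above


def to_urn(uid: str, namespace: str = DEFAULT_NAMESPACE) -> str:
    """Wrap a legacy identifier in a URN under the given namespace.

    Identifiers that already are URNs are returned unchanged.
    """
    if uid.lower().startswith("urn:"):
        return uid
    return f"urn:{namespace}:{uid}"


def urn_namespace(uid: str) -> Optional[str]:
    """Return the namespace part of a URN, or None if uid is not a URN."""
    parts = uid.split(":", 2)
    if len(parts) == 3 and parts[0].lower() == "urn":
        return parts[1]
    return None


print(to_urn("eden.sahanafoundation.org/XXXX-YYYY"))
# -> urn:sahana:eden.sahanafoundation.org/XXXX-YYYY

# Hypothetical namespace for officially assigned identifiers
print(to_urn("12345", namespace="paho"))
# -> urn:paho:12345

print(urn_namespace("urn:uuid:a46c1b3b-35c0-44c6-9142-bbe3a013039a"))
# -> uuid
```

Keeping the namespace as a parameter with a default means existing callers get "sahana" without changes, while imports from other schemes can state their namespace explicitly.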
Mapping
We need an agreed set of UUIDs for GIS data so that we can share data, such as the current Pakistan data, more easily across systems:
- OpenStreetMap IDs can change over time if records are deleted/recreated
  - They can hold additional uid/uuid fields though
- Geonames data isn't free enough for OSM: http://wiki.openstreetmap.org/wiki/Geonames
  - Geonet can be: http://wiki.openstreetmap.org/wiki/GEOnet_Names_Server
- Yahoo WoE data is in the public domain:
- Ushahidi doesn't have a common set of IDs across instances
  - No space for a UUID either?
- Longer-term we need a common central repository of UUIDs that is held for the common good.
  - Propose that Sahana start this off & then give up ownership/branding later:
    - Sahana can use UUIDs of format: http://geo.sahanafoundation.org/<ID>
    - These are associated with an OSM ID & Geonames ID for cross-correlation
    - The Source ID field shows the source, so that OSM export can filter out sources like Geonames
    - IDs from Ushahidi instances can be appended to the comments field in Sahana
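To make the proposal more concrete, here is one possible shape for such a repository entry and for the OSM export filter. Everything in this sketch is an assumption: the class `GeoRecord`, its field names, the source labels "osm" and "geonames", and the sample IDs are hypothetical and do not describe an existing Eden table.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

GEO_BASE = "http://geo.sahanafoundation.org/"  # format proposed above


@dataclass
class GeoRecord:
    geo_id: str                        # the <ID> part of the UUID
    source: str                        # where the record came from
    osm_id: Optional[str] = None       # cross-reference to OpenStreetMap
    geonames_id: Optional[str] = None  # cross-reference to Geonames
    comments: str = ""                 # e.g. IDs from Ushahidi instances

    @property
    def uuid(self) -> str:
        """UUID in the proposed http://geo.sahanafoundation.org/<ID> format."""
        return GEO_BASE + self.geo_id


def exportable_to_osm(records: Iterable[GeoRecord],
                      excluded_sources=("geonames",)) -> List[GeoRecord]:
    """Keep only records whose source may be exported to OSM."""
    return [r for r in records if r.source not in excluded_sources]


records = [
    GeoRecord("1", source="osm", osm_id="1001"),
    GeoRecord("2", source="geonames", geonames_id="2002"),
]
print([r.uuid for r in exportable_to_osm(records)])
# -> ['http://geo.sahanafoundation.org/1']
```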