Version 15 (modified by Dominic König, 14 years ago)

S3XML

S3XML is a generic RESTful data exchange interface for the S3 framework.

It comes with a native XML data format, but also provides built-in data format conversion and transformation to support a variety of custom XML, JSON and CSV formats and schemas.

Minimum Requirements for Implementation

Clients

Interfaces which want to exchange data with S3XML interfaces must implement the following:

an HTTP client which can perform GET and POST requests
the native S3XML data format

Note:

Where the target interface has built-in support for data format conversion/transformation (as in S3), it is sufficient if the client implements an S3XML-compatible data format (XML, JSON or CSV).
S3 comes with a number of built-in transformation stylesheets for some standard data formats. Where other formats are to be used, clients can also provide their own XSLT transformation stylesheets.

Servers

Interfaces which want to provide S3XML server capabilities (e.g. for Synchronization) must implement the following:

an HTTP server interface accepting and performing GET, PUT and POST requests
the RESTful API as described in this document
the native S3XML data format

Optionally they can provide:

JSON/CSV to S3XML conversion
S3XML to JSON/CSV conversion
XSLT-1.0 transformation

Conventions

Name Space

Where a name space identifier for the native S3XML format is to be used, it shall be:

"http://eden.sahanafoundation.org/S3XML"

In the current implementation of S3, no name space identifier shall be used. This is, however, subject to change in future versions.

Character Encoding

XML documents to be used for S3XML can specify their character encoding in the XML header.

Where JSON or CSV formats are used, they are expected to be UTF-8 encoded. S3XML interfaces can support other encodings for JSON/CSV, but this is not a requirement.

All exported data are always UTF-8 encoded.

URL format

Data format extensions in URLs must be all-lowercase. Where uppercase characters are used, they are converted into lowercase.
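
As a minimal sketch of this rule, an interface could normalize the format extension of a request URL before dispatching on it. The function name and the example URL with a .XML extension are hypothetical and only serve as illustration; they are not taken from the S3 code.

```python
import posixpath
from urllib.parse import urlsplit


def format_extension(url):
    """Return the lowercased data format extension of a URL, or None.

    Hypothetical helper: uppercase characters in the extension are
    converted into lowercase, as described above.
    """
    path = urlsplit(url).path
    _, ext = posixpath.splitext(path)
    return ext[1:].lower() if ext else None


# Example URL is an assumption made for this sketch
print(format_extension("http://127.0.0.1:8000/eden/pr/person.XML"))
```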

Interface

XML Format

Types of Sources

S3XML provides 3 types of sources:

Schema

Schema documents describe the data schema for a resource. Clients can use these documents e.g. for automatic generation of forms.

Note:

In the current S3 implementation, these documents can only be requested. Future versions may also accept submissions of such documents to update the database schema.

Field Options

Field options documents describe the currently acceptable options for fields in a resource. Clients can use these documents e.g. for automatic generation and/or client-side validation of forms.

Note:

In the current S3 implementation, transformation of field option documents is not supported. JSON conversion is possible, though.
Field option documents can only be requested (GET). Future implementations may also accept submissions of such documents to update the data schema.

Data

Data documents provide the contents (data) of resources.

Element Descriptions

S3XML defines 4 element types:

s3xml

Parent elements	none (root element)
Child elements	resource
Contents	empty

The root element.

Attributes:

Name	Type	Description	mandatory?
domain	string	the domain name of the data repository	no

resource

Parent elements	s3xml, resource, reference
Child elements	resource, data, reference
Contents	empty

Represents a record.

Attributes:

Name	Type	Description	mandatory?
name	string	the name of the resource, usually the DB table name	yes
uuid	string	a unique identifier for the record	no*
tuid	string	a temporary unique identifier for the record	no*

(*) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.

The uuid will be stored in the database together with the record. If uuid is present and matches an existing record in the database, then this record will be updated. If there's no match or no uuid specified in the resource element, then the importer will create a new record in the database (and automatically generate a uuid if required).
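
The update-or-create rule above can be sketched as follows. This is an illustration under stated assumptions, not the S3 importer: the function name is hypothetical, and a plain dict mapping uuid to record stands in for the database table.

```python
import uuid
import xml.etree.ElementTree as ET


def import_resource(element, table):
    """Update the record matching the element's uuid, else create one.

    `table` is a dict {uuid: {field: text}} standing in for a DB table
    (an assumption of this sketch).
    """
    fields = {d.get("field"): d.text for d in element.findall("data")}
    record_uuid = element.get("uuid")
    if record_uuid and record_uuid in table:
        table[record_uuid].update(fields)   # uuid matches: update record
        return record_uuid, "updated"
    if not record_uuid:
        record_uuid = str(uuid.uuid4())     # no uuid given: generate one
    table[record_uuid] = fields             # no match: create new record
    return record_uuid, "created"


table = {"6e6e76dc-8ed7-408c-bb09-54476e3944ae": {"first_name": "Dominic"}}
doc = ET.fromstring(
    '<s3xml>'
    '<resource name="pr_person" uuid="6e6e76dc-8ed7-408c-bb09-54476e3944ae">'
    '<data field="occupation">Nurse</data></resource>'
    '<resource name="pr_person"><data field="first_name">Test</data></resource>'
    '</s3xml>'
)
for resource in doc.findall("resource"):
    print(import_resource(resource, table)[1])
```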

data

Parent elements	resource
Child elements	none (leaf element)
Contents	Text

Represents the value of a single field in the record.

Attributes:

Name	Type	Description	mandatory?
field	string	the field name in the record	yes
value	JSON value	the native field value	no
url	URL	the URL to download the contents from*	no
filename	filename	the filename of the attached contents*	no

The text node in the data element provides a human-readable representation of the field value. If this representation is different from the original value in the database, then the original value must be provided by the value attribute.

(*) If the field is a file upload field, a url attribute should be provided to specify the location of the file. The importer will try to download the file from that URL and store it (pull). It is also possible to send the file with the HTTP request, in which case the filename attribute must be specified instead of url (push). The push variant is meant for peers which do not support pulling for some reason (e.g. mobile phones). Regular servers should always provide a URL for download, so that the consuming site can decide which files to download and when (this saves bandwidth).
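
To illustrate how the value attribute relates to the element text, the following sketch reads the native value of a data element. The function name is hypothetical, and falling back to the raw string when the attribute is not valid JSON is an assumption of this sketch, not documented S3 behaviour.

```python
import json
import xml.etree.ElementTree as ET


def native_value(data_element):
    """Return the native value of a <data> element.

    The value attribute (a JSON value) takes precedence over the text
    node; without it, the text node is taken as the original value.
    """
    raw = data_element.get("value")
    if raw is None:
        return data_element.text
    try:
        return json.loads(raw)
    except ValueError:
        # Assumption: keep the raw string if it is not valid JSON
        return raw


gender = ET.fromstring('<data field="opt_pr_gender" value="3">male</data>')
name = ET.fromstring('<data field="first_name">Dominic</data>')
print(native_value(gender), native_value(name))
```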

reference

Parent elements	resource
Child elements	resource
Contents	Text

Represents a foreign key reference.

Attributes:

Name	Type	Description	mandatory?
field	string	the field name in the record	yes
resource	string	the name of the referenced resource, usually the tablename	yes
uuid	string	the unique identifier of the referenced record (foreign key)*	(yes)**
tuid	string	a temporary identifier for a referenced record (foreign key)*	(yes)**

(*) Referenced records are always exported in the same output file. If a referenced record is found in the same input file, then it will be automatically imported.

(**) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.

If the referenced record is enclosed in the reference element, then uuid and tuid can be omitted:

<s3xml>
   <resource name="xxxyyy">
       <reference field="xy" resource="aaabbb">   <!-- the reference element, uuid/tuid can be omitted if -->
          <resource name="aaabbb">                <!-- the referenced record is enclosed in the reference -->
          </resource>
       </reference>
   </resource>
</s3xml>
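
The lookup rules for references can be sketched as follows: an enclosed record is used directly, otherwise the referenced record is searched in the same document by uuid or, if no uuid is specified, by tuid. The function name is hypothetical and the code only illustrates the rules stated above.

```python
import xml.etree.ElementTree as ET


def resolve_reference(reference, root):
    """Find the record a <reference> points to within the same document."""
    enclosed = reference.find("resource")
    if enclosed is not None:
        return enclosed                 # enclosed record: no uuid/tuid needed
    # Identify by uuid, or by tuid if no uuid is specified
    key = "uuid" if reference.get("uuid") else "tuid"
    ident = reference.get(key)
    if ident is None:
        return None
    for candidate in root.iter("resource"):
        if (candidate.get("name") == reference.get("resource")
                and candidate.get(key) == ident):
            return candidate
    return None                         # not contained in this document


doc = ET.fromstring(
    '<s3xml>'
    '<resource name="aaabbb" tuid="t1"/>'
    '<resource name="xxxyyy">'
    '<reference field="xy" resource="aaabbb" tuid="t1"/>'
    '</resource>'
    '</s3xml>'
)
target = resolve_reference(doc.find("resource/reference"), doc)
print(target.get("name"), target.get("tuid"))
```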

JSON Format

CSV Format

Examples

XML Format

<s3xml>

  <!-- A record in the database. Attributes:
       created_on / modified_on = date/time when the record was created / last modified,
       uuid = UUID of the record (if present in DB),
       created_by = author, modified_by = last author,
       name = resource name -->
  <resource
      created_on="2009-10-02 08:55:11"
      modified_on="2009-10-02 08:56:03"
      uuid="6e6e76dc-8ed7-408c-bb09-54476e3944ae"
      created_by="None"
      modified_by="Dominic"
      name="pr_person">

    <!-- Reference field (foreign key) in the record:
         field = field name, resource = name of the referenced resource,
         uuid = UUID of the referenced entry -->
    <reference
      field="pr_pe_id"
      resource="pr_pentity"
      uuid="6e6e76dc-8ed7-408c-bb09-54476e3944ae"/>

    <!-- A field in the record -->
    <data field="pr_pe_label">730421</data>
    <data field="first_name">Dominic</data>
    <data field="middle_name"/>
    <data field="last_name">König</data>
    <data field="preferred_name"/>
    <data field="local_name"/>
    <data field="opt_pr_gender" value="3">male</data>
    <data field="opt_pr_age_group" value="5">Adult (21-50)</data>
    <data field="email">dominic@nursix.org</data>
    <data field="mobile_phone"/>
    <data field="date_of_birth">1973-04-21</data>
    <data field="opt_pr_nationality" value="65">Germany</data>
    <data field="opt_pr_country" value="167">Sweden</data>
    <data field="opt_pr_religion" value="1">none</data>
    <data field="opt_pr_marital_status" value="3">married</data>
    <data field="occupation">Nurse</data>
    <data field="comment"/>

    <!-- A sub-resource (component) of the record -->
    <resource
      created_on="2009-10-02 11:34:34"
      modified_on="2009-10-02 11:34:34"
      uuid="89217054-3c10-4f5d-959a-420254243498"
      name="pr_address">

      <!-- field = field name, value = original value in the database,
           element text = value represented for human readability -->
      <data field="opt_pr_address_type" value="1">Home Address</data>
      <data field="co_name"/>
      <data field="street1">Lundgatan</data>
      <data field="street2"/>
      <data field="postcode">38031</data>
      <data field="city">Läckeby</data>
      <data field="state"/>
      <data field="opt_pr_country" value="167">Sweden</data>
      <data field="lat">56.78042</data>
      <data field="lon">16.27914</data>
      <data field="comment"/>
    </resource>
  </resource>
</s3xml>
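
A document of this shape could be produced with a standard XML library. The following is a minimal sketch using Python's xml.etree.ElementTree; the helper name and its argument convention (a tuple for fields with a separate database value) are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET


def build_record(name, fields, uuid=None):
    """Build a <resource> element.

    `fields` maps field names to either the element text, or to a
    (value, text) tuple for fields whose human-readable representation
    differs from the original value.
    """
    resource = ET.Element("resource", {"name": name})
    if uuid:
        resource.set("uuid", uuid)
    for field, content in fields.items():
        data = ET.SubElement(resource, "data", {"field": field})
        if isinstance(content, tuple):
            value, text = content
            data.set("value", str(value))   # original value in the database
            data.text = text                # human-readable representation
        else:
            data.text = content
    return resource


root = ET.Element("s3xml")
root.append(build_record(
    "pr_person",
    {"first_name": "Dominic", "opt_pr_gender": (3, "male")},
    uuid="6e6e76dc-8ed7-408c-bb09-54476e3944ae",
))
print(ET.tostring(root, encoding="unicode"))
```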

UUID - how we handle Unique IDs for records across heterogeneous systems

JSON Format

The data structure of the native S3JSON format is equivalent to the XML format (=element trees) - except that markup elements are represented by prefixes:

{
    "@domain": "yana",                                             // Server name
    "@url": "http://127.0.0.1:8000/eden",                          // Server URL
    "$_pr_person": {                                               // Resource, prefix: $_
        "@uuid": "44fc762e-02df-44e0-8bd1-9b58e3132894",           // Resource attribute, prefix: @
        "@url": "http://127.0.0.1:8000/eden/pr/person/1",
        "@created_on": "2009-11-16 22:33:35",
        "@created_by": "None",
        "@modified_on": "2009-11-19 21:32:19",
        "@modified_by": "Dominic",
        "first_name": "Dominic",                                   // Data field, no prefix
        "last_name": "K\u00f6nig",
        "email": "dominic@nursix.org",
        "opt_pr_age_group": {"@value": "1", "$": "unknown"},       // Data field with textual representation:
        "opt_pr_religion": {"@value": "1", "$": "none"},           // @value=Value, $=TextualRepresentation
        "opt_pr_gender": {"@value": "1", "$": "unknown"},
        "opt_pr_nationality": {"@value": "999", "$": "unknown"},
        "opt_pr_country": {"@value": "999", "$": "unknown"},
        "opt_pr_marital_status": {"@value": "1", "$": "unknown"},
        "$k_pr_pe_id": {                                           // External Reference (Key), prefix: $k_
            "@resource": "pr_pentity",                             // Key resource name
            "@uuid": "a2a945bd-4f43-41da-bcdb-e2e638a987ea",       // UUID of the key record
            "$": "Dominic K\u00f6nig [no label] (Person)"          // Textual representation of the reference
        },
        "$_pr_presence": {                                         // Sub-resource (Component):
            "@uuid": "14af2751-7277-4e90-b42b-0d0430684561",       // appears as component within the resource
            "@created_on": "2009-11-19 19:42:46",
            "@modified_on": "2009-11-19 19:42:46",
            "@url": "http://127.0.0.1:8000/eden/pr/person/1/presence/1",
            "opt_pr_presence_condition": {"@value": "4", "$": "Found"},
            "time": {"@value": "2009-11-19 18:42:00 +0000", "$": "2009-11-19 20:42:00"},
            "$k_reporter": {
                "@resource": "pr_person",
                "@uuid": "44fc762e-02df-44e0-8bd1-9b58e3132894",
                "$": "Dominic K\u00f6nig"
            }
        }
    }
}
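
The prefix convention shown in this example can be sketched as a small classifier for the keys of an S3JSON object. The function name and the category labels it returns are hypothetical, chosen for this illustration.

```python
def classify_key(key):
    """Classify a key of an S3JSON object by its prefix.

    Returns a (kind, name) tuple, where name is the key without prefix.
    """
    if key.startswith("$k_"):
        return "reference", key[3:]     # external reference (key)
    if key.startswith("$_"):
        return "resource", key[2:]      # resource or component
    if key.startswith("@"):
        return "attribute", key[1:]     # attribute of the element
    if key == "$":
        return "text", ""               # textual representation
    return "data", key                  # data field, no prefix


for key in ("$_pr_person", "@uuid", "first_name", "$k_pr_pe_id", "$"):
    print(key, classify_key(key))
```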

JSON format characteristics:

The actual JSON output contains no whitespace between elements; whitespace and comments have been added here by hand for better readability

The outermost structure is always a JSON object (not a list)
All data is represented as strings (for security reasons)

If @value is sent for a field, it overrides the element text ($) at import
However, the use of @value is not mandatory: the data can simply be sent as the element text instead
Note that there is no automatic data encoding: data must be sent in DB-encoded format
@resource, @name and @uuid attributes are mandatory at input; other attributes can be omitted
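
The precedence of @value over the element text can be sketched like this. The function name is hypothetical, and the code only illustrates the rule as stated above, not the actual S3 importer.

```python
import json


def field_value(field):
    """Return the value to import for an S3JSON data field."""
    if isinstance(field, dict):
        if "@value" in field:
            return field["@value"]      # @value overrides the element text
        return field.get("$")           # no @value: use the element text
    return field                        # data sent directly, without @value


record = json.loads(
    '{"first_name": "Dominic",'
    ' "opt_pr_religion": {"@value": "1", "$": "none"}}'
)
print({name: field_value(field) for name, field in record.items()})
```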

Multiple records of the same resource will be aggregated as lists like:

{
    "$_my_resource": [
        {
            // record1 of my_resource
        },
        {
            // record2 of my_resource
        }
    ]
}
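
Since a resource key may hold either a single record (an object) or several records (a list), a consumer will typically want to treat both cases uniformly. A minimal sketch, with a hypothetical function name:

```python
def records_of(document, resource_name):
    """Return all records of a resource in an S3JSON object as a list."""
    value = document.get("$_" + resource_name)
    if value is None:
        return []                       # resource not present
    if isinstance(value, list):
        return value                    # multiple records: already a list
    return [value]                      # single record: wrap in a list


single = {"$_my_resource": {"@uuid": "a"}}
multiple = {"$_my_resource": [{"@uuid": "a"}, {"@uuid": "b"}]}
print(len(records_of(single, "my_resource")),
      len(records_of(multiple, "my_resource")))
```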

Attachments (1)

ConversionTransformation.png (24.0 KB ) - added by Dominic König 14 years ago.
