Changes between Version 1 and Version 2 of BluePrint/DataRepository
- Timestamp: 12/26/14 04:49:14 (10 years ago)
BluePrint/DataRepository
- Data Depository tools such as CKAN are becoming popular within the humanitarian aid space.
+ Data repository tools such as [http://ckan.org CKAN] are becoming popular within the humanitarian aid space, as evidenced by projects like [https://data.hdx.rwlabs.org/ HDX] and [http://www.data.gov/disasters/ Data.Gov's disaster portal].

- They allow people share data sets within a searchable environment, making it easier for users to find raw data sets.
+ These tools allow users to publish data sets and associate them with metadata that enables others to easily find them. This is particularly useful for organizations that receive and produce lots of raw and refined data sets. Many of these organizations are also collecting data sets that they will then integrate into their own information management systems. Sometimes the data they organize in their information management systems is also data they want to make available in a raw format via a data repository.

- This is particularly useful for organizations that receive lots of raw data, want to make that data available as quickly as possible to their stakeholders, while also integrating that data into their own information management system.
+ Since Sahana produces the type of information management systems into which people want to integrate data they collect, it makes sense for Sahana to provide data repository functionality that would enable users to publish datasets and metadata that follows the [http://www.w3.org/TR/vocab-dcat/ DCAT standard] and is accessible via API.

- Since Sahana produces the type of information management systems into which people want to integrate raw data, it makes sense for it to support the process of making that raw data, and other data as well, available to Sahana users.
+ It's likely this data would fall into a few categories:
+  * raw datasets collected (ex. information about medical clinics collected by workers in the field)
+  * polished datasets (ex. medical clinics from WHO)
+  * datasets produced by the Sahana system (ex. all medical facilities being managed in the Sahana system)
+  * documents and reports (ex. PDF of reports and supplemental spreadsheet information)
+

- The basic idea is to create a "data repository module" that would perform some of the key functions that CKAN does. Namely:
+ The basic idea is to create a "data repository module" that would perform some key functions:

   * Publish Data
  …
   * they can access metadata information via API

- Schema Ideas:
- Title
- Formats
- Author
- Date/Time Submitted
- Submitted through (channel)
- Date/Time Updated
- Updated By
- Purpose
- Permissions: Public, View Metadata, Private
- Status: New, Processing (+ manager), Integrate (+reference_link, +note)
- Manager
- Accessibility Note
- General Note
+ Potential Schema:
+  * Title
+  * Data Formats
+  * Original Author (individual, organization or group)
+  * Date/Time Submitted
+  * Submitted through (channel)
+  * Date/Time Updated
+  * Updated By (individual, organization or group)
+  * Purpose
+  * Permissions: Public, View Metadata, Private
+  * Status: New, Processing (+ manager), Integrate (+reference_link, +note)
+  * Manager (Sahana user managing this data set)
+  * Accessibility Notes
+  * General Notes
+  * Change Log
+  * Comments
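
To make the "Potential Schema" from Version 2 more concrete, the following is a minimal Python sketch of one way a data set record could be modelled and exposed as DCAT-style metadata. It is not part of the blueprint and does not use any Sahana API. The names `DatasetRecord`, `Permission`, `Status`, `metadata_visible` and `to_dcat` are hypothetical, the default of `Private` permissions is an assumption, and the mapping of schema fields to DCAT / Dublin Core terms (`dct:title`, `dct:creator`, `dct:issued`, `dct:modified`, `dcat:distribution`, `dct:format`) is one plausible reading rather than an agreed design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Permission(Enum):
    """Permission levels listed in the Potential Schema."""
    PUBLIC = "Public"
    VIEW_METADATA = "View Metadata"
    PRIVATE = "Private"


class Status(Enum):
    """Workflow states listed in the Potential Schema."""
    NEW = "New"
    PROCESSING = "Processing"
    INTEGRATE = "Integrate"


@dataclass
class DatasetRecord:
    """Hypothetical record mirroring the Potential Schema fields."""
    title: str
    data_formats: List[str]
    original_author: str            # individual, organization or group
    submitted_at: datetime
    submitted_through: str          # channel
    purpose: str = ""
    permissions: Permission = Permission.PRIVATE   # assumed default
    status: Status = Status.NEW
    manager: Optional[str] = None   # Sahana user managing this data set
    updated_at: Optional[datetime] = None
    updated_by: Optional[str] = None
    reference_link: Optional[str] = None   # used with Status.INTEGRATE
    status_note: str = ""                  # used with Status.INTEGRATE
    accessibility_notes: str = ""
    general_notes: str = ""
    change_log: List[str] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)

    def metadata_visible(self) -> bool:
        """Metadata is exposed for Public and View Metadata records."""
        return self.permissions in (Permission.PUBLIC, Permission.VIEW_METADATA)

    def to_dcat(self) -> dict:
        """Build a DCAT-style dictionary (assumed field mapping)."""
        if not self.metadata_visible():
            raise PermissionError("metadata for this data set is private")
        doc = {
            "@type": "dcat:Dataset",
            "dct:title": self.title,
            "dct:creator": self.original_author,
            "dct:description": self.purpose,
            "dct:issued": self.submitted_at.isoformat(),
            "dcat:distribution": [
                {"@type": "dcat:Distribution", "dct:format": fmt}
                for fmt in self.data_formats
            ],
        }
        if self.updated_at is not None:
            doc["dct:modified"] = self.updated_at.isoformat()
        return doc
```

A record created with `permissions=Permission.PUBLIC` can be passed through `to_dcat()` and serialized with `json.dumps` for an API response; a record left at the assumed `Private` default raises `PermissionError` instead, which reflects the idea that private data sets expose neither data nor metadata.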