wiki:BluePrint/SocialMedia/GHC2013SocialMediaHITProcessing

Version 9 (modified by Pat Tressel, 11 years ago)

Introduction

  • Receive tweets and / or SMS messages from the public.
  • Dispatch these to online workers to classify and geocode.
  • Display on a map.

Background

During the Haiti earthquake of Jan 2010, people trapped in buildings sent SMS messages to a designated shortcode. These were classified, translated, and geocoded by online workers using Amazon's Mechanical Turk, then provided to emergency managers.

During the Kenya 2013 general election, citizens and trained election monitors reported election-related incidents via SMS and Twitter. These reports were automatically entered into a map database, then vetted by online workers, who removed spam and contacted senders for clarification before the information was made public. See: https://uchaguzi.co.ke/

During a Random Hacks of Kindness hackathon in 2010, a variant of this project was implemented using Sahana Eden as the back end and a custom web page (not automatically generated by Eden) as the front end. It was designed as a training game: workers earned "experience points" and were awarded badges. See: http://gwob.org/101010-hackathon-winners/

Project breakdown

This project is intended to be easy to subdivide into tasks that can be worked on somewhat independently and in parallel, once a few naming conventions for the new database tables and fields have been agreed.

Set up incoming messages

Read the information on how messages are received by Eden from Twitter or SMS, and get test messages into an Eden instance.

Relevant documentation:

User interface: http://demo.eden.sahanafoundation.org/eden/msg/

Add database tables for message processing tasks

Received messages are stored in the "message log" table, msg_message.
https://github.com/flavour/eden/blob/master/modules/s3db/msg.py#L93

(msg_message is a special kind of table that Eden calls a "superentity", which is like a superclass but for database tables. Each record in a specialized table has a "parent" record in the shared superentity table. Other tables can then refer to a record in any of the specialized tables by linking to its superentity record, instead of needing a separate foreign key field for every specialized table. References to ordinary, non-superentity tables are simpler.)
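The superentity idea can be illustrated outside Eden with plain SQL. The following is a minimal sketch using Python's standard sqlite3 module, not Eden's actual schema or API: the table names (message, sms, tweet, note), the instance_type column, and the helper functions are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical specialized tables; these are NOT Eden's real table names.
KINDS = ("sms", "tweet")

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (
    message_id INTEGER PRIMARY KEY,
    instance_type TEXT NOT NULL
);
CREATE TABLE sms (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES message(message_id),
    body TEXT
);
CREATE TABLE tweet (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES message(message_id),
    body TEXT
);
CREATE TABLE note (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES message(message_id),
    text TEXT
);
""")


def add_message(conn, kind, body):
    """Insert the shared parent record, then the specialized record that points to it."""
    if kind not in KINDS:
        raise ValueError(f"unknown message kind: {kind}")
    cur = conn.execute("INSERT INTO message (instance_type) VALUES (?)", (kind,))
    message_id = cur.lastrowid
    conn.execute(f"INSERT INTO {kind} (message_id, body) VALUES (?, ?)", (message_id, body))
    return message_id


def message_body(conn, message_id):
    """Follow a superentity key back to the specialized record it stands for."""
    row = conn.execute(
        "SELECT instance_type FROM message WHERE message_id = ?", (message_id,)
    ).fetchone()
    if row is None:
        return None
    kind = row[0]
    if kind not in KINDS:
        raise ValueError(f"unknown message kind: {kind}")
    return conn.execute(
        f"SELECT body FROM {kind} WHERE message_id = ?", (message_id,)
    ).fetchone()[0]


sms_id = add_message(conn, "sms", "Trapped near the market")
tweet_id = add_message(conn, "tweet", "Road blocked by debris")

# The note table needs only one foreign key, whichever kind of message it refers to.
conn.execute("INSERT INTO note (message_id, text) VALUES (?, ?)", (tweet_id, "needs follow-up"))
conn.execute("INSERT INTO note (message_id, text) VALUES (?, ?)", (sms_id, "urgent"))
```

The point of the design is visible in the note table: it links to SMS messages and tweets alike through a single message_id column.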

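We want to add a table that joins to this to hold the data entered by a worker for a message. That table will need fields for the message being processed, the category and location the worker assigned to it, and the worker who did the task.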

Why do we want a separate table? Why not just add a category and location to the msg_message table? Eventually, we want to do "human intelligence task" processing, in which results are verified by sending the same task to multiple workers, and comparing the results. So we may have more than one set of results for each message. We want to include which worker did each task, so we can check the quality of their work and refer them to more training if needed.
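The reasoning above can be sketched with a small stand-alone example. This uses Python's standard sqlite3 module rather than Eden's own table definitions, and everything named here is an assumption made for illustration: the task_result table, its columns, the sample data, and the simple-majority rule used to compare results (the project may well choose a different comparison).

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (message, worker) result, so a message
# can carry several independent sets of results.
conn.executescript("""
CREATE TABLE message (message_id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE task_result (
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES message(message_id),
    worker TEXT NOT NULL,
    category TEXT,
    lat REAL,
    lon REAL
);
""")
conn.execute(
    "INSERT INTO message (message_id, body) VALUES (1, 'Bridge collapsed on main road')"
)
conn.executemany(
    "INSERT INTO task_result (message_id, worker, category, lat, lon) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "alice", "infrastructure", 18.54, -72.34),
        (1, "bob", "infrastructure", 18.55, -72.33),
        (1, "carol", "medical", 18.54, -72.34),
    ],
)


def consensus_category(conn, message_id):
    """Return (category, dissenting_workers) when a strict majority agrees.

    Returns (None, []) when there are no results or no strict majority.
    """
    rows = conn.execute(
        "SELECT worker, category FROM task_result WHERE message_id = ?", (message_id,)
    ).fetchall()
    counts = Counter(category for _, category in rows)
    if not counts:
        return None, []
    top, votes = counts.most_common(1)[0]
    if votes * 2 <= len(rows):
        return None, []
    dissenters = sorted(worker for worker, category in rows if category != top)
    return top, dissenters


category, dissenters = consensus_category(conn, 1)
```

Because each row records the worker, the same query that picks the agreed category also identifies whose answers differed, which is what makes quality checks and follow-up training possible.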

Add a controller to generate task pages for workers

Add a view that presents a task to the worker and submits their work

Generate a list of categories from the database

We would like to encourage workers to use an existing category when there is a close enough match, but still allow them to add a new one when there is not. So we want to give the worker a menu of all the categories currently found in the category field (the field being added by the team working on the new tables), together with a way to enter a new category.
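One way to build such a menu is to select the distinct values already stored in the category field. The sketch below uses Python's standard sqlite3 module, not Eden's own query layer. The task_result table, the sample rows, and the normalization step (trimming whitespace and lowercasing so near-duplicates collapse into one entry) are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical results table holding the category each worker entered.
conn.execute(
    "CREATE TABLE task_result ("
    "id INTEGER PRIMARY KEY, message_id INTEGER, worker TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO task_result (message_id, worker, category) VALUES (?, ?, ?)",
    [
        (1, "alice", "Medical"),
        (1, "bob", "medical "),
        (2, "alice", "shelter"),
        (3, "carol", None),
    ],
)


def category_menu(conn):
    """Return the sorted list of distinct, normalized categories in use."""
    rows = conn.execute(
        "SELECT DISTINCT category FROM task_result WHERE category IS NOT NULL"
    )
    names = {row[0].strip().lower() for row in rows if row[0].strip()}
    return sorted(names)


def resolve_category(conn, entered):
    """Normalize worker input and report whether it is a new category.

    Returns (normalized_name, is_new).
    """
    name = entered.strip().lower()
    if not name:
        raise ValueError("category must not be empty")
    return name, name not in category_menu(conn)


menu = category_menu(conn)
```

A view could render menu as the dropdown and pass any free-text entry through resolve_category, so that input such as " Shelter" maps onto the existing category instead of creating a duplicate.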
