wiki:BluePrint/SocialMedia/GHC2013SocialMediaHITProcessing

Version 18 (modified by Pat Tressel, 11 years ago)

--

Introduction

  • Receive tweets and / or SMS messages from the public.
  • Dispatch these to online workers to classify and geocode.
  • Display on a map.

Background

During the Haiti earthquake of Jan 2010, people trapped in buildings sent SMS messages to a designated shortcode. These were classified, translated, and geocoded by online workers using Amazon's Mechanical Turk, then provided to emergency managers.

During the Kenya 2013 general election, citizens and trained election monitors reported election-related incidents via SMS and Twitter. These were automatically entered into a map database, then vetted by online workers, who removed spam and contacted senders for clarification before the information was made public. See: https://uchaguzi.co.ke/

During a Random Hacks of Kindness hackathon in 2010, a variant of this project was implemented using Sahana Eden as the back end and a custom web page (not automatically generated by Eden) as the front end. This was designed as a training game: workers got "experience points" and were awarded badges. See: http://gwob.org/101010-hackathon-winners/

Project breakdown

This project is intended to be easy to subdivide into tasks that can be worked on somewhat independently and in parallel, given the choice of a few naming conventions for new database tables and fields.

To keep our work together, and distinct from other work, we'll add a new module. This is the first step in adding "human intelligence task" processing, in which results are verified by sending the same task to multiple workers and comparing the results. So let's call our new module "hit". That means the controller file will be:

eden/controllers/hit.py

The model will be:

eden/modules/s3db/hit.py

The view pages will be in the directory:

eden/views/hit

Everyone may find it useful to refer to:

Set up incoming messages

Read the information on how messages are received by Eden from Twitter or SMS, and get test messages into an Eden instance.

Relevant documentation:

User interface: http://demo.eden.sahanafoundation.org/eden/msg/

Fill in required "new module" boilerplate

Look at the lesson on "making a new module" in the Eden book:
http://booki.flossmanuals.net/sahana-eden/_draft/_v/1.0/building-a-new-application/

That lesson puts the model file in the eden/models directory, but only to avoid complication. Models in eden/models are loaded on every HTTP request, whether they're needed or not. Most Eden models are in eden/modules/s3db, and are only loaded by HTTP requests that need them. Since our message processing won't be used by most types of requests, we want it in eden/modules/s3db.
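The difference between "loaded on every request" and "loaded only when needed" can be illustrated with a small stand-alone sketch. This is not Eden's s3db code; the LazyModels class, the hit_tables loader and the hit_result table and its field names are all hypothetical, chosen only to show the idea of deferring table definitions until a table is first asked for.

```python
class LazyModels:
    """Define a module's tables only when one of them is first requested."""

    def __init__(self):
        self._loaders = {}    # module prefix -> function that builds its tables
        self._tables = {}     # table name -> table definition
        self.load_count = 0   # how many loaders have actually run

    def register(self, prefix, loader):
        """Remember how to build the tables for a module, without building them."""
        self._loaders[prefix] = loader

    def table(self, tablename):
        """Return a table definition, running the module's loader on first use."""
        if tablename not in self._tables:
            prefix = tablename.split("_", 1)[0]
            self._tables.update(self._loaders[prefix]())
            self.load_count += 1
        return self._tables[tablename]


def hit_tables():
    # Hypothetical table and field names, for illustration only.
    return {"hit_result": ["message_id", "worker", "category", "location"]}


models = LazyModels()
models.register("hit", hit_tables)   # cheap: nothing is defined yet
```

A request that never calls models.table("hit_result") never pays for defining the hit tables, which is the behaviour we want for message processing.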

Add the new module to the list of enabled modules. This list is normally specified in a "template" that holds the customizations for a particular site. Here, we will "cheat" and add the new module directly to our configuration file, eden/models/000_config.py. Get the default module list from eden/private/templates/default/config.py, i.e.
http://booki.flossmanuals.net/sahana-eden/_draft/_v/1.0/building-a-new-application/
Copy it to eden/models/000_config.py and add an entry for the hit module.
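As a rough sketch of what "add an entry for the hit module" might look like, the following uses plain Python stand-ins so that it runs outside Eden. The T function, the use of dict in place of Eden's own storage class, the attribute names (name_nice, restricted, module_type) and the display name are assumptions; copy the shape of the entries that are actually in your config.py rather than relying on these.

```python
from collections import OrderedDict


def T(text):
    """Stand-in for the translation function; returns the text unchanged."""
    return text


# Stand-in for the default module list copied from the template config.
# The real list is much longer; these two entries are placeholders.
modules = OrderedDict([
    ("default", dict(name_nice=T("Home"), restricted=False, module_type=None)),
    ("msg", dict(name_nice=T("Messaging"), restricted=True, module_type=None)),
])

# The new entry. The key must match the module name used for the
# controller, model and view paths ("hit"). Attribute values are assumptions.
modules["hit"] = dict(
    name_nice=T("Message Processing"),
    restricted=False,
    module_type=10,
)
```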

Add database tables for message processing tasks

Received messages are stored in the "message log" table, msg_message.
https://github.com/flavour/eden/blob/master/modules/s3db/msg.py#L93

(In Eden terminology, this is a special kind of table called a "superentity", which is like a superclass but for database tables. Records in several specialized tables each have a "parent" record in a shared superentity table. Other tables can then refer to any of the specialized tables by linking to the superentity record, instead of needing a foreign key field for every one. References to ordinary non-superentity tables are simpler.)
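The superentity idea can be shown with plain SQL, independent of Eden. The sketch below uses Python's sqlite3 module; the table names (entity, sms, tweet, note), the column names and the sample messages are all hypothetical and do not match Eden's actual schema. It only shows how a single key into a shared table lets another table refer to any of the specialized kinds.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE entity (               -- the shared "superentity" table
    entity_id INTEGER PRIMARY KEY,
    instance_type TEXT NOT NULL     -- which specialized table holds the details
);
CREATE TABLE sms (
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    body TEXT
);
CREATE TABLE tweet (
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    body TEXT
);
CREATE TABLE note (                 -- refers to any message kind with one key
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    remark TEXT
);
""")

KINDS = ("sms", "tweet")


def add_message(kind, body):
    """Insert the parent superentity row, then the specialized row linked to it."""
    if kind not in KINDS:
        raise ValueError("unknown message kind: %r" % kind)
    cur = db.execute("INSERT INTO entity (instance_type) VALUES (?)", (kind,))
    entity_id = cur.lastrowid
    db.execute("INSERT INTO %s (entity_id, body) VALUES (?, ?)" % kind,
               (entity_id, body))
    return entity_id


def resolve(entity_id):
    """Follow a superentity key to the specialized record it stands for."""
    (kind,) = db.execute("SELECT instance_type FROM entity WHERE entity_id = ?",
                         (entity_id,)).fetchone()
    (body,) = db.execute("SELECT body FROM %s WHERE entity_id = ?" % kind,
                         (entity_id,)).fetchone()
    return kind, body


sms_id = add_message("sms", "Trapped near the market")
tweet_id = add_message("tweet", "Road blocked by debris")
# One foreign key is enough, whichever kind of message the note is about.
db.execute("INSERT INTO note (entity_id, remark) VALUES (?, ?)",
           (tweet_id, "needs geocoding"))
```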

We want to add a table that joins to this to hold the data entered by a worker for a message. That table will need fields for:

Why do we want a separate table? Why not just add a category and location to the msg_message table? Eventually, we want to do "human intelligence task" processing, in which results are verified by sending the same task to multiple workers, and comparing the results. So we may have more than one set of results for each message. We want to include which worker did each task, so we can check the quality of their work and refer them to more training if needed.
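To make the argument for a separate table concrete, here is a minimal sketch using sqlite3 rather than Eden's database layer. The hit_result table, its field names, the worker names and the simple majority rule are all assumptions for illustration; the msg_message table is reduced to a two-column stand-in. It shows several workers' results stored against one message, then compared.

```python
import sqlite3
from collections import Counter

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE msg_message (id INTEGER PRIMARY KEY, body TEXT);  -- stand-in only
CREATE TABLE hit_result (           -- hypothetical name: one row per worker per task
    id INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES msg_message(id),
    worker TEXT NOT NULL,
    category TEXT,
    lat REAL,
    lon REAL
);
""")
db.execute("INSERT INTO msg_message (id, body) VALUES (1, 'Bridge is flooded')")
db.executemany(
    "INSERT INTO hit_result (message_id, worker, category, lat, lon) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, "alice", "flood", 18.54, -72.34),
     (1, "bob", "flood", 18.55, -72.33),
     (1, "carol", "fire", 18.54, -72.34)],
)


def agreed_category(message_id, min_votes=2):
    """Return the category most workers chose, or None if agreement is too weak."""
    rows = db.execute("SELECT category FROM hit_result WHERE message_id = ?",
                      (message_id,)).fetchall()
    counts = Counter(category for (category,) in rows)
    if not counts:
        return None
    category, votes = counts.most_common(1)[0]
    return category if votes >= min_votes else None


def dissenters(message_id):
    """List workers whose category differs from the agreed one."""
    agreed = agreed_category(message_id)
    if agreed is None:
        return []
    rows = db.execute(
        "SELECT worker FROM hit_result "
        "WHERE message_id = ? AND category <> ? ORDER BY worker",
        (message_id, agreed)).fetchall()
    return [worker for (worker,) in rows]
```

With category and location stored directly on msg_message there would be room for only one answer per message, and no record of who gave it.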

Add a controller function to generate task pages for workers

Look at other controllers in eden/controllers, and at the documentation for the controller helper function:

Add a view that presents a task to the worker and submits their work

Generate a list of categories from the database

We would like to encourage workers to use an existing category when there is a close enough match, but still be able to add a new one if not. So we want to give the worker a menu built from all the categories currently found in the category field (the field being added by the team working on the new tables), plus a way to add a new category.
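One possible way to build such a menu is sketched below with sqlite3, outside Eden. The hit_result table name, its category column, the sample values and the wording of the "add new" option are assumptions; the real names depend on what the table-design team chooses.

```python
import sqlite3

ADD_NEW = "(add a new category)"   # hypothetical label for the extra menu option


def category_menu(db):
    """Return the distinct existing categories, sorted, plus an 'add new' option."""
    rows = db.execute(
        "SELECT DISTINCT category FROM hit_result "
        "WHERE category IS NOT NULL AND category <> '' "
        "ORDER BY category"
    ).fetchall()
    return [category for (category,) in rows] + [ADD_NEW]


# Small in-memory example with duplicates and empty values.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hit_result (id INTEGER PRIMARY KEY, category TEXT)")
db.executemany("INSERT INTO hit_result (category) VALUES (?)",
               [("medical",), ("flood",), ("medical",), (None,), ("",)])
menu = category_menu(db)
```

Putting the "add new" option last keeps the existing categories in front of the worker first, which fits the goal of encouraging reuse.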
