wiki:Event/2013/GSoC/TextSearch

Version 23 (modified by Vishrut Mehta, 12 years ago)

Project: Full-Text Search

Name : Vishrut Mehta
Mentor: Pat Tressel

Proposal

The proposal for the project is here:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/vishrutmehta/6001

BluePrints

This project draws ideas from the Blueprints below:

BluePrint/TextSearch

Meetings And Discussions

Weekly Meeting : Tuesday and Saturday 04:30 UTC

Venue : IRC
Nick - vishrut009

Google Group Discussions :

Description of Work Done

Pylucene
- The installation details are here: http://eden.sahanafoundation.org/wiki/BluePrint/TextSearch#Pylucene

Apache Solr
- The installation details are here: http://eden.sahanafoundation.org/wiki/BluePrint/TextSearch#ApacheSolr

Sunburnt
- The attached script (install-solr-sunburnt.sh) installs the dependencies, then installs and configures Apache Solr and Sunburnt.

Enabling text search:

Uncomment the following line in models/000_config.py and set it to the URL of your Solr server:

# Uncomment this and set the solr url to connect to solr server for Full-Text Search
settings.base.solr_url = "http://127.0.0.1:8983/solr/"
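
As a rough illustration of how this setting can gate the feature, the sketch below shows one way a getter such as settings.get_base_solr_url() (used in the model code on this page) could return False while the line is still commented out. The Storage and Settings classes here are simplified, hypothetical stand-ins and not the actual Eden implementation.

```python
class Storage(dict):
    """Minimal attribute-access dict (hypothetical stand-in for web2py's Storage)."""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__


class Settings(object):
    """Hypothetical, cut-down deployment settings object."""

    def __init__(self):
        self.base = Storage()

    def get_base_solr_url(self):
        # Return the configured URL, or False when solr_url was never set
        return self.base.get("solr_url", False)


settings = Settings()

# Option still commented out in 000_config.py: full-text search stays off
solr_disabled = settings.get_base_solr_url()

# Option uncommented: the getter returns the URL, which is truthy
settings.base.solr_url = "http://127.0.0.1:8983/solr/"
solr_enabled = settings.get_base_solr_url()
```

Returning a falsy value when the option is unset lets calling code use a plain `if settings.get_base_solr_url():` check to decide whether to register the indexing hooks.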

Asynchronously Indexing and Deleting Documents:

The code for asynchronously indexing documents is in models/tasks.py.
Insertion: The document is first inserted into the database. The onaccept callback then indexes it by calling the document_create_index() function from models/tasks.py. To enable full-text search for documents in another module, add the following code to that module. For a working example, see the document_onaccept() hook in modules/s3db/doc.py.
Deletion: The record is first deleted from the database table. The ondelete callback then removes the corresponding file from Solr by deleting its index entry on the Solr server. For the code, see the document_ondelete() hook in modules/s3db/doc.py.

In model()


        if settings.get_base_solr_url():
            onaccept = self.document_onaccept # where document_onaccept is the onaccept hook for indexing
            ondelete = self.document_ondelete # where document_ondelete is the ondelete hook for deleting
        else:
            onaccept = None
            ondelete = None

        configure(tablename,
                  onaccept=onaccept,
                  ondelete=ondelete,
        .....
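
The conditional hook registration above can be sketched in isolation as follows. This is a simplified, synchronous illustration only: the dictionaries stand in for the database table and the Solr index, and all function names (choose_hooks, index_on_accept, unindex_on_delete, insert_record, delete_record) are hypothetical. In Eden the hooks queue asynchronous tasks instead of updating the index directly.

```python
index = {}   # stand-in for the Solr index
table = {}   # stand-in for the doc_document table


def index_on_accept(record_id, text):
    # Hypothetical onaccept hook: add the document to the index
    index[record_id] = text


def unindex_on_delete(record_id):
    # Hypothetical ondelete hook: remove the document from the index
    index.pop(record_id, None)


def choose_hooks(solr_url):
    """Mirror the model() logic: register hooks only if a Solr URL is set."""
    if solr_url:
        return index_on_accept, unindex_on_delete
    return None, None


def insert_record(record_id, text, onaccept):
    table[record_id] = text          # the database write happens first
    if onaccept:
        onaccept(record_id, text)    # then the record is indexed


def delete_record(record_id, ondelete):
    table.pop(record_id, None)       # the database delete happens first
    if ondelete:
        ondelete(record_id)          # then the index entry is removed


onaccept, ondelete = choose_hooks("http://127.0.0.1:8983/solr/")
insert_record(1, "flood report", onaccept)
indexed_after_insert = 1 in index
delete_record(1, ondelete)
indexed_after_delete = 1 in index

# Without a Solr URL no hooks are registered, so nothing is indexed
hooks_without_solr = choose_hooks(None)
```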

In onaccept()

        vars = form.vars
        doc = vars.file # where file is the name of the upload field

        table = current.db.doc_document # doc_document is the tablename
        try:
            name = table.file.retrieve(doc)[0]
        except TypeError:
            name = doc

        document = json.dumps(dict(filename=doc,
                                  name=name,
                                  id=vars.id,
                                  tablename="doc_document", # where "doc_document" is the name of the database table
                                  ))

        current.s3task.async("document_create_index",
                             args = [document])

        return
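
Because the task argument is passed as a JSON string, the worker side has to decode it before indexing. Below is a minimal sketch of that round trip. The functions build_payload and create_index_sketch are hypothetical names used for illustration, and the sample filename is invented. The real document_create_index() in models/tasks.py is not shown on this page and may work differently.

```python
import json


def build_payload(filename, name, record_id, tablename):
    """Serialise the document details the same way the onaccept hook does."""
    return json.dumps(dict(filename=filename,
                           name=name,
                           id=record_id,
                           tablename=tablename,
                           ))


def create_index_sketch(document):
    """Hypothetical worker-side handler: decode the payload for indexing."""
    data = json.loads(document)
    # A real task would now open the stored file and send it to Solr;
    # this sketch only returns the decoded fields.
    return {"id": data["id"],
            "name": data["name"],
            "filename": data["filename"],
            "tablename": data["tablename"],
            }


payload = build_payload("doc_document.file.abc123.txt",  # invented stored filename
                        "report.txt",
                        42,
                        "doc_document")
fields = create_index_sketch(payload)
```

Passing a plain JSON string keeps the task arguments serialisable, which matters when the task is queued and run by a separate worker process.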

SMART Goal	Measure	Status
Explore Pylucene	Installed and configured on demo server	Completed
Scripts for indexing and search in pylucene	Scripts working on the demo server	Completed
Explore Apache Solr and Sunburnt	Installed both on demo and local server	Completed
Scripts for indexing and search for sunburnt	Working scripts for sunburnt ready	Completed
Asynchronously Indexing and Deleting Document	Implemented & Integrated in Sahana Eden	Completed
Script for installing and configuring Solr and Sunburnt	Script attached below	Completed
Designing the Full-Text search functionality implementation	Discussed with Dominic and Fran	Completed
Implementation of fulltext() function in s3resource.py	Successfully implemented with error handling	Completed
Implementation of transform() function to convert a text query into a belongs query	Successfully implemented with error handling	Completed
Unit tests for all cases (Solr available/unavailable, query(), call())	Unit tests implemented for s3resource	Almost done

Attachments (1)

install-solr-sunburnt.sh (2.3 KB ) - added by Vishrut Mehta 12 years ago.
