Changes between Version 25 and Version 26 of BluePrint/TextSearch

04/20/13 17:44:03 (11 years ago)
Vishrut Mehta



  • BluePrint/TextSearch

    v25 v26  
    3838*    A proper understanding of the working model of S3Search (deprecated) and S3Filter is required.
    40 *    Literature study of Apache Lucene and PyLucene. Getting familiar with '''PyLucene''' and deploying it on my local machine.
     40*    Literature study of Apache Lucene and Pylucene. Getting familiar with '''Pylucene''' and deploying it on my local machine.
    4242*    Studying the linkage of the Lucene daemon and web2py server.
    5050=== System Constraints ===
    52 *    The user should have PyLucene installed on their machine.
     52*    The user should have Pylucene installed on their machine.
    5454*    The Lucene daemon should also start whenever the web2py server is started.
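
The first constraint could be checked before the daemon is launched. The following is a minimal sketch only: it assumes PyLucene is importable under the module name `lucene`, and the helper name `pylucene_available` is hypothetical, not part of the existing code base.

```python
import importlib.util


def pylucene_available(module_name="lucene"):
    """Return True if the given top-level module can be imported.

    Assumption: PyLucene is installed as the top-level module "lucene".
    The module is only located here, not imported, so no JVM is started.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if pylucene_available():
        print("PyLucene found: the search daemon can be started.")
    else:
        print("PyLucene not found: full-text search is unavailable.")
```

A startup hook could call this check once and skip launching the daemon when it returns False.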
    9191=== Wireframes ===
    9394=== Technologies ===
    110111Solr is a platform built on the Lucene library; using Lucene directly is mainly preferable when you want to embed search functionality into your own application. So I chose Lucene for indexing the documents and searching for strings in them.
     113Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
     115Refer to this for more information about its functionalities:
    112119== Implementation ==
     121*    The implementation extends the usage of S3Filter to document search by creating a new TextFilter field in the document search form, as well as for all other resources.
     123*    When a user uploads a document, it is indexed by the Lucene daemon, which runs in the background.
     125*    Whenever a new document is uploaded or an existing one is edited, it is indexed so that it can be searched efficiently. Lucene provides a library that handles this indexing.
     127*    When a user enters a query, a request is sent to the daemon, which searches through the indexed documents and returns the search results.
     129*    There is also full-text search over different resources, which requires knowing which resources the user wants to search in.
     131*    This would be accomplished using Pylucene, a Python wrapper around Apache Lucene.
     133*    After the response is received, what remains is displaying the search results in a proper, user-friendly format.
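
The index-on-upload and search-on-query flow described above can be illustrated with a toy in-memory inverted index. This is only a sketch of the idea: the class `TinyIndex` and its methods are hypothetical stand-ins, it does not use the PyLucene API, and it does not model the daemon or the web2py request handling.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index standing in for the Lucene index (not PyLucene)."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of documents containing it
        self._terms_by_doc = {}            # document id -> terms currently indexed

    @staticmethod
    def _tokenize(text):
        # Lowercase the text and split it into alphanumeric terms.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def remove_document(self, doc_id):
        # Drop every posting that belongs to this document, if it is indexed.
        for term in self._terms_by_doc.pop(doc_id, set()):
            self._postings[term].discard(doc_id)

    def index_document(self, doc_id, text):
        # Called on upload and on edit: stale postings are removed first.
        self.remove_document(doc_id)
        terms = self._tokenize(text)
        self._terms_by_doc[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def search(self, query):
        # Return the sorted ids of documents containing every query term.
        terms = self._tokenize(query)
        if not terms:
            return []
        matches = None
        for term in terms:
            ids = self._postings.get(term, set())
            matches = set(ids) if matches is None else matches & ids
        return sorted(matches)


if __name__ == "__main__":
    index = TinyIndex()
    index.index_document(1, "Flood relief report")
    index.index_document(2, "Shelter capacity report")
    print(index.search("report"))
    print(index.search("flood report"))
```

In the planned design, the daemon would own the real Lucene index and receive these index and search requests from the web2py side. Ranking and formatting the results are left out of the sketch.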
     135'''Future Implementation:'''
     137*    The UI for displaying the search results is a secondary concern. We could take inspiration from the Google and Bing search result pages for an attractive format.
    114139== References ==