== Sahana Eden OCR Integration ==

The Optical Character Recognition (OCR) software of Sahana Eden has some additional dependencies and can be configured as needed. If the OCR module is not enabled, enable it by uncommenting the ocr block in `models/000_config.py` in the `eden` directory.

== Dependencies ==

'''Python modules'''
 1. python-lxml
 2. python-imaging (PIL)
 3. python-reportlab

'''Command-line tools'''
 4. ImageMagick `convert`
 5. Tesseract 3.00-1

{{{
apt-get install -y imagemagick
# Old versions:
#apt-get install -y libleptonica-dev tesseract-ocr
wget http://www.leptonica.com/source/leptonica-1.69.tar.gz
tar zxvf leptonica-1.69.tar.gz
cd leptonica-1.69
./configure
make
make install
cd ..
wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
tar zxvf tesseract-ocr-3.02.02.tar.gz
cd tesseract-ocr
./configure
make
make install
cd ..
}}}

== Configuration ==

'''Exclude Component Tables'''

Each resource table in Sahana Eden can have several component tables. When generating a paper-based PDF form, including some of these components often makes little sense. For example, including the staff component table in a hospital registry form is of little use, because nobody would want to add a single staff member to a hospital on that form. The user would rather exclude that component and have a separate form for the component table.

Components are excluded for a resource inside the method `get_pdf_excluded_fields` in `modules/s3/s3cfg.py`. `s3pdf.py` reads this configuration before generating a PDF form.


Example configuration:

{{{
def get_pdf_excluded_fields(self, resourcename):
    # Map each resource table to the component tables
    # that should be left out of its PDF form
    excluded_fields_dict = {
        "hms_hospital" : [
            "hrm_human_resource",
        ],
        "pr_group" : [
            "pr_group_membership",
        ],
    }
    # Resources without an entry exclude nothing
    excluded_fields =\
        excluded_fields_dict.get(resourcename, [])
    return excluded_fields
}}}

In the above configuration, the `hrm_human_resource` component of `hms_hospital` and the `pr_group_membership` component of `pr_group` are excluded.

== Workflow Diagrams ==

'''Generating PDF Forms'''

[[Image(http://eden.sahanafoundation.org/raw-attachment/wiki/BluePrint/OCRIntegration/generated.png)]]

'''Data Import from Image to Text'''

[[Image(http://eden.sahanafoundation.org/raw-attachment/wiki/BluePrint/OCRIntegration/importflow.png)]]

'''Review User Interface'''

[[Image(http://eden.sahanafoundation.org/raw-attachment/wiki/BluePrint/OCRIntegration/reviewUI.png)]]
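
As a rough illustration of the exclusion lookup described under Configuration, the sketch below reproduces it outside Eden. It is not the code in `s3pdf.py`: the module-level name `EXCLUDED_COMPONENTS`, the helper `components_for_form`, and the component name `hms_example_component` are assumptions made up for this example. Only the two dictionary entries come from the configuration shown on this page.

```python
# Stand-alone sketch of the exclusion lookup; not the Eden implementation.
# EXCLUDED_COMPONENTS (assumed name) mirrors the dict in the example
# configuration on this page.
EXCLUDED_COMPONENTS = {
    "hms_hospital": ["hrm_human_resource"],
    "pr_group": ["pr_group_membership"],
}


def get_pdf_excluded_fields(resourcename):
    """Return the component tables to leave out of the PDF form."""
    # Resources without an entry exclude nothing.
    return EXCLUDED_COMPONENTS.get(resourcename, [])


def components_for_form(resourcename, components):
    """Hypothetical helper: drop excluded components, keeping the order."""
    excluded = set(get_pdf_excluded_fields(resourcename))
    return [name for name in components if name not in excluded]


if __name__ == "__main__":
    # The component list is invented for illustration only.
    print(components_for_form(
        "hms_hospital",
        ["hrm_human_resource", "hms_example_component"],
    ))
```

Because the lookup falls back to an empty list, a resource that has no entry in the dictionary keeps all of its components on the generated form.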