Blueprint for Optical Character Recognition
Be able to scan in a paper-based form to populate the database
- http://wiki.sahana.lk/doku.php/sahanaocr
- http://wiki.sahana.lk/doku.php?id=dev:sahana_xform
- http://humanitariantech.com/2009/11/16/talking-papers-a-world-without-data-entry/
The C++ code written for SahanaPHP (during GSoC 2007) can almost certainly be tweaked to work with SahanaPy:
This version uses OpenCV & FANN
A Firefox add-on to enable a nice workflow for users is being developed for SahanaPHP as part of GSoC 2009:
This will access the scanner (e.g. using TWAIN or SANE) and read the image. The acquired image will be passed to the OCR library, and the result will be posted into the web form.
Again, this should be easy to tweak to get working with SahanaPy.
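The last step of that workflow (posting the OCR result into the web form) could look roughly like the sketch below. It is only an illustration using the Python standard library: the function name `build_form_post`, the URL and the field names are all hypothetical and not taken from the add-on or from SahanaPy. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(form_url, ocr_fields):
    """Build (but do not send) a POST request carrying the OCR results."""
    # Encode the recognised values as an ordinary HTML form submission.
    body = urlencode(ocr_fields).encode("utf-8")
    return Request(
        form_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical URL and field names, for illustration only.
request = build_form_post(
    "http://localhost:8000/example/form",
    {"first_name": "Jane", "last_name": "Doe"},
)
```

Passing `request` to `urllib.request.urlopen` would submit it. Scanner access (TWAIN or SANE) and the OCR step itself are left out of this sketch.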
Another possibility is to use pytesser ( http://code.google.com/p/pytesser/ ) with the cross-platform tesseract-ocr ( http://code.google.com/p/tesseract-ocr/ ).
Plone uses Tesseract: http://plone.org/documentation/tutorial/ocr-in-plone-using-tesseract-ocr
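To give an idea of how little glue the Tesseract route needs, here is a minimal sketch that calls the `tesseract` command-line tool directly from Python instead of going through pytesser. It assumes the `tesseract` binary is installed and on the PATH; the helper names `build_tesseract_command` and `ocr_image` are hypothetical.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on an image file and return the recognised text.

    Assumes the tesseract binary is installed and on the PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        output_base = os.path.join(tmp, "ocr_result")
        command = build_tesseract_command(image_path, output_base, lang)
        subprocess.run(command, check=True)
        # tesseract appends ".txt" to the output base name.
        with open(output_base + ".txt", encoding="utf-8") as handle:
            return handle.read()
```

For a scanned form saved as an image file, `ocr_image("scanned_form.png")` would return the raw recognised text. Mapping that text onto individual form fields is a separate problem that this sketch does not address.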