Sahana Eden Optical Character Recognition
The Sahana OCR module generates an OCR-able PDF form for every resource form available in HTML. Two interfaces are available, depending on the use case.
- Single Form Upload: Use this interface when staff are available to upload and verify the OCR-ed data one form at a time. The user uploads a scanned form, lets Eden OCR it, then verifies the data and updates the database with the new record.
- Bulk Form Upload: Use this interface when there are not enough staff to upload and verify each form individually. Scanned forms are uploaded in bulk and verified later.
Phases
The workflow of the OCR module is composed of two phases:
- Download PDF Form
- Upload Scanned Form
Download PDF Form
A PDF form for a resource can be downloaded using the buttons in the create/update UI of any resource. Alternatively, it can be downloaded directly from the following links.
http://127.0.0.1:8000/eden/modulePrefix/moduleSuffix/create.pdf
and
http://127.0.0.1:8000/eden/modulePrefix/moduleSuffix/recordId/componentName/create.pdf
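The two link patterns above differ only in the optional recordId/componentName segment, so they can be assembled by one small helper. The following is a minimal sketch, not part of Eden itself: the function name `ocr_form_url` and its parameters are hypothetical, the base URL is the local development address used in the examples above, and the values `pr`, `person` and `address` in the usage lines are illustrative only.

```python
from typing import Optional

# Local development server used in the example links above (adjust as needed).
BASE_URL = "http://127.0.0.1:8000/eden"


def ocr_form_url(module_prefix: str,
                 module_suffix: str,
                 record_id: Optional[int] = None,
                 component_name: Optional[str] = None,
                 action: str = "create") -> str:
    """Build the URL of the PDF form for a resource or one of its components.

    Hypothetical helper: it only mirrors the URL patterns shown above.
    """
    # The component form needs both the parent record and the component name.
    if (record_id is None) != (component_name is None):
        raise ValueError("record_id and component_name must be given together")

    parts = [BASE_URL, module_prefix, module_suffix]
    if record_id is not None:
        parts += [str(record_id), component_name]
    parts.append(f"{action}.pdf")
    return "/".join(parts)


# Illustrative values only; substitute the prefix/suffix of a real resource.
print(ocr_form_url("pr", "person"))
print(ocr_form_url("pr", "person", record_id=5, component_name="address"))
```

Passing only one of `record_id` and `component_name` raises a `ValueError`, since the component pattern requires both segments.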
Upload Scanned Form
A scanned form can be uploaded in two different ways.
1. Individual Upload
For this use case, the user uploads the scanned OCR form through the web UI. The scanned form can be either a single PDF file or multiple image files, where each file corresponds to one page. The OCR upload button is available in the create/update UI of any resource. Alternatively, the form can be uploaded directly from the following links.
http://127.0.0.1:8000/eden/modulePrefix/moduleSuffix/upload.pdf
and
http://127.0.0.1:8000/eden/modulePrefix/moduleSuffix/recordId/componentName/upload.pdf
In short, there are two options for uploading an OCR form:
- Upload one image per page.
- Upload the scanned OCR form as a single PDF.
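The two upload options can be told apart from the file names alone before anything is sent to the server. The sketch below is an assumption-laden illustration rather than Eden code: the function `classify_upload` is hypothetical, and the set of accepted image extensions is a guess, since this page does not say which image formats the module accepts.

```python
from pathlib import Path
from typing import List

# Assumed image extensions; the formats actually accepted are not stated here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def classify_upload(filenames: List[str]) -> str:
    """Return "pdf" for a single PDF, "images" for one image file per page.

    Hypothetical helper: raises ValueError for anything else, such as a mix
    of PDFs and images, several PDFs, or an empty list.
    """
    suffixes = [Path(name).suffix.lower() for name in filenames]

    if len(suffixes) == 1 and suffixes[0] == ".pdf":
        return "pdf"
    if suffixes and all(suffix in IMAGE_SUFFIXES for suffix in suffixes):
        return "images"
    raise ValueError("upload either one PDF file or one image file per page")


print(classify_upload(["scanned_form.pdf"]))
print(classify_upload(["page1.png", "page2.png"]))
```

Checking the selection up front gives the user a clear error before the scan is processed, instead of a failed OCR run afterwards.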
2. Bulk Upload
TODO