Integrating/Developing a framework to extract structured data from web sources with a simple query language.
Some links that might be useful:
- PDFMiner is a tool for converting PDF documents into text; it is open source (Licence). Hacking on its source code would be a good option for coding the importing tool.
- Spreadsheet Importer
The Spreadsheet Importer will be a component of this.
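
To make the idea of extracting structured data with a simple query more concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the project's design: the names `TableExtractor`, `extract_records` and `query`, the dictionary-based query form, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <tr> found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text chunks of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_records(html):
    """Use the first table row as field names and the rest as records."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def query(records, select, where=None):
    """Hypothetical mini query: filter on exact matches, then project fields."""
    where = where or {}
    return [
        {field: rec.get(field) for field in select}
        for rec in records
        if all(rec.get(key) == value for key, value in where.items())
    ]


# Made-up sample input, standing in for a fetched web page.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>kind</th><th>qty</th></tr>
  <tr><td>hammer</td><td>tool</td><td>3</td></tr>
  <tr><td>apple</td><td>food</td><td>12</td></tr>
  <tr><td>saw</td><td>tool</td><td>1</td></tr>
</table>
"""

records = extract_records(SAMPLE_HTML)
tools = query(records, select=["name", "qty"], where={"kind": "tool"})
```

All values come out as strings here; a real importer would also need type conversion, handling of nested or multiple tables, and a way to fetch the page in the first place.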