Blueprint for an Upgrade to the Translation Functionality
This is being actioned in GSoC 2012.
Sahana Eden uses the translation features of web2py. The application programmer must wrap every string that will be displayed on the screen, and so will need to be translated, in the function T(). The core web2py translation functionality extracts the strings and stores them in a *.py file. Unfortunately, this file only has space for the original and the translated string. For the translator it is often helpful to have a comment that explains the context of a particular phrase. Other features to be introduced as part of this work also need to know the origin of the string: the file and possibly also the line number. (Note that a string may have more than one origin, i.e. it may appear in the code in more than one place.)
Currently the strings for translation are held in a *.py file in the languages directory. To test this, create a new language (see deployment_settings.L10n.languages in 000_config.py), create an empty file for it in the languages directory, and then visit a few Sahana-Eden pages with the new language selected. Now go back to the code and see that the file has been populated with some new strings.
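As a rough illustration of what such a file holds, the sketch below parses the text of a language file (a single Python dict literal mapping each original string to its translation) and lists the entries that still look untranslated. The helper names and the sample strings are hypothetical, and the rule used to spot an untranslated entry (empty value, value equal to the key, or a '*** ' prefix) is an assumption that may not match every web2py version.

```python
import ast

def load_language_file(text):
    """Parse the text of a languages/*.py file into a dict."""
    # The file body is a dict literal, so literal_eval can read it
    # without executing any code.
    data = ast.literal_eval(text)
    if not isinstance(data, dict):
        raise ValueError("expected a dict literal")
    return data

def is_untranslated(original, translated):
    # Assumption: an entry is untranslated when the value is empty,
    # equals the key, or carries a '*** ' prefix.
    return (not translated
            or translated == original
            or translated.startswith("*** "))

# Hypothetical file content, for illustration only
sample = """{
'Add Person': 'Ajouter une personne',
'Search': '*** Search',
'Save': 'Save',
}
"""

strings = load_language_file(sample)
pending = sorted(k for k, v in strings.items() if is_untranslated(k, v))
print(pending)
```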
The extracted strings with location could be held in a temporary place different from the languages file. This could be a plain file or possibly a database. Extra data that would be useful to retain here would be:
- the location (file name and line number),
- any comment to the translator,
- translation status,
- date of extraction.
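One possible shape for such a record is sketched below. The class name, field names, status values and file paths are all hypothetical; the blueprint does not prescribe a storage format, so this only shows that the listed data fits in a small structure that allows several origins per string.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

@dataclass
class ExtractedString:
    """One string found in the code, with the extra data listed above."""
    original: str
    # A string may appear in several places: (file name, line number) pairs
    locations: List[Tuple[str, int]] = field(default_factory=list)
    comment: Optional[str] = None        # note for the translator
    status: str = "untranslated"         # hypothetical status value
    extracted_on: date = field(default_factory=date.today)

    def add_location(self, filename, lineno):
        """Record another origin, ignoring exact duplicates."""
        location = (filename, lineno)
        if location not in self.locations:
            self.locations.append(location)

# Hypothetical paths, for illustration only
entry = ExtractedString("Add Person", comment="Button label")
entry.add_location("controllers/pr.py", 42)
entry.add_location("views/pr/index.html", 7)
entry.add_location("controllers/pr.py", 42)   # duplicate, not stored twice
```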
Round-trip translation UI
The journey from an untranslated string to a translated one can involve several different tools and can take different paths. The idea here is to integrate these tools (so this is not really about replacing them) into a single portal. One such route is as follows:
- Extract the strings from Eden
- build a spreadsheet that can be given to the translators
- convert the spreadsheet back to a *.py language file
- add back into Eden trunk
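The spreadsheet leg of this route can be sketched with the standard csv module. This is a minimal sketch, not the existing Eden or Translate Toolkit implementation: the two-column layout, the function names and the sample strings are assumptions made for illustration.

```python
import ast
import csv
import io

def strings_to_csv(strings):
    """Write {original: translation} as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["original", "translation"])
    for original in sorted(strings):
        writer.writerow([original, strings[original]])
    return buf.getvalue()

def csv_to_language_file(csv_text):
    """Turn the translators' CSV back into the text of a dict-literal file."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    lines = ["{"]
    for row in reader:
        if len(row) >= 2:
            # repr() escapes quotes so each entry is a valid string literal
            lines.append("%r: %r," % (row[0], row[1]))
    lines.append("}")
    return "\n".join(lines) + "\n"

# Hypothetical strings, for illustration only
extracted = {
    "Add Person": "Ajouter une personne",
    "Hello, world": "",
    "It's done": "C'est fait",
}
sheet = strings_to_csv(extracted)
restored = ast.literal_eval(csv_to_language_file(sheet))
```

Reading the generated file text back with ast.literal_eval is a cheap check that the round trip loses nothing before the file is added back to trunk.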
Then the code changes and a string needs to be modified. At this point the person managing the translation has no way of knowing that the old string is no longer required, so redundant strings hang around in the translated files. Likewise, strings that were never translated and have since been superseded are still listed as awaiting translation.
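Given a fresh extraction from the code, redundant entries can be found by comparing keys. The sketch below assumes the language file has already been loaded as a dict and that the current strings are available as a collection; the function name and sample data are hypothetical.

```python
def split_stale(language_strings, extracted):
    """Compare a language file with the strings currently in the code.

    Returns (kept, stale, missing):
      kept    - entries whose original string still exists in the code
      stale   - originals in the file that no longer appear in the code
      missing - originals in the code that the file does not contain yet
    """
    current = set(extracted)
    kept = {k: v for k, v in language_strings.items() if k in current}
    stale = sorted(k for k in language_strings if k not in current)
    missing = sorted(current - set(language_strings))
    return kept, stale, missing

# Hypothetical data: "Add Persons" was reworded to "Add People" in the code
language_file = {"Add Persons": "Ajouter des personnes", "Search": "Chercher"}
in_code = ["Add People", "Search"]

kept, stale, missing = split_stale(language_file, in_code)
```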
A second, different approach to translating the strings is to use the Pootle tool. Integrating it would be nice, but the key requirement is to be able to identify the strings for translation.
Once the strings for translation have been identified then the UI needs to provide a seamless mechanism to integrate them back into Sahana-Eden.
This is purely indicative and may change
- Being able to extract details concerning where a string is located within the code base
- Automate the translation round-trip (initial focus on the spreadsheet approach rather than Pootle)
- Build a GUI front end to present the translation status
- Extract for a module or set of modules (ideally this will be generated without having to page through Eden web pages)
- Report on the translation status (percentage translated per module)
- Display and edit translation strings (optional, but useful if translators have questions about a particular string; probably design this now and implement it later)
- Identify roles (translation manager, translator...)
- Allow comments to be added to the T() function and let translators indicate when comments would be useful. Possibly consider adding these dynamically to the code base
- Validate the translations and original strings for fitness (this will require some investigation into what it could entail; one idea would be to ensure that %s variable substitutions are properly labeled)
- more to come...
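The validation idea in the list above could start with a check that a translation keeps the same substitution markers as its original. The sketch below is one assumed interpretation of "fitness": the regular expression covers only a few common %-style forms (such as %s, %d and %(name)s), and the function names are hypothetical.

```python
import re

# Matches a literal '%%' or a simple placeholder such as %s, %d, %(name)s
PLACEHOLDER = re.compile(r"%%|%(?:\(\w+\))?[sdifr]")

def placeholders(text):
    """Return the sorted placeholders in text, ignoring escaped '%%'."""
    return sorted(m for m in PLACEHOLDER.findall(text) if m != "%%")

def same_placeholders(original, translation):
    """True when both strings use the same set and number of placeholders."""
    return placeholders(original) == placeholders(translation)

ok = same_placeholders("%(count)s records found",
                       "%(count)s enregistrements")
broken = same_placeholders("%s of %s", "%s de")
```

Sorting before comparing lets named placeholders change order in the translation, which word order in the target language often requires. The order of unnamed %s markers cannot be checked this way.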
Community bonding period
- Learning Goals : I will use this time to familiarize myself with the Eden code structure, particularly the existing translation module. This will include:
- Understanding Python's parser library and how the current code uses the parse tree.
- Exploring the Pootle software and its features.
- Studying the relevant modules of the Translate Toolkit.
- Outstanding blueprint Questions : During this period I will discuss the scope of the project with the mentors and refine it where needed. We will also discuss the design and implementation details to prepare for the coding phase.
- Initial Tasks :
- Use the parser library to try and extract data from the code. Once I am comfortable with the parser, I can use it to extract the currently active template and modules from the corresponding files as mentioned earlier.
- Discuss and work on implementing the script that handles merge conflicts when merging translations from pull requests and Pootle.
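Extracting strings together with their location can be sketched as follows. Note the substitution: this uses the standard ast module rather than the parser library named above, and it assumes the source file parses with the interpreter running the script. The function name, sample source and file name are hypothetical, and only calls whose first argument is a plain string literal are picked up.

```python
import ast

def find_t_calls(source, filename="<string>"):
    """Return (string, filename, line number) for each T("...") call."""
    calls = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "T"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            calls.append(node)
    # Report in source order: by line, then by column
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    return [(n.args[0].value, filename, n.lineno) for n in calls]

# Hypothetical controller code, for illustration only
sample = '''title = T("Add Person")
def index():
    return dict(label=T("Search"), again=T("Add Person"))
'''

found = find_t_calls(sample, "controllers/example.py")
for text, path, line in found:
    print("%s:%d: %s" % (path, line, text))
```

The same string appears twice in the result with different line numbers, which matches the earlier observation that a string may have more than one origin.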
Mid Term Evaluation
I plan to provide the following features by mid-term:
- Excluding deprecated strings
- Retrieving strings from active templates
- Including prepop csv files
- Providing an option to select all templates in the GUI
Final Evaluation
The final project will contain the following deliverables (in addition to those mentioned above):
- Handling conflicts from pull requests
- Handling conflicts from Pootle
- Syncing Pootle with web2py with respect to translation
- Removing external dependency on Translate Toolkit
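Conflict handling for the first two deliverables could follow a three-way merge, sketched below under stated assumptions: each source (for example a pull request and a Pootle export) is available as a dict of translations, along with the common version both started from. The function name, the policy of keeping the first side on conflict, and the sample data are hypothetical choices, not the project's decided design.

```python
def merge_translations(base, ours, theirs):
    """Three-way merge of {original: translation} dicts.

    A key absent from one side is treated as unchanged from base.
    Returns (merged, conflicts); conflicts maps each key changed
    differently on both sides to its (ours, theirs) pair.
    """
    merged, conflicts = {}, {}
    for key in sorted(set(ours) | set(theirs)):
        b = base.get(key)
        o = ours.get(key, b)
        t = theirs.get(key, b)
        if o == t or t == b:
            value = o              # same result, or only ours changed
        elif o == b:
            value = t              # only theirs changed
        else:
            conflicts[key] = (o, t)
            value = o              # assumed policy: keep ours, report it
        if value is not None:
            merged[key] = value
    return merged, conflicts

# Hypothetical data, for illustration only
base = {"Save": "Save", "Search": "Search"}
pull_request = {"Save": "Enregistrer", "Search": "Chercher"}
pootle = {"Save": "Sauvegarder", "Search": "Search", "Help": "Aide"}

merged, conflicts = merge_translations(base, pull_request, pootle)
```

Returning the conflicts separately, rather than resolving them silently, leaves the final decision to the translation manager.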
- For the translation process see the chapter in the book on Localisation
- For the translation code see
- For translating to and from a spreadsheet see the Translate Toolkit, also mentioned in the Eden book. (Versions are packaged as .deb and .rpm; just search for translate-toolkit in your package manager.)
- The survey application uses its own translation so that the survey forms can be translated independently of the application. Check the code for this.