Version 13 (modified by nownikhil, 11 years ago)


Blueprint for an Upgrade to the Translation Functionality

This is being actioned in GSoC 2012.

Current situation

Sahana Eden uses the translation features of web2py. The application programmer must wrap any string that will be displayed on the screen, and so will need to be translated, in a call to the function T(). The core web2py translation functionality extracts these strings and stores them in a *.py file. Unfortunately, this file only has space for the original and the translated string. For the translator it is often helpful to have a comment that explains the context of a particular phrase. Other features to be introduced as part of this work also need to know the origin of the string, that is, the file and possibly also the line number. (Note that a string may have more than one origin, i.e. it may appear in the code in more than one place.)

Currently the strings for translation are held in a *.py file in the languages directory. To test this, create a new language (see deployment_settings.L10n.languages), create an empty file for it in the languages directory, and then visit a few Sahana-Eden pages with the new language selected. Now go back to the code and see that the file has been populated with some new strings.

The extracted strings with location could be held in a temporary place different from the languages file. This could be a plain file or possibly a database. Extra data that would be useful to retain here would be:

  • the location,
    • file name
    • line number
  • any comment to the translator,
  • translation status,
  • date of extraction,
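
As a rough illustration of the list above, the record kept for each extracted string could look like the following. This is only a sketch: the class name, field names, status values and example paths are assumptions made for illustration and are not part of Eden or web2py.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ExtractedString:
    """One translatable string plus the extra data listed above (hypothetical layout)."""
    original: str
    # A string may occur in several places, so keep every (file name, line number) pair.
    locations: List[Tuple[str, int]] = field(default_factory=list)
    comment: Optional[str] = None        # any comment to the translator
    status: str = "untranslated"         # translation status (assumed values)
    extracted_on: date = field(default_factory=date.today)  # date of extraction


# Invented example data, for illustration only.
record = ExtractedString(
    original="Add Person",
    locations=[("controllers/pr.py", 42), ("modules/s3db/pr.py", 310)],
    comment="Button label",
)
```

Whether such records live in a plain file or a database is left open here, as it is in the text above.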

Changes to the core translation functionality

The standard web2py function should be extended to support an optional comment. Ideally this would be done within the call to T(), but an alternative solution may be acceptable. Note that what we don't want to do unless absolutely necessary (and even then we don't want to do it) is change either the name of the function or its required parameters, because that would require numerous changes within the existing code base. So the comment could be an optional parameter, but the function must use reflection techniques to work out its own location.
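
A minimal sketch of the idea, assuming a stand-in function rather than web2py's real T(): the comment is an optional keyword argument, so existing calls keep working, and the standard library's inspect module is used to find the caller's file and line. The EXTRACTED registry is a hypothetical name introduced only for this example.

```python
import inspect

# Hypothetical in-memory registry: original string -> list of (file, line, comment).
EXTRACTED = {}


def T(message, comment=None):
    """Stand-in for a translation function that records where it was called from.

    This is not web2py's T(); it only illustrates the reflection step.
    """
    caller = inspect.currentframe().f_back  # frame of the code that called T()
    location = (caller.f_code.co_filename, caller.f_lineno, comment)
    EXTRACTED.setdefault(message, []).append(location)
    return message  # a real implementation would return the translated string


label = T("Add Person")  # existing call style still works unchanged
hint = T("Close", comment="Verb: close a dialog, not 'nearby'")
```

Because the name and the required parameter are unchanged, no existing call site would have to be edited under this approach.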

When an instance of Sahana-Eden is deployed it may require translation. If this is during an emergency there may not be a lot of time, and so only a select subset of strings needs to be translated. This subset might be the front-facing part of the deployment, or it may include just the modules that have been deployed. However, when the files for translation are created there is no link back to the code, and so it is difficult to know where a string came from. What is required is for the function to extract the location of the string, which can then be used to determine which module the string belongs to. This functionality may go through several iterations, but the final goal is to be able to download the strings for translation that belong to selected modules. This might involve a UI that allows the translation administrator to select which modules (or parts) of the system they want to translate.
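
Once locations are known, grouping strings by module is straightforward. The sketch below assumes, purely for illustration, that the module name is the source file name without its extension; the real mapping from file to module in Eden may well be different, and the paths are invented.

```python
import os
from collections import defaultdict


def module_for(path):
    """Guess a module name from a source path (assumed convention:
    the module is the file name without its extension)."""
    return os.path.splitext(os.path.basename(path))[0]


def strings_by_module(locations):
    """locations maps each string to the list of file paths it was found in."""
    grouped = defaultdict(set)
    for text, paths in locations.items():
        for path in paths:
            grouped[module_for(path)].add(text)
    return grouped


# Illustrative data only; the paths are invented.
found = {
    "Add Person": ["controllers/pr.py", "controllers/hms.py"],
    "Add Hospital": ["controllers/hms.py"],
}
grouped = strings_by_module(found)
selected = sorted(grouped["pr"])  # strings a translator would receive for module "pr"
```

A string that appears in several modules is listed under each of them, so selecting any one of those modules is enough to include it.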

Round-trip translation UI

The journey from an untranslated string to a translated one can involve several different tools and can take different paths. The idea here is to integrate these tools (so this is not really about replacing them) into a single portal. One such route is as follows:

  • Extract the strings from Eden
  • build a spreadsheet that can be given to the translators
  • convert the spreadsheet back to a *.py language file
  • add back into Eden trunk
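
The spreadsheet route above can be sketched with the standard csv module. This assumes that a language file is a single Python dictionary literal mapping each original string to its translation, and it uses CSV as a stand-in for the spreadsheet format; the function names and column headings are hypothetical.

```python
import ast
import csv
import io


def strings_to_csv(strings, out):
    """Write one row per string: original text plus an empty translation column."""
    writer = csv.writer(out)
    writer.writerow(["original", "translation"])
    for text in sorted(strings):
        writer.writerow([text, ""])


def csv_to_language_source(csv_file):
    """Turn a filled-in sheet into the text of a language file."""
    rows = csv.DictReader(csv_file)
    translated = {r["original"]: r["translation"] for r in rows if r["translation"]}
    lines = ["# -*- coding: utf-8 -*-", "{"]
    for original in sorted(translated):
        lines.append("%r: %r," % (original, translated[original]))
    lines.append("}")
    return "\n".join(lines) + "\n"


sheet = io.StringIO()
strings_to_csv({"Add Person", "Close"}, sheet)

# Simulate the translator filling in one row (French, for illustration).
filled = io.StringIO(sheet.getvalue().replace("Close,", "Close,Fermer"))
source = csv_to_language_source(filled)
restored = ast.literal_eval(source)  # check that the generated text parses as a dict
```

Rows left empty by the translator are simply omitted from the generated file in this sketch; a real tool would need to decide how untranslated strings should be represented.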

Then the code changes and a string needs to be modified. At this point the person managing the translation doesn't specifically know that a string is no longer required, and so redundant strings hang around in the translated files. Likewise, some strings that have not yet been translated, but have since been superseded, are still awaiting translation.
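
Given the set of strings currently extracted from the code, redundant entries can be found by a simple comparison. The sketch below assumes that an entry counts as untranslated when it is missing, empty, or identical to the original string; that convention, the function names and the sample data are illustrative assumptions.

```python
def is_translated(language_file, text):
    """Assumed convention: missing, empty or unchanged values are untranslated."""
    value = language_file.get(text)
    return bool(value) and value != text


def classify(language_file, extracted):
    """Compare a language file (original -> translation) with the strings
    currently found in the code."""
    current = set(extracted)
    stale = sorted(key for key in language_file if key not in current)
    pending = sorted(text for text in current
                     if not is_translated(language_file, text))
    return stale, pending


# Invented example data.
language_file = {
    "Add Person": "Ajouter une personne",
    "Old Label": "Ancien libellé",  # no longer present in the code
    "Close": "Close",               # present but not yet translated
}
extracted = ["Add Person", "Close", "Save"]
stale, pending = classify(language_file, extracted)
```

Here stale lists the strings that could be dropped from the translated file, and pending lists those that still need a translator's attention.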

A second, and different, approach to translating the strings is to use the Pootle tool. Integrating this would be nice, but the key area is being able to identify the strings for translation.

Once the strings for translation have been identified then the UI needs to provide a seamless mechanism to integrate them back into Sahana-Eden.


This is purely indicative and may change

  • Being able to extract details concerning where a string is located within the code base
  • Automate the translation round-trip (initial focus on the spreadsheet approach rather than Pootle)
  • Build a GUI front end to present the translation status
    • Extract for a module or set of modules (ideally this will be generated without having to page through Eden web pages)
    • Report on the translation status (percentage translated per module)
    • Display and edit translation strings (optional, but can be useful if translators have questions about a particular string; probably design here but implement later)
    • Identify roles (translation manager, translator...)
  • Allow comments to be added to the T() function and let translators indicate when comments would be useful. Possibly consider adding these dynamically to the code base
  • Validate the translations and original strings for fitness (this will require some investigation into what this could entail; one idea would be to ensure that %s variable substitutions are properly labeled)
  • more to come...
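
One possible reading of the validation item above is to check that a translation uses the same substitution placeholders as the original string. The sketch below is an assumption about what "fitness" could mean, not a settled design: it only handles simple printf-style placeholders such as %s, %d and %(name)s, and the function names are hypothetical.

```python
import re

# Simple printf-style placeholders: %s, %d, %(name)s and similar.
PLACEHOLDER = re.compile(r"%(?:\(\w+\))?[sdifr]")


def placeholders(text):
    """Return the sorted placeholders in text, ignoring literal '%%'."""
    return sorted(PLACEHOLDER.findall(text.replace("%%", "")))


def same_placeholders(original, translation):
    """True when both strings contain exactly the same placeholders."""
    return placeholders(original) == placeholders(translation)


ok = same_placeholders("%s records found", "%s enregistrements trouvés")
renamed = same_placeholders("%(count)s records", "%(nombre)s enregistrements")
missing = same_placeholders("%s of %s", "%s")
```

In this example ok is True, while renamed and missing are False, because a translated placeholder name or a dropped placeholder would break the substitution at run time.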


Community bonding period

  • Learning Goals : I will utilize this time to familiarize myself with the Eden code structure, particularly the existing translation module. This will include:
    • Understanding Python's parser library and how the current code uses the parse tree.
    • Exploring the Pootle software and its features.
    • Studying the relevant modules of the Translate Toolkit.
  • Outstanding blueprint Questions : The scope of the project will be discussed, modified and understood by me during this period. Also, the design and implementation details will be discussed with the mentors to prepare for the Coding phase.
  • Initial Tasks :
    • Use the parser library to try and extract data from the code. Once I am comfortable with the parser, I can use it to extract the currently active template and modules from the corresponding files as mentioned earlier.
    • Discuss and work on implementing the script that handles merge conflicts when merging translations from pull requests and Pootle.
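
As a sketch of extracting strings from the code without paging through web pages, the example below walks a syntax tree and collects every T() call whose first argument is a string literal. Note that it uses the standard library's ast module rather than the parser library named above (parser was removed in Python 3.10), so it illustrates the approach rather than the existing Eden code. The function name and sample source are hypothetical.

```python
import ast


def find_t_calls(source, filename="<string>"):
    """Return (text, filename, line number) for each T("literal") call."""
    found = []
    for node in ast.walk(ast.parse(source, filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "T"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.args[0].value, filename, node.lineno))
    # ast.walk gives no ordering guarantee that is useful here, so sort by line.
    return sorted(found, key=lambda item: item[2])


sample = '''
title = T("Add Person")
def label():
    return T("Close")
message = T(some_variable)   # not a literal, so it is skipped
'''
calls = find_t_calls(sample, "example.py")
```

Calls such as T(some_variable) cannot be resolved statically, so they are skipped; strings built at run time would still need another extraction route.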

Mid Term Evaluation

I plan to provide the following features by mid-term:

  • Excluding deprecated strings
  • Retrieving strings from active templates
  • Including prepop csv files
  • Providing an option to select all templates in the GUI

Final Evaluation

The final project will contain the following deliverables (apart from those mentioned above):

  • Handling conflicts from pull requests
  • Handling conflicts from Pootle
  • Syncing Pootle with Web2py with respect to translation
  • Removing external dependency on Translate Toolkit
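
One way conflict handling could work, offered only as a sketch, is a three-way merge: compare the translations coming from a pull request and from Pootle against a common base, accept changes made on one side only, and report strings changed differently on both sides. The function name, the idea of a shared base file and the sample data are all assumptions for illustration.

```python
def merge_translations(base, from_pull_request, from_pootle):
    """Three-way merge of translation dictionaries (original -> translation).

    Returns (merged, conflicts); conflicting keys keep their base value in
    merged and are reported with both candidate translations.
    """
    merged, conflicts = dict(base), {}
    for key in set(from_pull_request) | set(from_pootle):
        old = base.get(key)
        a = from_pull_request.get(key, old)
        b = from_pootle.get(key, old)
        if a == b or b == old:
            merged[key] = a      # both agree, or only the pull request changed it
        elif a == old:
            merged[key] = b      # only Pootle changed it
        else:
            conflicts[key] = (a, b)  # changed differently on both sides
    return merged, conflicts


# Invented example data.
base = {"Close": "Fermer", "Save": "Sauver"}
pull_request = {"Save": "Enregistrer", "Add": "Ajouter"}
pootle = {"Save": "Sauvegarder", "Close": "Fermer"}
merged, conflicts = merge_translations(base, pull_request, pootle)
```

Conflicts are returned rather than resolved automatically, on the assumption that a translation manager would choose between the two candidates.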

Technical help

  • For the translation process see the chapter in the book on Localisation
  • For the translation code see web2py/gluon/
  • For translating to and from a spreadsheet see the Translate Toolkit, also mentioned in the Eden book. (Versions are packaged for .deb and .rpm; just search for translate-toolkit within your package manager.)
  • The survey application uses its own translation so that the survey forms can be translated independently of the application. Check the code for this in eden/applications/modules/s3db/
