Blueprint for an Upgrade to the Translation Functionality
The current translation functionality in Sahana Eden does the following (most of it lives in the s3translate.py file):
- Provides a menu for selecting the modules whose strings are to be translated (it does not default to the modules of the active template).
- Extracts strings from the selected modules using a parse tree approach. It also extracts the strings of deployment.settings variables (but not database variables).
- Exports strings in xls and po formats.
- Merges uploaded translations (in csv) with the existing .py language file (it does not overwrite).
- Pootle translations are not currently synced.
- It does not account for conflicts caused by pulls and pomerge.
- It has external dependencies because it calls methods in Translate Toolkit.
This is purely indicative and may change
- Able to retrieve strings from currently active modules.
- Including prepop CSV files
- Pootle Integration
- Excluding Deprecated Strings.
- Conflict in strings due to pull requests.
- Avoiding system calls in Translate Toolkit.
- Removal of Deprecated Strings : The ".py" language files keep growing if new strings are only ever merged into the existing ones. As the code changes, some strings become deprecated and new strings are introduced. We can therefore periodically run the extraction with all modules selected and replace the existing files, which removes the deprecated strings and makes the new strings available for translation. The existing translation module already supports this through its "-o" option, which overwrites the existing ".py" files instead of merging with them. This can be set up as a scheduler job that runs periodically, or it can be triggered manually by the admin when appropriate.
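As a rough illustration of the overwrite idea above, the sketch below rebuilds a language dictionary from a freshly extracted list of strings. It is not the code behind the "-o" option; `prune_deprecated` is a hypothetical helper, and representing an untranslated string by an empty value is an assumption made for the example.

```python
def prune_deprecated(existing, extracted):
    """Rebuild a language dict from freshly extracted source strings.

    existing  -- dict mapping source string -> translation (old language file)
    extracted -- iterable of source strings found in the current code
    Returns (kept, dropped): the new dict and the list of deprecated strings.
    """
    current = set(extracted)
    # Keep old translations for strings that still exist; new strings get an
    # empty placeholder (assumption: empty means "not yet translated").
    kept = {source: existing.get(source, "") for source in sorted(current)}
    # Anything in the old file that was not extracted again is deprecated.
    dropped = sorted(set(existing) - current)
    return kept, dropped


existing = {"Add Person": "Ajouter une personne", "Old Label": "Ancien libellé"}
extracted = ["Add Person", "Add Shelter"]
kept, dropped = prune_deprecated(existing, extracted)
```

Returning the dropped strings separately makes it possible to log or review what a periodic run removed.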
- Retrieval of strings from currently active template : Currently, we don’t have an option to check which strings are present in the active template. This can be done as follows:
- Use the parse tree approach to parse out the currently active template from 000_config.py
- Next, we parse the eden/private/templates/<current-template>/config.py to get the active modules of that template
- So, only these modules will be checked by default (when showing the module selection page)
- Hence, we know which modules correspond to the current template and this can be used to extract only the relevant strings.
- Build a dependency system so that when a particular module is selected all the modules it depends on are also selected.
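The first step above (reading the active template without executing the config file) could be sketched as follows. This uses Python's `ast` module rather than the `parser` library mentioned in this blueprint, and it assumes that 000_config.py contains an assignment of a string to an attribute named `template`, such as `settings.base.template = "..."`. The function name and the sample config text are hypothetical.

```python
import ast


def find_active_template(config_source):
    """Return the string assigned to any `<...>.template` attribute, else None.

    The source is only parsed, never executed, so commented-out lines and
    undefined names in the config text are not a problem.
    """
    template = None
    for node in ast.walk(ast.parse(config_source)):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if (isinstance(target, ast.Attribute)
                    and target.attr == "template"
                    and isinstance(node.value, ast.Constant)
                    and isinstance(node.value.value, str)):
                # A later assignment overrides an earlier one.
                template = node.value.value
    return template


# Hypothetical excerpt of a config file; "mytemplate" is a made-up name.
SAMPLE_CONFIG = (
    'settings.base.debug = True\n'
    '# settings.base.template = "old"\n'
    'settings.base.template = "mytemplate"\n'
)
active = find_active_template(SAMPLE_CONFIG)
missing = find_active_template("x = 1\n")
```

The same walk-the-tree approach could then be applied to the template's config.py to collect the active modules, although the exact shape of that file would need to be checked first.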
- Including database variables : We need to extract the strings in database variables so that they too can be translated. Currently, these variables are excluded from translation. Hence, one approach to extract these strings is as follows:
- Use the prepop csv files in private/templates/<current-template> and mark them to be considered for translation.
- Use S3BulkImporter to get the list of csv files from the tasks.cfg file in the template folder.
- Use S3Represent to determine which csv files are to be considered for translation.
- Only the name field is currently considered for translation; a framework for selecting the relevant fields from a csv file still needs to be designed.
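A minimal sketch of the field selection discussed above, using only the standard `csv` module. It does not use S3BulkImporter or S3Represent; `extract_csv_strings`, its `fields` parameter and the sample data are hypothetical and only show how a configurable list of columns could generalise the current "name only" behaviour.

```python
import csv
import io


def extract_csv_strings(csv_text, fields=("name",)):
    """Collect unique, non-empty values of the given columns, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    strings = []
    seen = set()
    for row in reader:
        for field in fields:
            # Missing columns or empty cells are simply skipped.
            value = (row.get(field) or "").strip()
            if value and value not in seen:
                seen.add(value)
                strings.append(value)
    return strings


# Hypothetical prepop data; real files may use different column names.
SAMPLE = "name,abrv\nHealth,HLT\nShelter,SHL\nHealth,HLT2\n"
names_only = extract_csv_strings(SAMPLE)
names_and_abrv = extract_csv_strings(SAMPLE, fields=("name", "abrv"))
```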
- Pootle Integration : We need to make sure that the translations in Pootle are kept in sync with those in the ".py" language files. Below are a few points to help us achieve this:
- Whenever we use the overwrite option to remove deprecated strings (as explained earlier), reflect these changes in Pootle too. This ensures that Pootle does not hold any old strings and that new strings are added as well.
- When merging from Pootle, we might run into conflicts (just as with a pull request). One possible solution is a script that identifies all such conflicts and stores them in a file, which translators can then resolve manually.
- Also, an option for uploading ".po" files will be provided (in addition to the current ".csv" files). Conflicts that arise when merging these can be handled as described above.
- Also, upload the updated .po file back to the server.
Hence, the translations in Pootle and web2py will be consistent.
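The conflict-identifying script suggested above might look roughly like this. It is a sketch under assumptions: both sides are already loaded as dictionaries mapping source string to translation, a conflict is defined as two different non-empty translations for the same source string, and JSON is chosen arbitrarily as the report format. The function names are hypothetical.

```python
import json


def find_conflicts(py_strings, po_strings):
    """Return {source: {"py": ..., "po": ...}} where both sides disagree."""
    conflicts = {}
    for source, py_translation in py_strings.items():
        po_translation = po_strings.get(source)
        # Only two differing, non-empty translations count as a conflict.
        if py_translation and po_translation and py_translation != po_translation:
            conflicts[source] = {"py": py_translation, "po": po_translation}
    return conflicts


def format_conflicts(conflicts):
    """Serialise conflicts to text that can be saved for translators."""
    return json.dumps(conflicts, ensure_ascii=False, indent=2, sort_keys=True)


py_strings = {"Yes": "Oui", "Save": "Enregistrer", "Cancel": ""}
po_strings = {"Yes": "Ouais", "Save": "Enregistrer", "Cancel": "Annuler"}
conflicts = find_conflicts(py_strings, po_strings)
report = format_conflicts(conflicts)
```

Strings that are translated on only one side are not reported, since they can be merged without manual intervention.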
- Version Control : Translated strings received through a pull request may conflict with what is already in the repository, so we have to prevent this merge conflict. It can be handled in the same way as for Pootle (manual intervention).
- Avoiding External Dependencies : The current code makes system calls to csv2po and po2web2py, and we want to remove these external dependencies. The link below gives a good reason why we should avoid this.
Community bonding period
- Learning Goals : I will use this time to familiarize myself with the Eden code structure, particularly the existing translation module. This will include:
- Understanding Python's parser library and how the current code uses the parse tree.
- Exploring the Pootle software and its features.
- Studying the relevant modules of Translate Toolkit.
- Outstanding blueprint Questions : During this period I will discuss the scope of the project with the mentors and refine it as needed. The design and implementation details will also be discussed with the mentors in preparation for the coding phase.
- Initial Tasks :
- Use the parser library to try and extract data from the code. Once I am comfortable with the parser, I can use it to extract the currently active template and modules from the corresponding files as mentioned earlier.
- Discuss and work on implementing the script that handles merge conflicts when merging translations from pull requests and pootle.
Mid Term Evaluation
I plan to provide the following features by mid-term:
- Excluding deprecated strings
- Retrieving strings from active templates
- Including prepop csv files
- Provide option to select all templates in GUI.
Final Evaluation
The final project will contain the following deliverables (in addition to those mentioned above):
- Handling conflicts from pull requests
- Handling conflicts from Pootle
- Syncing Pootle with Web2py with respect to translation
- Removing external dependency on Translate Toolkit
- Initially we merge the translations between Pootle and py such that the translations in py are given preference. This merged set of strings is reflected in both Pootle and py, which creates an initial sync between the two.
- When someone requests an Excel file for translation, we check Pootle for any strings updated since the last sync. If there are any, we merge them into the py files and provide the translator with an updated set of strings in the Excel file. This way py is regularly kept in sync with Pootle.
- After a translated csv file is submitted to Eden, we update both py and Pootle with the changes. (Anything changed in the csv implies that the earlier translation from Pootle has to be overwritten.)
- Note : Strings are never removed from Pootle, even if they are removed from the py files.
- Note : Strings missing in Pootle are added to Pootle.
- Implementation Details:
- Function merge_pootle takes two arguments (language_code, preference). preference defines which side wins in case of a conflict (either po or py).
- Function download_strings takes the language_code and returns the Pootle strings and the Python strings as a tuple.
- Function merge_strings takes two sets of strings and merges them on the basis of preference. If preference is given to Python, strings from the Python language files are added to Pootle.
- Function upload_to_pootle takes a language_code and a Pootle file and uploads the file to the Sahana server on Pootle using the overwrite option. The username and password are obtained from the configuration settings (L10n settings). Python's Mechanize library has been used to implement this.
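The preference-based merge described above can be sketched as follows. This is not the actual merge_strings implementation: the argument order, the use of plain dictionaries and the rule that an empty translation never overwrites a non-empty one are all assumptions made for the example.

```python
def merge_strings(po_strings, py_strings, preference="py"):
    """Merge two {source: translation} dicts; `preference` wins on conflict."""
    if preference not in ("py", "po"):
        raise ValueError("preference must be 'py' or 'po'")
    if preference == "py":
        preferred, other = py_strings, po_strings
    else:
        preferred, other = po_strings, py_strings
    merged = dict(other)
    for source, translation in preferred.items():
        # The preferred side overrides the other side, except that an empty
        # translation does not clobber an existing one (assumption).
        if translation or source not in merged:
            merged[source] = translation
    return merged


po = {"Yes": "Oui", "No": "Non", "Maybe": ""}
py = {"Yes": "Ouais", "No": "", "Save": "Enregistrer"}
prefer_py = merge_strings(po, py, preference="py")
prefer_po = merge_strings(po, py, preference="po")
```

Because the result contains every string from both sides, it can be written back to both the py file and Pootle, which matches the notes that missing strings are added to Pootle and never removed from it.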
Dependency System :
- Currently only one dependency has been set: the Inv module depends on the Supply module.
- The depends field can be extracted from current.models; for inv it is current.models.inv.depends
- For setting dependencies:
- Go to s3db/module
- Add a depends list the way it has been done for inv module.
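To show how depends lists could drive module selection, here is a small sketch that expands a selection to include everything it depends on, directly or indirectly. The plain dictionary stands in for the depends lists read from current.models; `resolve_dependencies` is a hypothetical helper, not existing Eden code.

```python
def resolve_dependencies(selected, depends):
    """Return the selected modules plus all their transitive dependencies."""
    resolved = set()
    stack = list(selected)
    while stack:
        module = stack.pop()
        if module in resolved:
            continue  # already handled; this also guards against cycles
        resolved.add(module)
        stack.extend(depends.get(module, ()))
    return sorted(resolved)


# The only dependency set so far: inv depends on supply.
DEPENDS = {"inv": ["supply"]}
with_deps = resolve_dependencies(["inv"], DEPENDS)
no_deps = resolve_dependencies(["supply"], DEPENDS)
```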
Translation of csv files
- To have a csv file considered for translation, set the translate argument of its S3Represent to True.
- This has been done for org_sector
- in s3db/org.py ( represent = S3Represent(lookup=tablename, translate=True) )
- Similarly, the fields to be considered can also be set in the represent.
- For the translation process see the chapter in the book on Localisation
- For the translation code see
- For translating to and from a spreadsheet see the Translate Toolkit, also mentioned in the Eden book. (Versions are packaged for .deb and .rpm; just search for translate-toolkit within your package manager.)
- The survey application uses its own translation so that the survey forms can be translated independently of the application. Check the code for this.