Event/2012/GSoC/Translation – SahanaEden

Version 23 (modified by vivek_h, 13 years ago) ( diff )
--

Upgrade Translation Functionality: GSoC Project 2012

BluePrint reference link

GitHub link

Weekly Meeting Schedule : Saturday, 9:00 GMT

Personal Details

Student

Name : Vivek Hamirwasia
Country : India
Timezone: GMT + 5:30
Email : vivsmart[at]gmail[dot]com
IRC Nick: vivek_h

Mentor

Name : Graeme Foster
Email : foster[dot]graeme[at]gmail[dot]com
IRC Nick : graemef

Project Abstract

Sahana Eden currently relies on web2py's translation feature to translate the application into other languages. With the current system, translators see only the original strings and the translated strings, which is often not enough to translate a string with its intended meaning. The objective of this project is to improve the translation process so that translators have more information, such as the file name, the line number and comments written for them. For example, translators will know which module a string belongs to, which will help them translate it appropriately. In addition, the T() function currently used to mark strings for translation will be extended so that developers can add a comment for translators. Finally, a GUI (a web page) embedded in Eden will be implemented for translating on the fly and viewing the progress of translations.

Project Plan

Below is the detailed project plan for the Upgrade Translation Functionality project:

Project Deliverables :- The idea of the project is to automate the entire process of selective translation by providing a tool that helps translators translate only the strings relevant to the code as deployed. Also, when the code changes, some strings may no longer need translation and new strings may be added; the tool accounts for such changes so that translations stay consistent. A GUI will be developed to present the translation status of each active module. Adding comments to the T(...) function must also be supported. Finally, the tool will ensure that the translations made by the translators are integrated back into the main code of Eden.

1) Retrieval and storage of all strings :- Initially, a Python script will be run to collect all the strings in the Eden system. This collection will be stored in a separate file (distinct from the language files) in which each row contains the original (untranslated) string, its location (pathname and line number), any comments, and a flag indicating whether it has been translated (initially all flags are unset). Let us refer to this file as all_strings.txt. Note that the same string appearing in two different files will be placed in two different rows. The rows will be sorted first by location and then by the original string value. Currently, we are focusing on the parse tree generated by the Python parser library to get the strings from the ".py" files.
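As a rough illustration of step 1, the sketch below collects the literal strings passed to T(...) together with their locations. It is only a sketch under assumptions: it uses the standard-library `ast` module rather than the parser library mentioned above, and the function name `extract_t_strings` and the row layout `(filename, line, string, comment)` are hypothetical.

```python
import ast

def extract_t_strings(source, filename="<unknown>"):
    """Return sorted (filename, line, string, comment) rows for T("...") calls.

    Hypothetical helper: only calls whose first argument is a string
    literal are collected; T(variable) is skipped.
    """
    rows = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        is_t_call = (isinstance(node, ast.Call)
                     and isinstance(node.func, ast.Name)
                     and node.func.id == "T"
                     and node.args)
        if not is_t_call:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue
        # Assumed convention: an optional second literal is a translator comment.
        comment = ""
        if len(node.args) > 1:
            second = node.args[1]
            if isinstance(second, ast.Constant) and isinstance(second.value, str):
                comment = second.value
        rows.append((filename, node.lineno, first.value, comment))
    # Tuples sort by location (file, line) first, then by the string itself.
    return sorted(rows)
```

Each returned row could then be written out as one line of all_strings.txt, with the translated flag added as a further column.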

2) Building a spreadsheet for translators :- A Python function will be run to determine the currently active modules. Initially the active modules will be passed to the function as parameters; later they can be taken as input from the developer through GUI checkboxes. As a further enhancement, the active modules can be read from the deployment settings in 000_config.py. The rows in all_strings.txt will then be matched against these modules using binary search (as the rows are sorted by location), and those strings whose flag is not set will be selected. Note that all strings belonging to the core code (i.e. the part of the code that is always used) will be included by default. Duplicate strings will be removed before the list is passed to the translator. The resulting spreadsheet will then be available to translators (along with the location and comments for each string, if any).
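The selection in step 2 might look something like the following sketch. It assumes, purely for illustration, that a module can be identified by a path prefix and that rows are `(path, line, string, translated)` tuples sorted by path; the name `select_for_translation` is hypothetical. Because all paths sharing a prefix are contiguous in sorted order, `bisect_left` finds the start of each module's range.

```python
from bisect import bisect_left

def select_for_translation(rows, active_prefixes):
    """Pick untranslated, de-duplicated strings for the active modules.

    rows: list of (path, line, string, translated), sorted by path then line.
    active_prefixes: path prefixes standing in for the active modules
    (an assumption of this sketch).
    """
    paths = [row[0] for row in rows]
    seen = set()
    selected = []
    for prefix in active_prefixes:
        # Binary search for the first row whose path is >= the prefix.
        i = bisect_left(paths, prefix)
        while i < len(rows) and paths[i].startswith(prefix):
            path, line, string, translated = rows[i]
            if not translated and string not in seen:
                seen.add(string)
                selected.append((string, "%s:%d" % (path, line)))
            i += 1
    return selected
```

A duplicate string keeps only the location of its first occurrence here; keeping every location per string would be an equally reasonable choice.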

3) Converting back the spreadsheet :- Once the translations are made, all translated strings will be added back to the corresponding *.py file and their flags will be set in all_strings.txt. In this step we also validate the translations against the original strings by checking that %s variable substitutions are labelled correctly.
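The validation mentioned in step 3 could be approached along these lines. This is a minimal sketch that only recognises simple placeholders such as `%s`, `%d` and `%(name)s`; the function names are hypothetical, and a real check might need to cover more of Python's format syntax.

```python
import re
from collections import Counter

# Matches %s, %d, %f, %r and their named forms such as %(count)s.
PLACEHOLDER = re.compile(r"%(?:\(\w+\))?[sdfr]")

def placeholders(text):
    """Count each placeholder in text, ignoring escaped percent signs (%%)."""
    return Counter(PLACEHOLDER.findall(text.replace("%%", "")))

def placeholders_match(original, translation):
    """True if both strings use the same placeholders the same number of times."""
    return placeholders(original) == placeholders(translation)
```

Comparing counts rather than positions lets a translation reorder named placeholders, which some languages require.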

4) Updating strings due to modification of code :- The code may change from time to time, so all_strings.txt needs to be updated accordingly. The frequency of updates can be set manually. We need to consider two cases: new strings being added and existing strings being deleted. To update, we first run step 1) as described above. We then take the required *.py file and, for each string in it, look for all its occurrences in all_strings.txt. If none are found, we remove the corresponding entry from the *.py file; otherwise we set the flag for those occurrences (to indicate that they are already translated). This procedure ensures that strings translated earlier are not selected for translation again. This completes the update of strings and takes care of any modification of the code.
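One possible reading of the update in step 4 is sketched below. It assumes the existing translations are available as a dictionary mapping each original string to its translation, and that a fresh extraction yields `(path, line, string)` tuples; both function names are hypothetical.

```python
def refresh_flags(fresh_rows, translations):
    """Rebuild the string table after a code change.

    fresh_rows: (path, line, string) tuples from a new extraction run.
    translations: dict of original string -> translated text (assumed shape).
    Strings that already have a non-empty translation are flagged so they
    are not offered to translators again.
    """
    return [(path, line, string, bool(translations.get(string)))
            for path, line, string in fresh_rows]

def stale_translations(fresh_rows, translations):
    """List translated originals that no longer occur anywhere in the code."""
    current = {string for _path, _line, string in fresh_rows}
    return sorted(key for key in translations if key not in current)
```

The entries returned by `stale_translations` are the candidates for removal from the *.py file.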

5) Allow comments : We want comments to be an optional parameter to the T(...) function, so that it becomes T(<string> , <comments>). To do this we could create a new T(...) function that overrides the built-in web2py T(...) function. The new function would contain most of the code of the built-in one, except that it would allow comments to be passed as a parameter.
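A lighter alternative to copying the built-in code would be a thin wrapper, sketched below. This is not web2py's implementation: `CommentedT` is a hypothetical class that wraps any callable translator and simply discards the comment at runtime, leaving it in the source for the extraction script to read. If the wrapped translator already gives its second positional argument a meaning, this signature would conflict with it, which would need checking against web2py.

```python
class CommentedT:
    """Hypothetical wrapper accepting T(<string>, <comments>)."""

    def __init__(self, translate):
        # translate: any callable that maps a message to its translation.
        self._translate = translate

    def __call__(self, message, comment=None, **kwargs):
        # The comment is only meant for translators, so it is not
        # forwarded to the underlying translator.
        return self._translate(message, **kwargs)


if __name__ == "__main__":
    # Stand-in translator used only for demonstration.
    T = CommentedT(lambda message, **kwargs: message.upper())
    print(T("Save", "Label on the form's submit button"))
```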

6) GUI for tracking status: The status of translations for each module must be available on a UI. This can be done by periodically reading the spreadsheet and checking for the percentage of untranslated strings.
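The figure shown in such a UI could be computed as in the sketch below, which assumes each row carries a module name and a translated flag; the function name and row layout are hypothetical.

```python
from collections import defaultdict

def percent_untranslated(rows):
    """Return {module: percentage of strings still untranslated}.

    rows: iterable of (module, string, translated) tuples (assumed layout).
    """
    totals = defaultdict(int)
    pending = defaultdict(int)
    for module, _string, translated in rows:
        totals[module] += 1
        if not translated:
            pending[module] += 1
    return {module: 100.0 * pending[module] / totals[module]
            for module in totals}
```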

Project Goals and Timeline

Due Date	SMART goal	Measure	Status
First trimester (24 April - 20 May)
17 May	Work on retrieval of all the relevant strings inside T(...) from the ".py" files and store the result in a file with complete location information (file name and line number).	The required strings are correctly retrieved when tested on Eden Python files	Completed
Second trimester (21 May - 9 July)
28 May	Identify the categorization of modules in Eden by studying the file structure and dependencies	Retrieved strings can be appropriately assigned module(s).	Completed
4 June	Group the retrieved strings by modules and select the strings in those modules which are currently active	The relevant strings are selected and displayed	Completed
11 June	Testing the code using unit tests and proper documentation	The code passes all tests and comments are provided to explain the code	In Progress
13 June	Using translate-toolkit to study the steps involved in converting language files from web2py -> po -> csv format	The corresponding .csv file is formed	Planned
17 June	Converting the retrieved strings directly into spreadsheets to be presented to the translator, using the Python xlwt library	The spreadsheet formed is in the same format as that formed by translate-toolkit in the step above	Planned
25 June
8 July
Third trimester (10 July - 13 August)
15 July
25 July
3 August
13 August
