
Version 27 (modified by Samsruti Dash, 8 years ago)

--

BLUEPRINT TO ADD TRANSLITERATION TO TEXT ENTRY CONTROLS

Aim:

Sahana Eden is used in many countries, so its users need to enter text in their local languages. Transliteration is the conversion of text from one script to another. For local-language text entry, a user can either type on a native-language keyboard with the native Unicode character set, or use transliteration to enter the native word from a Latin keyboard. An example of a transliteration tool is “Google Transliteration Beta”.

Where It Is Used:

Transliteration is mainly needed in the CAP Broker GUI and in CAP templates. A CAP Broker has to provide messages in different local languages. Users in Sri Lanka, for example, require a CAP Broker whose messages carry the <cap.alert.info> section in both Sinhala and Tamil. The person who creates a message has to type the <cap.alert.info.description> in both languages. For example, for the term “Earthquake” (pronounced ûrth'kwāk'), the text for Sinhala readers will be "භූකම්පනය" and the text for Tamil readers will be “நிலநடுக்கம்”.
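The structure described above, one <cap.alert.info> block per language, can be sketched as follows. This is only an illustrative fragment: the helper names cap() and build_alert() are made up for this page, the language tags "si-LK" and "ta-LK" are assumed, and the mandatory CAP elements (identifier, sender, event, urgency and so on) are left out, so the output is not a valid CAP message.

```python
import xml.etree.ElementTree as ET

# Namespace of CAP 1.2; everything else in this fragment is illustrative.
CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"


def cap(tag):
    """Return the namespace-qualified name of a CAP element (hypothetical helper)."""
    return "{%s}%s" % (CAP_NS, tag)


def build_alert(descriptions):
    """Build an <alert> with one <info> block per language.

    descriptions maps a language tag (assumed here: "si-LK", "ta-LK")
    to the description text in that language.
    """
    alert = ET.Element(cap("alert"))
    for language, text in descriptions.items():
        info = ET.SubElement(alert, cap("info"))
        ET.SubElement(info, cap("language")).text = language
        ET.SubElement(info, cap("description")).text = text
    return alert


if __name__ == "__main__":
    ET.register_namespace("cap", CAP_NS)
    alert = build_alert({"si-LK": "භූකම්පනය", "ta-LK": "நிலநடுக்கம்"})
    print(ET.tostring(alert, encoding="unicode"))
```

The description text for each <info> block is exactly what the message author would have to type, which is where a transliteration-enabled text area would help.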

Technical side of Using This Feature

Google offered the “Google AJAX Transliteration API”. It is an online service, so an Internet connection is required to use it. Anybody can get started with the API by following the guide at https://developers.google.com/transliterate/v1/getting_started#usingApis (Google also provides offline services). However, the API has been deprecated since 2011. A newer alternative is offered by Bing under the name Bing Translation APIs. A developer can get started with it by referring to http://www.microsoft.com/web/post/using-the-free-bing-translation-apis

How to Use It in Eden?

A developer who decides to build a transliteration input service should first look at the S3Widget.py file, which makes it easier to work within the existing structure of Eden. A widget named TransliterationTextarea in the model is the main part that needs to change to activate transliteration for a text value. The widget should use an AJAX request or a Bootstrap request.

The main problem is the source of the data. Even if the Eden website is given permission to use a service's data source, the data still has to be converted into XML or JSON. The data may be only partially correct, so it has to be tested. The work also has to be kept up to date: if any one of the transliteration services is updated, the others will become useless.

The user interface is the central part of transliteration. It requires a text area for input with autocomplete, similar to the common function in Visual Studio or Adobe Dreamweaver, and a dropdown box listing the possible meanings a letter combination can have. The jQueryUI ListBuilder can be used to build this part of the UI.
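The lookup behind such a dropdown can be sketched on the server side as follows. This is a minimal sketch under stated assumptions: the JSON layout, the sample Latin-to-Cyrillic entries and the function names load_table() and suggest() are all hypothetical and are not part of Eden or of any existing data source.

```python
import json

# Hypothetical JSON mapping table: Latin letter combination -> candidate
# native-script strings. A real table would come from a tested data source.
SAMPLE_TABLE_JSON = """
{
  "s": ["с"],
  "h": ["х"],
  "sh": ["ш"],
  "shch": ["щ"],
  "e": ["е", "э"],
  "y": ["й", "ы"]
}
"""


def load_table(text):
    """Parse the JSON mapping table into a dict."""
    return json.loads(text)


def suggest(table, typed):
    """Return (key, options) for the longest table key that ends `typed`.

    The options are what the dropdown would list for the letter
    combination the user has just finished typing.
    """
    lowered = typed.lower()
    # Try the longest suffix first, then progressively shorter ones.
    for start in range(len(lowered)):
        suffix = lowered[start:]
        if suffix in table:
            return suffix, list(table[suffix])
    return "", []


if __name__ == "__main__":
    table = load_table(SAMPLE_TABLE_JSON)
    print(suggest(table, "mash"))  # matches the key "sh"
    print(suggest(table, "e"))     # ambiguous: two candidates
```

An AJAX request from the widget could send the text typed so far and receive the options as JSON. Whether the lookup should run on the server or in the browser is left open here.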

SUGGESTION

According to the points explained above, a transliteration engine is quite a lot of work, as we have no consistent database that maps Roman characters to Indian, Chinese, or Cyrillic characters. As long as we are not able to find one, an implementation of a transliteration engine is not recommended. If we could find a professional and consistent database, we could start to implement the feature using the jQuery plugin explained above.

List of Other transliteration services

There are a few alternative transliteration services. They have no GUI that could be adopted. A few have APIs, which, again, require a permanent Internet connection.

All of these APIs are shareware.

||= Service =||= URL =||= Availability =||
||Mygengo Translation API||http://mygengo.com/api/||Online||
||Microsoft Translator APIs||http://www.microsofttranslator.com/dev/||Online||
||SpeakLike Translate API||http://www.speaklike.com/access-professional-translation-via-api||Online||
||WebServiceX Translate API||http://www.webservicex.net/ws/wsdetails.aspx?wsid=63||Online||

Description on Different APIs

''Mygengo Translation API''

The Gengo API helps you to take your service global. Whether you’re integrating an option for translation for your users, or building a fun translation application, our API handles a variety of languages and features.

The online documentation describes the resources comprising the official Gengo API. You can jump right in by browsing the resources listed there.

''Microsoft Translator APIs''

Microsoft Translator provides a powerful set of web service APIs that developers can use to take advantage of its best-of-breed Machine Translation technology in their own applications, services or web sites. This API may be called in a number of ways, including an HTTP REST Service, an AJAX-callable service and a SOAP Web Service.

The Microsoft Translator API, an online service, is available directly from the Windows Azure Marketplace. The Microsoft Translator API is sold as a monthly subscription based on the number of characters of text passed to the API. The API is available for FREE for usage up to 2 million characters per month.

''SpeakLike Translate API''

When you integrate SpeakLike into your content management system, support software or website, cross-language communication becomes another behind-the-scenes tool. You’re free to focus on marketing, customer interaction and business communication instead of managing translation projects.

The API gives your IT or development team full access to the SpeakLike translation platform. They can build whatever user interface works best for you, and SpeakLike does the rest. Translation requests are automatically processed and made available to your system by either push or pull methods, according to your preferences.

''WebServiceX Translate API''

Converts text from one language to another. Supported language pairs are English to Chinese, English to French, English to German, English to Italian, English to Japanese, English to Korean, English to Portuguese, English to Spanish, Chinese to English, French to English, French to German, German to English, German to French, Italian to English, Japanese to English, Korean to English, Portuguese to English, Russian to English, and Spanish to English.

Transliteration Input Method

Google Transliteration IME is an input method editor which allows users to enter text in one of the supported languages using a roman keyboard. Users can type a word the way it sounds using Latin characters and Google Transliteration IME will convert the word to its native script. Note that this is not the same as translation -- it is the sound of the words that is converted from one alphabet to the other, not their meaning. Converted content will always be in Unicode.
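To make the difference from translation concrete, the following sketch converts Latin input to another script by sound using a greedy longest-match lookup. It is a toy illustration, not how Google Transliteration IME works internally. The small Latin-to-Cyrillic table and the function name transliterate() are assumptions made for this page.

```python
# Toy Latin -> Cyrillic table (hypothetical and far from complete).
LATIN_TO_CYRILLIC = {
    "shch": "щ",
    "sh": "ш",
    "ch": "ч",
    "a": "а",
    "i": "и",
    "k": "к",
    "m": "м",
    "s": "с",
    "h": "х",
}


def transliterate(text, table=LATIN_TO_CYRILLIC):
    """Convert `text` by always consuming the longest matching table key.

    Characters without a table entry are copied through unchanged.
    """
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No key matched at this position: keep the character as typed.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("masha"))
    print(transliterate("shchi"))
```

Matching the longest key first is what keeps "shch" from being split into "sh" + "ch". The meaning of the word plays no role at any point, only its spelling by sound.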

http://3.bp.blogspot.com/_C8NLLuO4CdQ/S1VUnCIT1ZI/AAAAAAAAESo/j9iqcDc0QSE/s400/IME_edit_window.png

Google Transliteration IME offers several features focused on an improved user experience, including offline support, word completion, personalized choices, an easy-to-use keyboard, quick search and several customization options. As a user types, a suggestion menu is displayed with alternatives and word completions. For example, as you type "googl" you will see five options from which you can select the correct one.
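The word-completion part of such a suggestion menu can be approximated with a prefix search over a sorted word list. The sketch below is an assumption about one simple way to do it, not a description of the IME itself. The word list and the function name complete() are invented for illustration, and the example completes Latin prefixes only.

```python
from bisect import bisect_left


def complete(sorted_words, prefix, limit=5):
    """Return up to `limit` words from `sorted_words` that start with `prefix`.

    `sorted_words` must be sorted, so all matches sit next to each other
    starting at the position found by binary search.
    """
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if len(matches) == limit or not word.startswith(prefix):
            break
        matches.append(word)
    return matches


if __name__ == "__main__":
    # Hypothetical dictionary; a real one would be per language.
    words = sorted([
        "goal", "gone", "google", "googled", "googles",
        "googling", "googly", "goose",
    ])
    print(complete(words, "googl"))
```

A real suggestion menu would also rank the options, for example by frequency or by the user's earlier choices. This sketch returns them in alphabetical order only.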

A developer can download the Google Transliteration IME. It is currently available for 19 different languages: Amharic, Arabic, Bengali, Farsi (Persian), Greek, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Russian, Sanskrit, Serbian, Tamil, Telugu, Tigrinya and Urdu.

Thoughts about server-side implementation

Another idea is to let users search using the letters of their own language. For example, a user may want to search for "Aspirin" in Arabic letters. Currently the search finds nothing, because the UTF-8 encodings of Roman letters are different from those of Arabic letters ("Aspirin" == "أسبرين" would fail). The idea is to reverse-transliterate the non-Roman string into Roman letters on the server side and perform the search with the reverse-transliterated string. That would require modifying the search engine. The transliteration database could be the same one used on the client side (again, XML or JSON is recommended).
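The server-side idea can be sketched as follows. The example uses Cyrillic rather than Arabic, because a plain letter table is enough there. For Arabic, where short vowels are usually not written, the reverse-transliterated string would not match "Aspirin" exactly and a fuzzier comparison would be needed. The table and the function names reverse_transliterate() and search() are hypothetical and do not refer to Eden's search engine.

```python
# Toy Cyrillic -> Latin table (hypothetical and incomplete).
CYRILLIC_TO_LATIN = {
    "а": "a",
    "и": "i",
    "н": "n",
    "п": "p",
    "р": "r",
    "с": "s",
    "ш": "sh",
    "щ": "shch",
}


def reverse_transliterate(text, table=CYRILLIC_TO_LATIN):
    """Map each character to Latin letters; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text.lower())


def search(records, query):
    """Case-insensitive substring search using the reverse-transliterated query."""
    needle = reverse_transliterate(query)
    return [record for record in records if needle in record.lower()]


if __name__ == "__main__":
    records = ["Aspirin 500mg", "Paracetamol"]
    print("Аспирин" == "Aspirin")      # the direct comparison fails
    print(search(records, "Аспирин"))  # the reverse-transliterated search matches
```

Because Latin characters pass through the table unchanged, the same search() function also accepts queries typed in Roman letters.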
