Introduction to Internationalization Programming
Typically, a programmer works on the source code, and a translator deals with the corresponding .po file, which may be created with the copy command from the .pot file or directly from the source with the xgettext program.
Glance over a .po file, and you will see it has a header and entries for translating. Here is an example of an entry:
# This is my own commentary #: counter.c:25 #, c-format msgid "You typed %d %s \n" msgstr "Vous avez tapé %d %s\n"
It's simple. You translate phrases from msgid and put the results into the msgstr fields. If a line starts with #:, it is a reference to the source; if it starts with #, it shows an entry's attributes. You can add your own comments in lines starting with two symbols: # (the pound sign and then a white space).
After copying the template .pot file information (.po) or creating a new one with the xgettext command, you can start translating. Generally, this job can be done by another person using his or her favorite editor. (X)Emacs is not a bad choice for this job, but KBabel, part of KDE, is an even better one. If you are going to participate in a team of translators, it is highly recommended that you use KBabel. Describing KBabel is beyond the scope of this article, but you can read more about it in “The KBabel Handbook” (see Resources).
Translation is a kind of art. Writing a correct phrase can be difficult, and you sometimes may doubt your ability with a particular language. So, you may want to leave some entries untranslated, or having translated them doubtfully, mark them as “fuzzy”. With KBabel or (X)Emacs, you easily can find such entries and edit them again later. Do not worry; only fully translated entries will be compiled later by msgfmt and become usable in programs. This simply means that an entry may be marked “translated”, “untranslated” or “fuzzy”, and as software changes quickly, there is also an “obsolete” attribute.
Languages are flexible. English messages are not always perfect either. In our case, the message “You typed 0 digit” is incorrect. GNU gettext can manage translating problems like word order, plural forms and ambiguities, but you have to use extra functions that hold more arguments than gettext().
Once you have translated the file, you should convert it into a .mo file that gettext will use if you run the program with the corresponding locale. Do not forget to put this file in the right place, in our case:
mv counter.mo fr/LCMESSAGES/
Now the counter can speak French! (See Listing 1.)
Programs evolve, and if their source code is changed, the corresponding .po files also have to be updated. Using only xgettext in this case is not an ideal solution. All translated messages will be lost, because it overwrites .po files. In this case, you should use the program msgmerge. This program merges two .po files, keeps translations already made (if the new strings match with the old), updates entries' attributes and adds new strings. Of course, these new strings will be untranslated entries. A typical call is simple:
msgmerge old.po new.po > up-to-date.po
In this article, the input method is not described, although it is also important. Generally, non-X11 software doesn't need to worry about i18n input methods, because it is the responsibility of the console and X terminal emulators.
For input in X11, three methods exist: Xsi, Ximp and XIM. The first two are old-fashioned; the last one is the de facto standard. Their descriptions are beyond the scope of this article; however, the source code for the rxvt program provides an excellent example.
Modern tools provide their own special subroutines for input of internationalized strings, using gettext for output messages. To make program code 8-bit transparent for internal proposals, Unicode is used. Qt, for instance, works in such a way, providing additional functions for input and output of i18n strings correctly (see Resources).
You also may want to look at the source code of mutt, which is a good i18n program (www.mutt.org). This program supports aliases for charsets.
Using Unicode in your programs is described by Tomohiro Kubota (see Resources). Happy i18ning!
|Bitcoin on Amazon! Sort of...||Sep 28, 2016|
|Free Today: September Issue of Linux Journal (Retail value: $5.99)||Sep 27, 2016|
|nginx||Sep 27, 2016|
|Epiq Solutions' Sidekiq M.2||Sep 26, 2016|
|Nativ Disc||Sep 23, 2016|
|Android Browser Security--What You Haven't Been Told||Sep 22, 2016|
- Free Today: September Issue of Linux Journal (Retail value: $5.99)
- Bitcoin on Amazon! Sort of...
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Android Browser Security--What You Haven't Been Told
- Nativ Disc
- Epiq Solutions' Sidekiq M.2
- Tech Tip: Really Simple HTTP Server with Python
- Identity: Our Last Stand
- Securing the Programmer
Pick up any e-commerce web or mobile app today, and you’ll be holding a mashup of interconnected applications and services from a variety of different providers. For instance, when you connect to Amazon’s e-commerce app, cookies, tags and pixels that are monitored by solutions like Exact Target, BazaarVoice, Bing, Shopzilla, Liveramp and Google Tag Manager track every action you take. You’re presented with special offers and coupons based on your viewing and buying patterns. If you find something you want for your birthday, a third party manages your wish list, which you can share through multiple social- media outlets or email to a friend. When you select something to buy, you find yourself presented with similar items as kind suggestions. And when you finally check out, you’re offered the ability to pay with promo codes, gifts cards, PayPal or a variety of credit cards.Get the Guide