Introduction to Internationalization Programming
Typically, a programmer works on the source code, and a translator deals with the corresponding .po file, which may be created with the copy command from the .pot file or directly from the source with the xgettext program.
Glance over a .po file, and you will see it has a header and entries for translating. Here is an example of an entry:
# This is my own commentary
#: counter.c:25
#, c-format
msgid "You typed %d %s \n"
msgstr "Vous avez tapé %d %s\n"
It's simple. You translate the phrases in msgid and put the results into the msgstr fields. A line starting with #: is a reference to the source; a line starting with #, (the pound sign and then a comma) lists the entry's attributes, such as c-format. You can add your own comments in lines starting with two symbols: # (the pound sign and then a white space).
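The prefix rules above can be sketched as a small C function. This is only an illustration: the names po_line and classify_po_line are hypothetical, and real .po files contain further line types (continuation strings, other comment kinds) that this sketch lumps into PO_OTHER.

```c
#include <string.h>

/* Hypothetical line categories for the prefixes described in the text. */
enum po_line {
    PO_TRANSLATOR_COMMENT, /* "# "  : your own commentary        */
    PO_REFERENCE,          /* "#:"  : reference to the source    */
    PO_FLAGS,              /* "#,"  : attributes such as c-format */
    PO_MSGID,              /* original phrase                     */
    PO_MSGSTR,             /* translated phrase                   */
    PO_OTHER               /* anything this sketch does not cover */
};

/* Classify one line of a .po file by looking only at its prefix. */
enum po_line classify_po_line(const char *line)
{
    if (strncmp(line, "#:", 2) == 0) return PO_REFERENCE;
    if (strncmp(line, "#,", 2) == 0) return PO_FLAGS;
    if (strncmp(line, "# ", 2) == 0) return PO_TRANSLATOR_COMMENT;
    if (strncmp(line, "msgid ", 6) == 0) return PO_MSGID;
    if (strncmp(line, "msgstr ", 7) == 0) return PO_MSGSTR;
    return PO_OTHER;
}
```

Calling classify_po_line("#, c-format") yields PO_FLAGS, while the line "#: counter.c:25" yields PO_REFERENCE.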
After copying the .pot template to a .po file or creating a new .po file with the xgettext command, you can start translating. Generally, this job can be done by another person using his or her favorite editor. (X)Emacs is not a bad choice for this job, but KBabel, part of KDE, is an even better one. If you are going to participate in a team of translators, it is highly recommended that you use KBabel. Describing KBabel is beyond the scope of this article, but you can read more about it in “The KBabel Handbook” (see Resources).
Translation is a kind of art. Writing a correct phrase can be difficult, and you sometimes may doubt your ability with a particular language. So, you may want to leave some entries untranslated or, when you are unsure of a translation, mark it as “fuzzy”. With KBabel or (X)Emacs, you easily can find such entries and edit them again later. Do not worry; only fully translated entries will be compiled later by msgfmt and become usable in programs. In other words, an entry may be marked “translated”, “untranslated” or “fuzzy”, and as software changes quickly, there is also an “obsolete” attribute.
Languages are flexible. English messages are not always perfect either. In our case, the message “You typed 0 digit” is incorrect. GNU gettext can manage translation problems like word order, plural forms and ambiguities, but you have to use extra functions that take more arguments than gettext(), such as ngettext() for plural forms.
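As a minimal sketch of the plural-form case, the function below uses ngettext() from libintl.h, which takes a singular msgid, a plural msgid and the count. The helper name format_count and the two message strings are hypothetical, not taken from counter.c. When no catalog is loaded, ngettext() falls back to the English rule and returns the singular form only for a count of 1.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: writes the counter message for n into buf.
   ngettext() chooses between the two msgids using the plural rule of
   the active catalog; without a catalog, n == 1 selects the first. */
int format_count(char *buf, size_t size, unsigned long n)
{
    const char *fmt = ngettext("You typed %lu digit\n",
                               "You typed %lu digits\n", n);
    return snprintf(buf, size, fmt, n);
}
```

In a .po file, such an entry carries a msgid_plural line and indexed msgstr[0], msgstr[1] fields, so the translator can supply as many forms as the target language needs.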
Once you have translated the file, you should convert it into a .mo file with msgfmt; gettext will use this file if you run the program with the corresponding locale. Do not forget to put this file in the right place, in our case:
mv counter.mo fr/LC_MESSAGES/
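Why this location matters can be sketched in C. At run time, gettext looks for the catalog at localedir/locale/LC_MESSAGES/domain.mo, where the directory is the one given to bindtextdomain(). The helper names init_i18n and catalog_path are hypothetical, and the domain name "counter" is assumed from the counter.mo file name above; the actual setup in counter.c may differ.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: builds the path where gettext expects a catalog,
   for example "./fr/LC_MESSAGES/counter.mo". */
int catalog_path(char *buf, size_t size, const char *localedir,
                 const char *locale, const char *domain)
{
    return snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                    localedir, locale, domain);
}

/* Hypothetical helper: selects the user's locale and binds the assumed
   "counter" domain to localedir. Returns the bound directory name. */
const char *init_i18n(const char *localedir)
{
    setlocale(LC_ALL, "");                 /* use the environment's locale */
    const char *bound = bindtextdomain("counter", localedir);
    textdomain("counter");                 /* default domain for gettext() */
    return bound;
}
```

With init_i18n(".") and a French locale in the environment, a catalog moved to fr/LC_MESSAGES/ under the current directory is the one gettext would try to open.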
Now the counter can speak French! (See Listing 1.)
Programs evolve, and if their source code is changed, the corresponding .po files also have to be updated. Using only xgettext in this case is not an ideal solution. All translated messages will be lost, because it overwrites .po files. In this case, you should use the program msgmerge. This program merges two .po files, keeps translations already made (if the new strings match with the old), updates entries' attributes and adds new strings. Of course, these new strings will be untranslated entries. A typical call is simple:
msgmerge old.po new.po > up-to-date.po
In this article, the input method is not described, although it is also important. Generally, non-X11 software doesn't need to worry about i18n input methods, because it is the responsibility of the console and X terminal emulators.
For input in X11, three methods exist: Xsi, Ximp and XIM. The first two are old-fashioned; the last one is the de facto standard. Their descriptions are beyond the scope of this article; however, the source code for the rxvt program provides an excellent example.
Modern toolkits provide their own special subroutines for input of internationalized strings and use gettext for output messages. To keep program code 8-bit transparent, they use Unicode for internal purposes. Qt, for instance, works this way and provides additional functions that handle input and output of i18n strings correctly (see Resources).
You also may want to look at the source code of mutt, which is a good i18n program (www.mutt.org). This program supports aliases for charsets.
Using Unicode in your programs is described by Tomohiro Kubota (see Resources). Happy i18ning!