Internationalizing Messages in Linux Programs

An introduction to the GNU gettext system for producing multilingual programs.

Linux is becoming increasingly popular each day. Until now, the typical Linux user has been a system administrator, student or UNIX hacker. New projects such as GNOME, KDE and GNUStep are preparing the way for a different, less technically prepared user.

Running software in English is usually not a problem for someone with at least moderate computer skills, but end users need (and want) software that speaks their own language in order to be productive or feel comfortable with the system. Moreover, many programs need to know local conventions for things such as dates or money amounts in order to be useful and complete.

This article is an introduction to the GNU gettext system, a set of tools and libraries for both programmers and translators that enables them to produce multilingual programs with textual messages in specified languages. We will deal only with languages that use one of the ISO-8859-X character sets; languages such as Japanese and Chinese require extra care and are not covered here.

Definitions

Two words appear frequently when talking about support of different languages in programs: internationalization and localization. Since writing these words over and over (without spelling errors) is annoying and time-consuming, people abbreviate them as I18N and L10N. The 18 and 10 indicate the number of letters between the first and the last letter of each word.
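The counting rule behind these abbreviations can be checked with a few lines of code. The following is only a small sketch in Python; the function name numeronym is made up for this illustration and is not part of gettext or any other library.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as: first letter + number of inner letters + last letter."""
    if len(word) < 3:
        # Nothing lies between the first and last letter, so leave the word alone.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization").upper())  # I18N
print(numeronym("localization").upper())          # L10N
```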

Internationalizing a program means taking the necessary steps to make it aware of different languages and national standards.

The process of localization takes place when an internationalized program is given the information needed to behave correctly with a certain language and set of cultural habits.

First Things First

The first thing to do, for both programmers and end users, is to configure the Linux machine to use locales. Most users need only follow the Locales mini-HOWTO, downloadable from ftp://sunsite.unc.edu/pub/Linux/docs/ and its mirrors. Recent distributions (for example, Red Hat 5.0) include everything needed to support locales.

Once the system is enabled to support locales, you must specify the particular standards and languages you wish to use. This is done through a set of environment variables. Each one controls a specific aspect of the locale system:

  • LANG specifies the global locale, but can be overridden by the following variables.

  • LC_COLLATE specifies the locale used for sorting and comparing.

  • LC_CTYPE specifies the character set in use, so that isupper('À') returns true in an Italian locale.

  • LC_MONETARY provides information about representing money in a specific locale.

  • LC_NUMERIC gives information about numbers: how digits are divided and separated in groups, what the decimal point is, etc.

  • LC_TIME specifies which locale to use to represent time: AM/PM or 24-hour values, for example.

  • LC_MESSAGES indicates the language you prefer for programs' text messages.

  • LC_ALL overrides any previous indication and sets a global locale.
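The order in which these variables are consulted can be sketched as a short lookup: LC_ALL wins if it is set, then the category's own variable, then LANG, and finally the built-in C locale. The Python function below only illustrates that rule under the assumption that an empty variable counts as unset; the name effective_locale is hypothetical, and the real lookup is done inside the C library.

```python
import os


def effective_locale(category, environ=None):
    """Return the locale name that applies to one category, e.g. "LC_TIME".

    Hypothetical helper for illustration: it mimics the lookup order
    LC_ALL, then the category variable, then LANG, then the default "C".
    """
    env = os.environ if environ is None else environ
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset or empty variables are skipped
            return value
    return "C"


# A user who wants English conventions but Italian messages:
env = {"LANG": "en_US", "LC_MESSAGES": "it_IT"}
print(effective_locale("LC_MESSAGES", env))  # it_IT
print(effective_locale("LC_TIME", env))      # en_US

# LC_ALL overrides everything else:
env["LC_ALL"] = "fr_CA"
print(effective_locale("LC_MESSAGES", env))  # fr_CA
```

Passing the environment as a dictionary keeps the example independent of the variables set on the machine where it runs.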

Examples of values for global locale are:

  • en_US indicates English in the United States.

  • it_IT is for Italian in Italy.

  • fr_CA is for French in Canada.

Basically, to use the standards of language LL in country CC, the locale value will be LL_CC.
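As a rough illustration of the LL_CC convention, the following Python sketch splits a locale value into its language and country parts. The function name split_locale is invented for this example. It also drops an optional ".codeset" or "@modifier" suffix, which some systems append to locale names (as in en_US.UTF-8) but which this article does not discuss.

```python
def split_locale(name):
    """Split a locale value such as "fr_CA" into (language, country).

    Hypothetical helper for illustration only. A trailing ".codeset" or
    "@modifier" part is discarded; a name without a country, such as "C",
    yields None for the country.
    """
    base = name.split("@", 1)[0].split(".", 1)[0]
    language, _, country = base.partition("_")
    return language, country or None


print(split_locale("en_US"))  # ('en', 'US')
print(split_locale("it_IT"))  # ('it', 'IT')
print(split_locale("fr_CA"))  # ('fr', 'CA')
print(split_locale("C"))      # ('C', None)
```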

Listing 1.

The locale used by default, when none of the variables above is set, is called the C (or POSIX) locale. The behavior of a locale-aware program is easy to illustrate with date (see Listing 1). First, with LC_ALL unset, the response is in English. Next, LC_ALL is set in turn to obtain an Italian response, a French one (French in Canada is specified) and an English one (English in Canada). The “No such file or directory” message for the Italian locale is not translated, which means the Italian locale information is not available; therefore, the default is used instead.
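The same fallback can be observed from inside a program. The sketch below uses Python's standard locale module, which wraps the C library's setlocale() call; the helper name day_name_in and the sample date are arbitrary choices made for this illustration. If the requested locale is not installed, the helper falls back to the C locale, so the weekday name comes out in English.

```python
import locale
import time

# Monday, 5 January 1998, as a time tuple:
# (year, month, day, hour, minute, second, weekday, day of year, DST flag)
MONDAY = (1998, 1, 5, 12, 0, 0, 0, 5, 0)


def day_name_in(locale_name, when=MONDAY):
    """Return (weekday name, locale actually used) for the given locale.

    Hypothetical helper for illustration. It changes the process-wide
    LC_TIME setting, as setlocale() always does.
    """
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
        used = locale_name
    except locale.Error:
        # Locale data is not installed: use the default C locale instead.
        locale.setlocale(locale.LC_TIME, "C")
        used = "C"
    return time.strftime("%A", when), used


print(day_name_in("C"))      # ('Monday', 'C')
print(day_name_in("it_IT"))  # result depends on which locales are installed
```

Whether the second call prints an Italian weekday name depends on the locale data present on the machine, which is exactly the situation Listing 1 illustrates with date.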
