Polyglot Emacs 20.4

A look at multilingual Emacs.
Emacs Help for the Translator

Once you have entered the multiple scripts into your Emacs buffer, you can marshal all the considerable forces of Emacs to work on your text. This is the big difference between Emacs and a program like Gaspar Sinai's Yudit, for example. From a user's perspective, Yudit deals nicely with the encoding problems and it includes support for Unicode from the ground up, but most other features of a well-rounded editor have not yet been implemented.

Since Emacs is multilingual, it is also bilingual. Pick any two-language combination. I translate from Chinese to English (CE), mostly classical texts, and from Japanese to English, mostly commercial work for pay. Emacs has much to offer a JE or CE translator.

Emacs offers such helpful services as saving your cursor position in a buffer between sessions, saving the buffer arrangement of your Emacs session, splitting frames into Emacs windows and placing new frames in strategic locations on your big translator's virtual desktop, à la FVWM2. It is also easily customized.

The five items of Emacs arcana that can be especially useful to a translator are saving and backup, outline minor mode, narrowing, abbrevs and bookmarks. See http://www.kanji.com/ for a more detailed version of the following.

Saving and Backup Using autosave

autosave is continuously taking care of the job of saving the latest version of your file. Use ctrl-h v auto-save-interval to check your current value for autosave. In the directory listing, an autosave version of your file is marked by pound signs (#), prefixed and suffixed to the filename.

I use make-backup-files to automatically produce sequentially numbered versions or drafts of my work. In this view, the file is complete right from the beginning, as soon as I save it with a name for the first time. Once my file has a name, it automatically becomes the “first version”.

It took a while to break my old habit of manually saving my file as often as possible, indeed each time I leaned back in my chair. Now I save only when I am ready to take a long break (like several hours or overnight), or when I feel I have reached the logical end of a certain draft version. The result is that I have only one autosave file at any given time, and a new draft (backup) file is produced at meaningful intervals. Use ctrl-h v make-backup-files to check or set. Use ctrl-h v version-control to check your current setting and the “customize” button found therein to turn on numbered backups.

By default, automatic deletion of backups will occur, thereby ruining the use of backups as a translator's simple version control system. Check the variables kept-old-versions and kept-new-versions. The default for each is 2, i.e., the first two and last two backups are kept; other backups are removed. To keep all backups, I set these variables to 500 each so a thousand backups will be kept before the ones in the middle are removed. After a translation job is finished, I usually delete them all.
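The backup settings described above can also be set directly in .emacs instead of through the customize buttons. The following is a minimal sketch, assuming you want exactly the values mentioned in the text; the variable names are standard Emacs ones, but the choice of values is only an illustration.

```elisp
;; Sketch of the backup settings discussed above; adjust values to taste.
(setq make-backup-files t)     ; keep a backup of the previous contents when a file is first saved
(setq version-control t)       ; use numbered backups: file.~1~, file.~2~, ...
(setq kept-old-versions 500)   ; oldest backups that are never trimmed
(setq kept-new-versions 500)   ; newest backups that are never trimmed
```

With both limits at 500, trimming does not start until more than a thousand numbered backups exist for one file.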

Not only will you want to save your files, but also your constellation of visited buffers. This is done through the use of the Emacs desktop. Add these three lines to your .emacs file:

(load "desktop")

You must use meta-x desktop-save to initiate this process and then start Emacs from the same current directory each time you need to recover this state.
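Only one of the three lines survives in the listing above. As far as I can tell from the desktop.el commentary of the Emacs 19 and 20 era, the full setup looked like the sketch below; treat the second and third lines as my reconstruction, not as the original listing. The commented-out alternative is the single call that replaced this recipe in Emacs 22 and later.

```elisp
;; Reconstruction (an assumption) of the period setup for desktop.el.
(load "desktop")
(desktop-load-default)   ; obsolete in later Emacs versions
(desktop-read)           ; restore the desktop saved in the current directory

;; In Emacs 22 and later, one line is enough:
;; (desktop-save-mode 1)
```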

You can also save state within an Emacs session by using a register.

Outline Minor Mode

Until a full-scale major mode for translation is written, outline-mode or outline-minor-mode must take on the responsibility of managing the source text, the target text and related notes, comments and references.

In outline mode, there are two kinds of lines: header lines and body lines. Header lines start with a star in the leftmost column. The more stars, the deeper the level into the outline. One star means the line is at the top level. For short jobs, I have only one top-level heading line with the words TOP LEVEL or the title of the job. In longer jobs, I use it for Part One, Part Two, etc. I put contact information that applies to the whole job and notes about the deadline, size, charges and any special provisions as body lines below this top-level heading.

Within my translation environment, body text means notes and commentary. I find it extremely convenient to embed these directly in the working file rather than keep them as separate files. Such inter-linear notes are thus permanently welded to the text to which they relate.

Outline level 2 is always the Japanese or Chinese source text and level 3 is always my target text, the English translation. If I suspect there is a typo in the Japanese source text, my proposed correction can appear in the body lines connected to those level 2 heading lines, i.e., to those particular lines of the Japanese source text. Likewise, definitions, questions, reference sources, comments and notes on words, URLs and anything else that throws light on the translation are set down in body lines that are directly connected to the level 3 heading lines, which are always the English target text I am writing.

When I am finished translating, a keyboard macro extracts the level 3 lines (my translation) and produces the file that will go to the client, either directly or after some unavoidable conversion, and even formatting, in a word processor running on Linux, such as ApplixWords.
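The same extraction can be written as a small Lisp function instead of a keyboard macro. This is only a sketch under stated assumptions: the function names and the *target-text* buffer name are hypothetical, and heading lines are assumed to be stars followed by one space, as in the layout shown in the comments.

```elisp
;; Assumed file layout (placeholder text):
;;   * Job title
;;   ** source sentence
;;   *** target sentence
;;   notes on the target sentence

;; Hypothetical helper, not part of Emacs.
(defun my-outline-lines-at-level (level)
  "Return a list of the heading texts at exactly LEVEL stars, in buffer order."
  (let ((re (concat "^" (regexp-quote (make-string level ?*)) " \\(.*\\)$"))
        (lines nil))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward re nil t)
        (setq lines (cons (match-string 1) lines))))
    (nreverse lines)))

;; Hypothetical command built on the helper above.
(defun my-export-target-text ()
  "Collect all level 3 lines (the target text) in a buffer of their own."
  (interactive)
  (let ((lines (my-outline-lines-at-level 3)))
    (with-current-buffer (get-buffer-create "*target-text*")
      (erase-buffer)
      (insert (mapconcat 'identity lines "\n") "\n")
      (display-buffer (current-buffer)))))
```

Because the search starts at the beginning of the accessible portion, running the command in a narrowed buffer exports only the narrowed part of the job.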

It would be nice to see outline-mode generalized into a “show and hide” mode so that you could show body lines alone at whatever level you choose.


Narrowing

In effect, outline-mode or outline-minor-mode gives us a pre-structured kind of narrowing. Narrowing is more general in the sense that it can be arbitrarily applied to any portion of text in the buffer. Furthermore, with outline-mode or outline-minor-mode, it is quite possible to edit large chunks of the buffer that are not currently displayed. For example, if you delete or move the heading line, the entire entry under it including its body lines is deleted or moved. Narrowing, on the other hand, restricts editing to the narrowed portion alone.

Place a mark at one end and point at the other. Then type ctrl-x n n and the accessible portion of the buffer will be reduced to precisely that region only. See the right-hand buffer in Figures 3 and 4 for examples. In both, the right-hand buffer is narrowed. Only the accessible portion is available for editing. ctrl-x n w widens, to make the entire buffer accessible again.
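Narrowing is just as useful from Lisp as from the keyboard. The sketch below, with a hypothetical function name, confines a search-and-replace to one region by narrowing to it first; save-restriction puts the previous narrowing (or lack of it) back afterwards, so no explicit widen is needed.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-replace-in-region (start end from to)
  "Replace every occurrence of FROM with TO between START and END only."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)   ; same effect as ctrl-x n n on the region
      (goto-char (point-min))
      (while (search-forward from nil t)
        (replace-match to t t)))))   ; keep case as given, treat TO literally
```

Text outside the region is untouched, even if it contains the search string.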

Figure 3

Figure 4

In Figure 3, what you see is reduced to just the Japanese source text and the English target text, i.e., level 2 and level 3 outline “heading lines”. In Figure 4, what you see is expanded to include snips from on-line dictionaries, notes and comments, i.e., outline “body lines”. But the accessible portion produced by narrowing is the same in both cases. For Figures 3 and 4, I used outline mode to keep source text, translation, abbrevs, glossary entries, notes and commentary all in one file. The left buffer shows only level 2 outline header lines, i.e., the Japanese source text, whereas the right buffer in Figure 3 shows this plus the target text and Figure 4 shows all three: source, target and notes.


Abbrevs

I use abbrevs primarily to help enforce consistency on my English target texts but also to avoid some typing.

Let me take an example from my non-commercial work; it applies equally to all types of CE/JE translation. Buddhism has a large vocabulary of “technical terms” that constantly reappear. In the Buddhist texts I work on, five of the most frequent are:

  • prajnaparamita -> pp

  • mahaprajnaparamita-sutra -> mpps

  • utmost, right and perfect enlightenment -> urpe

  • bodhisattva -> bs

  • the buddha said -> tbs

With abbrev mode turned on (meta-x abbrev-mode), typing bs followed by a space instantly inserts bodhisattva, pp inserts prajnaparamita, etc. With abbrev mode turned off, I can still force an insert before point (the position of the cursor) with ctrl-x a e (expand-abbrev).
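Abbrevs are usually defined interactively, but the five entries above could equally be written as Lisp. This is a minimal sketch that puts them in the global abbrev table; whether you prefer the global table or a mode-specific one is a matter of taste.

```elisp
;; The five abbrevs from the list above, defined in the global table.
(define-abbrev global-abbrev-table "pp"   "prajnaparamita")
(define-abbrev global-abbrev-table "mpps" "mahaprajnaparamita-sutra")
(define-abbrev global-abbrev-table "urpe" "utmost, right and perfect enlightenment")
(define-abbrev global-abbrev-table "bs"   "bodhisattva")
(define-abbrev global-abbrev-table "tbs"  "the buddha said")

;; Turn abbrev mode on in the current buffer (same as meta-x abbrev-mode).
(abbrev-mode 1)
```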

When I am working on a commercial, technical JE translation job related to Linux, for example, I want to forget about Buddhist-related abbrevs, so I save and load files of abbrevs as appropriate.
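Switching abbrev sets between jobs might look like the following sketch. The file names are hypothetical, and the file being read must already exist. Note that read-abbrev-file adds to whatever is already defined, so the old set is cleared first.

```elisp
;; Hypothetical file names; use whatever layout suits your jobs.
(write-abbrev-file "~/abbrevs/buddhist.abbrevs")  ; save the current definitions
(kill-all-abbrevs)                                ; forget them for this session
(read-abbrev-file "~/abbrevs/linux.abbrevs")      ; load the set for the next job
```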

Bookmarks and Registers

Bookmarks have arbitrary (long) names and remain from one Emacs session to another. Registers have one-letter names and disappear as soon as a session is ended. A bookmark tells Emacs where to go in a buffer, whereas a register stores data to insert into a buffer, such as a filename, a window configuration, a piece of text or a rectangle. Yes, a register can also store a location and thereby pretend it is a temporary bookmark.
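For comparison, here is a sketch of the register calls behind the usual key bindings. The register letters w and p are arbitrary choices for illustration.

```elisp
;; Registers last only for the current session.
(window-configuration-to-register ?w)  ; ctrl-x r w w: save the window layout
(point-to-register ?p)                 ; ctrl-x r SPC p: save a location
(jump-to-register ?w)                  ; ctrl-x r j w: restore the layout
(jump-to-register ?p)                  ; ctrl-x r j p: return to the location
```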

You would think the proper function of a bookmark is simply to mark your place in a file so that whenever you revisit it, you will go to that location; however, I don't need bookmarks for this. I usually leave off at the location I want to return to, so that simply switching buffers via command, menu or ctrl-mouse_button_1 takes me back to where I was. Revisiting a file takes me there too, because I have place-saving turned on. (Use toggle-save-place to turn it on for a specific file and setq-default save-place t to turn it on globally for all files. [Added in Emacs version 19.19.]) Therefore, I feel free to use bookmarks for something less ordinary. ctrl-x r m bookmark_name sets the bookmark; the bookmark name is a string of Japanese or Chinese text, i.e., my glossary entry.
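The global place-saving setting mentioned above goes in .emacs roughly as follows. This is a sketch: the first form matches the setq-default described in the text, and the commented-out line is the equivalent call in Emacs 25.1 and later.

```elisp
;; Remember point in every visited file between sessions.
(require 'saveplace)
(setq-default save-place t)

;; In Emacs 25.1 and later, use this instead:
;; (save-place-mode 1)
```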

In Japanese, Emacs automatically sorts the kana bookmark names. (Kana are the graphic forms used to write the Japanese syllabary.) When I click on a bookmark, I am propelled directly to the work file where that word or phrase resides in an actual job. Beginning with Emacs 19.29, bookmarks could carry annotations. (Type ctrl-h m after meta-x list-bookmarks for more information on annotations.) I have co-opted this feature to carry the English glossary or meanings of the glossary entries (i.e., the bookmark names). If the annotation is sufficient, I don't need to visit the glossary entry in its location within a working file. If I want to see the context in which I have translated the word, phrase or sentence, one click lets me take a look, regardless of whether that file is currently being visited by an Emacs buffer or not. I can display the location in the file in a buffer that replaces my current bookmark editing buffer or have Emacs put it in a separate window adjacent to the buffer holding the bookmark list. I can also just ask Emacs to tell me the name of the file where the bookmark can be found without actually displaying it.
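The glossary trick can also be driven from Lisp. The sketch below uses function names from the bookmark.el shipped with current Emacs versions; I am assuming, not asserting, that older versions offer the same calls. The bookmark name and annotation are placeholders standing in for a Japanese or Chinese term and its English gloss.

```elisp
(require 'bookmark)

;; With point on the term in a buffer that is visiting a work file:
(bookmark-set "glossary-term")                              ; ctrl-x r m
(bookmark-set-annotation "glossary-term" "English gloss")   ; attach the meaning

;; Later, from anywhere:
(bookmark-get-annotation "glossary-term")   ; the gloss, without visiting the file
(bookmark-get-filename "glossary-term")     ; which work file holds the term
(bookmark-jump "glossary-term")             ; ctrl-x r b: go to the term in context
```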

The one drawback to this neat but nonetheless makeshift device is that I can assign only one location to each glossary entry. If I want to add more, I must put them in the annotation.