What's GNU?

In this two-part article, Arnold shows us how small is beautiful when it comes to user interfaces.

This column briefly describes Plan 9 From Bell Labs, an operating system developed by the same group at Bell Labs that created Unix. We will focus on the user interface of Plan 9. It is interesting because the major components are either freely available from AT&T or have been cloned in freely available software. The article concludes next month.

In the late 1980s, the research group at Bell Labs started to feel that Unix had reached the end of its useful life as a research vehicle. They decided that it was time to start over, taking the useful lessons learned from Unix and going on from there. The result was a brand new operating system, named Plan 9 From Bell Labs.

The result is documented in two sets of papers. The early papers discuss the overall design of Plan 9, its shell, compiler, and window system. The later set contains additional papers about the system and the entire reference manual for the system. What is really neat is that PostScript for all of this is available via anonymous ftp (see the sidebar). The reference manual is huge, over 650 pages; it helps to print it on a duplexing printer, if you have one available. A mailing list of Plan 9 licensees and other folks who are interested in Plan 9 is also available.

Plan 9 is a distributed system. It consists of three components: file servers, where all the user files live; CPU servers, where compute-intensive tasks are done; and terminals, which handle the user interface. The CPU and file servers are large machines that live in a machine room. At Bell Labs, they are connected by a high-speed fiber network, although the software does not require this. The terminals are small computers with mice, keyboards, bitmapped displays, and network connections to the file and CPU servers. Terminals may have local disk drives for performance reasons, but these are used only for caching files and are not strictly necessary.

Plan 9 is also a heterogeneous system. The operating system has been ported to the MIPS, Motorola 680x0, Intel 80386/486, and Sun SPARC architectures. At Bell Labs, they tend to use the MIPS systems for their servers and the other systems for the terminals, but again, that is not built into the software.

Plan 9 also has a number of nice innovations in the software architecture seen by the programmer. As a simple example, Unix has multiple system calls that affect the meta-information about a file (owner, mode, etc.), such as chown, chmod, and utime. In Plan 9, there is only one, wstat, which writes the stat information about a file. As another example, all user and group names are returned by the system as strings; the programmer never has to manage the conversion between numeric user ids and strings. There are many other very elegant improvements upon the Unix design in Plan 9.
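To make the contrast concrete, here is a minimal sketch of the Unix side only, written in Python, whose os module wraps chmod, utime, and chown as separate calls. The function name show_unix_metadata and the particular mode and timestamp values are made up for illustration; this is not Plan 9 code and it does not show wstat itself.

```python
import grp
import os
import pwd


def show_unix_metadata(path: str) -> dict:
    """Update and report file metadata the Unix way (illustrative sketch)."""
    st = os.stat(path)

    # One separate call for each kind of metadata.
    os.chmod(path, 0o640)                 # permission bits
    os.utime(path, (0, 86400))            # access and modification times
    os.chown(path, st.st_uid, st.st_gid)  # owner and group (re-asserts the current ones)

    st = os.stat(path)

    # The kernel reports numeric ids; turning them into names is the
    # programmer's job. An id with no database entry is shown as a number.
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)

    return {
        "mode": oct(st.st_mode & 0o777),
        "mtime": int(st.st_mtime),
        "owner": owner,
        "group": group,
    }
```

Three calls and two lookups are needed here for what the text above describes as a single write of the stat information, with names already delivered as strings.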

Plan 9 is also one of the first systems to use Unicode, a 16-bit character set. The sam and 9term programs discussed below also support Unicode, making it possible, for example, to type a real smiley character instead of the usual three-character ASCII glyph.
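The difference between the two smileys can be sketched in a few lines of Python. The article does not say which character is meant, so the code assumes U+263A (WHITE SMILING FACE); the helper name utf16_units is invented for this illustration.

```python
import unicodedata

ascii_smiley = ":-)"        # three separate ASCII characters
unicode_smiley = "\u263a"   # assumed code point: WHITE SMILING FACE


def utf16_units(text: str) -> int:
    """Count the 16-bit units needed to store text."""
    return len(text.encode("utf-16-be")) // 2


print(len(ascii_smiley), utf16_units(ascii_smiley))
print(len(unicode_smiley), utf16_units(unicode_smiley))
print(unicodedata.name(unicode_smiley))
```

The ASCII version occupies three characters, while the Unicode version is a single character that fits in one 16-bit unit.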

(A parenthetical note on my soapbox. In many ways, Plan 9 is a considerably better design than Unix. It would be worthwhile for those interested in a free version of Plan 9 to consider starting from the Linux code base, using the device drivers, memory management, and whatnot. Linux itself is and will remain a Unix clone, and Unix is not Plan 9. Starting from Linux will be particularly easy when Linux 2.0 comes out, as it will be multi-platform, like Plan 9, or so I'm told.)

This should whet your appetite. Both the early and the current Plan 9 papers are well worth reading. The manual is also fun to browse.

Plan 9 is not (unfortunately) generally available. Universities may license it from AT&T at no cost (other than the time a lawyer spends reviewing the license). Once the license is signed, AT&T sends one hard copy of the manual and a CD-ROM. The current (as of December 1994) release is almost two years old, and the system has evolved somewhat since then. A new release, using PC-based hardware as the porting base, is in preparation, but no release date is known yet. The AT&T researchers are working towards a way to release it more generally, but it will still require some kind of license; it will not be freely available the way Linux, NetBSD, or FreeBSD are.

In this article, we will take a look at the Plan 9 editor, windowing system, and shell. They are important, because the editor is freely available, and there are freely available clones of the others.

The sam Editor

The Plan 9 editor is named sam. (Some history here. The original Unix editor was ed. It was command-driven. Rob Pike wrote a mouse-driven editor for the Blit terminal named jim. The successor to jim was sam, also written by Rob Pike. Basically, they're all a bunch of friendly, down-to-earth sorts of programs. :-) I'm told that sam is short for “samantha”, and female.)

sam is a multi-file, multi-window editor that elegantly combines extended regular expressions (egrep-style) and the powerful ed command set with mouse-driven text selection, cutting, and pasting. In particular, all operations act upon the selected text, which can span multiple lines. Replacement text can include newline characters as well.

sam also provides an infinite “undo” capability, so you don't have to worry about making mistakes.

One of the windows that sam provides you is the command window, where you type in commands. What is nice is that, just like the text in any other window, you can edit the text in the command window, then select the edited line with the mouse, and send it again as input. In other words, you can edit previous commands and submit them for execution again. If a substitution didn't work or do quite what you wanted it to do, undo the change, edit the command, and try again. Do this as often as you like. Or, if you used a series of commands on a chunk of text once, and need to do that series again, select all the command lines, and send them all at once. (The command window is similar to the mini-buffer in Emacs.)

As an example, when replying to email, I'll often include the original letter, with each line preceded by a > sign. Sometimes I end up with ragged text in which some lines are only partly filled, like this:

> So what
> is your opinion about the future life of
> systems like MVS, VMS, VM, and Solaris?

I can select these lines as a single group, and then reformat them with the following commands:

s/^> //g
| fmt
s/^/> /g

This removes the > signs, runs the text through fmt to make it look nice, and then adds the > signs back in. The result might be:

> So what is your opinion about the future life of systems
> like MVS, VMS, VM, and Solaris?

(In fact, I was able to snarf the commands out of my article text, paste them into the command window, edit them a bit, and then submit them to make the new text above.)
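For readers without sam at hand, the three steps just described (strip the > signs, run the text through fmt, add the > signs back) can be approximated in Python. This is only a sketch: the function name requote is invented, textwrap.fill stands in for fmt, and the width of 58 columns is an assumption chosen to match the result shown above.

```python
import re
import textwrap


def requote(selection: str, width: int = 58) -> str:
    """Refill quoted mail text, roughly like the three sam commands."""
    # Step 1: remove the leading "> " from every line.
    stripped = re.sub(r"^> ", "", selection, flags=re.MULTILINE)
    # Step 2: refill the paragraph (textwrap.fill stands in for fmt).
    filled = textwrap.fill(stripped, width=width)
    # Step 3: put "> " back at the start of every line.
    return re.sub(r"^", "> ", filled, flags=re.MULTILINE)


ragged = (
    "> So what\n"
    "> is your opinion about the future life of\n"
    "> systems like MVS, VMS, VM, and Solaris?"
)
print(requote(ragged))
```

Unlike the sam version, this works on a string rather than on a mouse selection, and fmt's own line-breaking rules may differ from textwrap's in other cases.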

The command language is particularly powerful, using a notation called “structural regular expressions”. Essentially, regular expressions can be cascaded together to select increasingly specific chunks of text upon which to operate. Here is an example from the sam paper. Suppose you wish to rename every occurrence of a variable n to num. You could use the following command:

 , x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/

The comma selects all lines (an abbreviation for 0 through $, the last line). The x command extracts text to operate upon. It is an iterator, meaning that the command following it will be executed for each match of the text. The sam paper explains the rest of the command: “The pattern [A-Za-z_][A-Za-z_0-9]* matches C identifiers. Next, g/n/ selects those containing an n. Then v/../ rejects those containing two (or more) characters, and finally c/num/ changes the remainder (identifiers n) to num.” The g and v commands are conditionals. g says execute the command only if the pattern matches; v is the opposite—execute the command only if the pattern does not match.
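The same cascade can be imitated in Python to show how each stage narrows the selection. This is a rough analogy, not how sam is implemented; the function name rename_n is invented for the illustration.

```python
import re

# x/[A-Za-z_][A-Za-z_0-9]*/ : iterate over every C identifier.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def rename_n(text: str) -> str:
    """Rename the identifier n to num, leaving longer identifiers alone."""

    def change(match: re.Match) -> str:
        word = match.group(0)
        contains_n = re.search(r"n", word) is not None    # g/n/
        is_longer = re.search(r"..", word) is not None    # v/../ rejects these
        if contains_n and not is_longer:
            return "num"                                  # c/num/
        return word

    return IDENTIFIER.sub(change, text)


print(rename_n("for (n = 0; n < len; n++) count += n;"))
```

Identifiers such as len and count contain an n but are longer than one character, so the v stage filters them out and only the bare n is changed.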

Simple changes are often made with the mouse. But for complex, sweeping changes, a command language such as the one in sam is essential. Indeed, this is why vi includes the ed command set as a subset.

As mentioned, sam is a multi-file editor. You can have several files open in windows at once, and several windows on the same file. This is particularly useful for cut and paste operations when going from one file to the next. The command language also provides commands for doing operations on all files that contain, or do not contain, a particular regular expression.

To summarize why I find sam attractive:

  1. It is multi-file and multi-window.

  2. It has a powerful command language that makes many editing operations easy.

  3. It is possible to edit your commands.

This last is particularly useful; it is one of those things that once you have it, you can't believe you ever lived without it.

sam is implemented on top of two libraries. The libframe library provides windows (frames) of text. This library is implemented in turn on top of libg, a graphics library. For Unix, the Plan 9 sam and libframe code is used, essentially unchanged, on top of libXg, an implementation of libg for the X Window System using the Xt toolkit. All of this software supports Unicode. It is possible, for example, to enter the ½ (one-half) symbol by typing ALT-1-2.

See the sidebar for the ftp location of sam; AT&T has graciously made it available free of any licensing worries. There is also a mailing list for sam users.

The mailing list archive includes a sam emulator for Emacs, written by Rick Sladkey (jrs@world.std.com).