PostScript, The Forgotten Art of Programming

A tutorial for beginners on writing PostScript files to display data.

The Alparon research group at Delft University of Technology aims to improve automated speech processing systems for information retrieval and information storing dialogues. The current focus is on dialogue management for a research project of Openbaar Vervoer Reisinformatie. The company provides information about Dutch public transport systems, ranging from local bus services to long-distance trains. It can give up-to-date travel advice from any address to any other address in the Netherlands. Last year it received over 12 million calls for information.

Since we use a corpus-based approach, we analyze tons of data. Due to the size of our data we do just about everything the Unix way: we use only stdin and stdout, and we run our scripts just as sed does. (Those who can't program write C/C++ programs; those who can try to stick with scripts as long as possible. See also the White Paper in the References.) Basically, we torture our data with Perl (and its little friends like awk, sed, tr, grep, find, et al.) until it is in a simple form, e.g., an x and a y value on each line.

Although we could import this into some fancy presentation program, we found that the PostScript files generated by these programs are often huge. That might be okay if you have just a couple of figures, but if you have a lot of them, you start to wonder if there is a better way. Of course there is; you can write the PostScript yourself, as I often do. In a Perl script I transform the x-y table into PostScript. Since LaTeX requires a bounding box, I always make the PostScript level-1 compliant.

In this article I will give you a crash course in how to write level-1-compliant PostScript, with enough instruction that you can make your own simple figures. I will begin with the basic operators, and then move on to drawing lines, filling shapes and drawing text. After that I will present a description of compliant PostScript and an example. I will show you how to draw a histogram, because a histogram combines all of these facets: lines, shapes and text.

PostScript Basics

Normally, when you wish to learn PostScript, you read the Blue Book (see References). If you just wish to know sufficient PostScript for most of your needs, keep on reading. PostScript is a Turing-complete stack language. The Turing-complete part (well, I am a theoretical computer scientist) means that it is as powerful as any other programming language. The stack part means that all computations are carried out on a stack.

For instance, run Ghostscript (not Ghostview) by typing gs. The command pstack, the basic debugging tool, shows you the current stack without changing it. Enter 1 2 3 4 pstack at the prompt and the four numbers are pushed onto the stack and then displayed, top element first.
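
As a small sketch of that session, the line below is what you type, and the comments show what Ghostscript prints in response. Note that pstack lists the top of the stack first, so the output reads in the reverse of the order typed.

1 2 3 4 pstack
% prints:
% 4
% 3
% 2
% 1

Because pstack removes nothing, all four numbers are still on the stack afterward.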

When you type the stack operator pop, 4 is popped off the stack. Next, type exch, and 2 and 3 will swap places. Another handy stack operator is dup, which duplicates the top element. The last important stack operator is roll, which takes two arguments, say n and j. The command n j roll (with n and j replaced by numbers, of course) rotates the top n elements of the stack by j positions; for positive j, elements move toward the top and those that fall off the top wrap around to the bottom of the group. So if the stack holds 1 3 2 2 (listed from bottom to top), the command 4 1 roll leaves 2 1 3 2.
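
The walk-through above can be traced step by step. In this sketch each comment lists the stack from bottom to top (the top element is rightmost), which is the reverse of the order in which pstack prints it. The clear operator and the final 4 -1 roll are additions for illustration: the first empties the stack so the trace starts fresh, and the second shows that a negative j rotates the other way.

clear        % start from an empty stack
1 2 3 4      % stack: 1 2 3 4
pop          % stack: 1 2 3
exch         % stack: 1 3 2
dup          % stack: 1 3 2 2
4 1 roll     % stack: 2 1 3 2
4 -1 roll    % stack: 1 3 2 2

Typing pstack after any of these lines is an easy way to check the comments for yourself.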

PostScript also has all the normal arithmetical operators, but since it is a stack language, you do your arithmetic in reverse Polish notation; i.e., the operators always follow the arguments. The standard arithmetical operators are add, sub, mul, div, idiv (integer division), and mod. PostScript also has trigonometric, logarithmic and exponential functions.
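
To make reverse Polish notation concrete, here is a minimal sketch using the operators just listed. The = operator, which is not introduced in the text, pops the top element of the stack and prints it; the expected output is given in the comments.

3 4 add 2 mul =    % (3 + 4) * 2, prints 14
10 4 sub =         % 10 - 4, prints 6
7 2 div =          % real division, prints 3.5
7 2 idiv =         % integer division, prints 3
7 2 mod =          % remainder of 7 / 2, prints 1

Notice that no parentheses are needed: the order in which values and operators appear fully determines the order of evaluation.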

PostScript works best if you do everything on the stack, but in some cases this isn't particularly convenient. PostScript also has variables, but they are a bit slower than the stack. When you start writing your own PostScript programs, you will often try to do everything with variables—this is considered a Bad Thing. With some practice you will use fewer and fewer variables. To give a variable a value you type:

/PointsPerInch 72 def

which assigns 72 to the variable named PointsPerInch. From then on, wherever PointsPerInch appears, PostScript replaces it with 72; that is, it pushes 72 onto the stack.

In PostScript you can also define subroutines. Basically this is the same as assigning a variable, only in this case the value is a code chunk enclosed in curly braces. For example:

/Inch { PointsPerInch mul } def
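
As a sketch of how the two definitions work together, the lines below repeat them and then call Inch. The = operator (an addition not used in the text) pops the result and prints it.

/PointsPerInch 72 def
/Inch { PointsPerInch mul } def
2 Inch =       % 2 * 72, prints 144
8.5 Inch       % leaves 8.5 * 72 = 612 on the stack, the width of US Letter paper in points

Calling Inch simply executes the code between the braces: PointsPerInch pushes 72, and mul multiplies it with the number already on the stack.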

PostScript also has flow control commands which are beyond the scope of this primer.