Typesetting with groff Macros

“Reports of troff's death are greatly exaggerated.” —W. Richard Stevens, 1998
Variables and Name Spaces

Groff has a set of about 50 predefined variables called number registers. These are the internal gauges of groff's typesetting machinery. While processing an input file, groff maintains these registers with the current value of such variables as page number, position on page and point size. Number registers are in a separate namespace from strings and macros, and are aliased with their own alias command, as in the following:

.ALIAS  ALIASNR aln.ALIASNR        _PTSIZE .s
.ALIASNR        _LEADING        .v

In this example, we first alias the command for aliasing numbers, adapting the methodology we used earlier. Then, we alias the read-only registers for the current point size and vertical line spacing, choosing to use the traditional typesetting terminology—“leading”—for the latter. Although not required, the above example also demonstrates the use of a specific convention we follow, to prefix aliases for system variables with an “_” (underscore character).

You can, of course, follow your own heart in these matters. But the use of a naming convention may help to distinguish the variables themselves from the names of the commands that set the variables, such as:

.ALIAS   PTSIZE   ps.ALIAS   LEADING  vs

These might be used in a macro as follows:

.MACRO    <fontsize:>    __END__.  PTSIZE  \\$1
.  IFELSE  "\\$2""  \{\
.      LEADING ( \\n[_PTSIZE]u * 120/100 )
.  \}
.  ELSE \\{\
.      LEADING  \\$2
.  \}
.__END__
With usage in a document:
.<p>A message to the world:
.<p>
.<fontsize:>  18p
Is groff great or what?
The first line of the macro sets the current point size to the value of the first argument to the macro. The second line introduces a compound if/else statement, using groff's string comparison syntax for the logical test. If the second argument is empty, the leading is set by taking the value of the point size now in the numeric register _PTSIZE, and increasing it by 20%. Otherwise, the leading is set to the value provided by the second argument.

Parentheses in a numeric expression permit the use of spaces within the expression. Otherwise, in the example above, we would need to use the less legible form without any spaces:

.LEADING  \\n[_PTSIZE]u*120/100

Numeric expressions are evaluated simply left to right, there are no operator precedence rules, and parentheses are required to explicitly change the order of evaluation.

All arithmetic operations and number registers are ultimately integer based. Groff internally translates all dimensional measurements into machine units (based on 72,000 units per inch for PostScript devices), providing a functional “illusion” of fractional dimensions and point sizes. This allows us to specify decimal terms such as 8.5i and 11.5p, which, in fact, evaluate to 612,000 and 11,500 machine units respectively. Numeric values can be specified in any of the units shown in Table 1.

Table 1. Groff Units

In practice, groff's internal use of integral math can have significant consequences for the macro developer. Consider what would happen if the expression above were instead stated:

.LEADING  (\\n[_PTSIZE]u * (120/100))

Using integer division, the parenthetical term of 120/100 would evaluate to one and the entire expression would then evaluate to the current point size, and not 20% larger as intended.

As it turns out, not all predefined number registers are, in fact, numeric. For example, the name of the current file being processed is in the read-only register .F:

.ALIAS     MESSAGE  tm.ALIASNR   _LINE    .c
.ALIASNR   _FILE    .F
.MESSAGE  Currently processing file \n[_FILE], line \n[_LINE].

Although both variables are evaluated using the syntax for number registers, _FILE returns the name of the current file as a string. Despite this anomaly, groff permits only numeric expressions in user-defined number registers. The example here, by the way, is one means of inserting debugging messages in your macro file during development. The .tm request—aliased above to .MESSAGE—sends any text that follows to the standard error stream.

Macros and “Copy” Mode

Observant readers may be wondering why the syntax for evaluating the number registers inside the <p> macro have two backslashes (e.g., \\n[#PARSKIP]u), rather than one (e.g., \n[_LINE]) as are shown above. The difference is subtle but important.

The reason for using two backslashes inside macro definitions is that we usually don't want the expression inside the macro to be evaluated at the time the macro is first read. Rather, we would like the expression to be evaluated every time the macro is played back. A double backslash is groff's escape sequence for the backslash character itself, providing the means of getting a single backslash to print in your output. When groff is reading in a macro for the first time—in what is called “copy mode”—it interprets everything as it usually does, including escape sequences. So when a double backslash is encountered in a macro definition, groff converts it to the single backslash the sequence represents. Then, whenever the macro is played back, the single backslash remaining is interpreted in the usual manner.

Although we could define macro variables with a single backslash, such as:

.MACRO   <p>.SKIP   \n[#PARSKIP]u
\# etcetera

This macro would always execute with the amount of paragraph prespace specified in the variable #PARSKIP at the time the macro was first read. You would be stuck with the same #PARSKIP for your whole document. By using two backslashes, as in our original definition of <p>, we can dynamically change the #PARSKIP variable anywhere in the document and as often as we like, for example:

\# user interface for setting parskip:.MACRO  <parskip:>    __END__
.    NUMBER  #PARSKIP  \\$1
.__END__
\#
\# tighten spacing between paragraphs:
.<parskip:>  0.4v
The new setting will now affect the format of all instances of the <p> macro that follow.

As we could expect, groff offers a useful extension in this area as well. The “\E” sequence represents an escape character that will not be interpreted in copy mode. So, our <p> macro could just as easily be written:

.MACRO  <p>.SKIP   \En[#PARSKIP]u
\# etcetera

The “\E” sequence will provide the same result as the “\\” double backslash sequence.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState