Typesetting with groff Macros

“Reports of troff's death are greatly exaggerated.” —W. Richard Stevens, 1998
Variables and Name Spaces

Groff has a set of about 50 predefined variables called number registers. These are the internal gauges of groff's typesetting machinery. While processing an input file, groff maintains these registers with the current value of such variables as page number, position on page and point size. Number registers are in a separate namespace from strings and macros, and are aliased with their own alias command, as in the following:

.ALIAS  ALIASNR aln.ALIASNR        _PTSIZE .s
.ALIASNR        _LEADING        .v

In this example, we first alias the command for aliasing numbers, adapting the methodology we used earlier. Then, we alias the read-only registers for the current point size and vertical line spacing, choosing to use the traditional typesetting terminology—“leading”—for the latter. Although not required, the above example also demonstrates the use of a specific convention we follow, to prefix aliases for system variables with an “_” (underscore character).

You can, of course, follow your own heart in these matters. But the use of a naming convention may help to distinguish the variables themselves from the names of the commands that set the variables, such as:

.ALIAS   PTSIZE   ps.ALIAS   LEADING  vs

These might be used in a macro as follows:

.MACRO    <fontsize:>    __END__.  PTSIZE  \\$1
.  IFELSE  "\\$2""  \{\
.      LEADING ( \\n[_PTSIZE]u * 120/100 )
.  \}
.  ELSE \\{\
.      LEADING  \\$2
.  \}
.__END__
With usage in a document:
.<p>A message to the world:
.<p>
.<fontsize:>  18p
Is groff great or what?
The first line of the macro sets the current point size to the value of the first argument to the macro. The second line introduces a compound if/else statement, using groff's string comparison syntax for the logical test. If the second argument is empty, the leading is set by taking the value of the point size now in the numeric register _PTSIZE, and increasing it by 20%. Otherwise, the leading is set to the value provided by the second argument.

Parentheses in a numeric expression permit the use of spaces within the expression. Otherwise, in the example above, we would need to use the less legible form without any spaces:

.LEADING  \\n[_PTSIZE]u*120/100

Numeric expressions are evaluated simply left to right, there are no operator precedence rules, and parentheses are required to explicitly change the order of evaluation.

All arithmetic operations and number registers are ultimately integer based. Groff internally translates all dimensional measurements into machine units (based on 72,000 units per inch for PostScript devices), providing a functional “illusion” of fractional dimensions and point sizes. This allows us to specify decimal terms such as 8.5i and 11.5p, which, in fact, evaluate to 612,000 and 11,500 machine units respectively. Numeric values can be specified in any of the units shown in Table 1.

Table 1. Groff Units

In practice, groff's internal use of integral math can have significant consequences for the macro developer. Consider what would happen if the expression above were instead stated:

.LEADING  (\\n[_PTSIZE]u * (120/100))

Using integer division, the parenthetical term of 120/100 would evaluate to one and the entire expression would then evaluate to the current point size, and not 20% larger as intended.

As it turns out, not all predefined number registers are, in fact, numeric. For example, the name of the current file being processed is in the read-only register .F:

.ALIAS     MESSAGE  tm.ALIASNR   _LINE    .c
.ALIASNR   _FILE    .F
.MESSAGE  Currently processing file \n[_FILE], line \n[_LINE].

Although both variables are evaluated using the syntax for number registers, _FILE returns the name of the current file as a string. Despite this anomaly, groff permits only numeric expressions in user-defined number registers. The example here, by the way, is one means of inserting debugging messages in your macro file during development. The .tm request—aliased above to .MESSAGE—sends any text that follows to the standard error stream.

Macros and “Copy” Mode

Observant readers may be wondering why the syntax for evaluating the number registers inside the <p> macro have two backslashes (e.g., \\n[#PARSKIP]u), rather than one (e.g., \n[_LINE]) as are shown above. The difference is subtle but important.

The reason for using two backslashes inside macro definitions is that we usually don't want the expression inside the macro to be evaluated at the time the macro is first read. Rather, we would like the expression to be evaluated every time the macro is played back. A double backslash is groff's escape sequence for the backslash character itself, providing the means of getting a single backslash to print in your output. When groff is reading in a macro for the first time—in what is called “copy mode”—it interprets everything as it usually does, including escape sequences. So when a double backslash is encountered in a macro definition, groff converts it to the single backslash the sequence represents. Then, whenever the macro is played back, the single backslash remaining is interpreted in the usual manner.

Although we could define macro variables with a single backslash, such as:

.MACRO   <p>.SKIP   \n[#PARSKIP]u
\# etcetera

This macro would always execute with the amount of paragraph prespace specified in the variable #PARSKIP at the time the macro was first read. You would be stuck with the same #PARSKIP for your whole document. By using two backslashes, as in our original definition of <p>, we can dynamically change the #PARSKIP variable anywhere in the document and as often as we like, for example:

\# user interface for setting parskip:.MACRO  <parskip:>    __END__
.    NUMBER  #PARSKIP  \\$1
.__END__
\#
\# tighten spacing between paragraphs:
.<parskip:>  0.4v
The new setting will now affect the format of all instances of the <p> macro that follow.

As we could expect, groff offers a useful extension in this area as well. The “\E” sequence represents an escape character that will not be interpreted in copy mode. So, our <p> macro could just as easily be written:

.MACRO  <p>.SKIP   \En[#PARSKIP]u
\# etcetera

The “\E” sequence will provide the same result as the “\\” double backslash sequence.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix