Literate Programming Using Noweb

Noweb is a tool designed to enable a programmer to write documentation and code at the same time, with the goal of producing code that is easy to understand and maintain.
Writing the Program in Noweb

With the knowledge of what comes out the end of the pipeline in hand, we can now describe the structure of the nw source file itself. The nw source file for our example program is given in Listing 1.

When you write a noweb program, you alternate between explaining some piece of code and providing the formal definition of that piece of code. You must indicate whether you are entering documentation or code by using one of two noweb tags.

To begin writing documentation, one starts with an @ symbol in the left column, followed by either a space or a newline. This indicates that all of the text that follows, at least up to the next tag, is documentation text. All documentation text is passed through the filtering process to the (LA)TeX file. Thus, the author is responsible for providing any special formatting such as sections, tables, footnotes and mathematical formulae which may be desired or needed in the documentation.

In addition to the standard (LA)TeX command set, noweb provides three additional control sequences. Any text surrounded by double square brackets in the text is typeset in the same fashion as literal code, and the \nowebindex and \nowebchunks commands expand into the two types of indices shown at the end of our example in Figure 1.

To indicate the beginning of a code chunk, you use double angle brackets surrounding a name for the code chunk followed by an equal sign:

<<code_chunk_name>>=

Everything following this construct is considered to be literal code or a reference to another chunk name. You reference another chunk name by placing its name in double angle brackets with no trailing equal sign. As with documentation, a code chunk is terminated when another tag is encountered. To continue a code chunk definition, you start a new code chunk using the same name within the brackets as the chunk to be continued.

The special formatting and cross-referencing of code chunks is handled automatically by noweb and requires no special input by the user—with the one exception of manually specifying identifier definitions.

To manually list identifiers which are defined in a given chunk, you terminate that chunk with a line of the form:

@ %def

The identifiers given on the line will be placed in a “Defines” footnote for that chunk and will automatically be cross-referenced and indexed by noweb as described in the previous section.

The process by which notangle extracts the code into a form suitable for the compiler or interpreter follows just a few simple rules. A root chunk is specified on the command line as the chunk to be extracted and assembled. This chunk is then printed line by line until a reference to another chunk is encountered. At this point, the referenced chunk is output line by line—and similarly for any chunks referenced therein. When the referenced chunk has been output, notangle continues the process of outputting the root chunk.

When dealing with continued chunks—two or more chunks sharing the same name—notangle concatenates their definitions in order of appearance into a single, named chunk. The extracted code for our example program is in Listing 2, and it can be seen that all spacing and indentation is preserved appropriately in the executable version.

Because of the way notangle extracts and assembles its input, the program can be presented and explained in the best order for human understanding. notangle will make sure that the program chunks are in the right order for the compiler or interpreter.

The Incantations

Now that we know how to create a program in noweb, we can examine the methods of generating our typeset and executable versions of the program. The noweb distribution provides a general shell script called, remarkably, noweb which drives the notangle and noweave processes. However, this method of invocation, though simple, is somewhat limited. We will focus here on using each tool separately as this method provides a more flexible approach.

When you invoke notangle, you specify a chunk name (a root chunk) to extract and assemble from the nw source file. If you fail to specify a chunk, notangle searches for a chunk named * to extract (this is the default root chunk in a noweb program). The notangle tool writes to stdout, so you must redirect this to a file of your choice. The general form of the command is:

notangle [-R
        [-filter

Thus, to extract the executable version of our example program we use:

notangle -Rautodefs.perl autodefs.perl.nw >
         autodefs.perl
The -R option specifies which root chunk to extract. The -L option is used to embed line directives, if they are supported by the compiler/interpreter you are using. The line directives refer to locations in the nw source file. Thus, when debugging your code, you never need to refer to the executable version. Rather, you can edit the code in the nw source file. The default format is for use with the C preprocessor, but it also works well with Perl, with one catch. The line directives are emitted whenever a chunk is entered or returned to, and refer to the next line of code. Therefore, in a script such as ours, a line directive winds up as the first line of the executable version before the #! line, rendering it non-executable. The fix for this is to delete the first line directive, or to move it below the first line and increment the line number by one.

One can write filters for use with either notangle or noweave that manipulate the source once it is in the pipeline. The pipeline representation of the nw source file in noweb is beyond the scope of this article (see the “Noweb Hacker's Guide” included in the documentation of the distribution). However, it should be noted that a filter could easily be constructed which automates the solution to the line directive problem.

The typeset version of the program is generated with the noweave tool. There are several useful options for noweave, all detailed in the man pages. We will only consider a few of the most important options here.

The first option of general interest concerns the desired output: you can specify -latex (default), -tex or -html as the formatting language to be used for the final documentation. Each of these options will supply an appropriate wrapper (which can be suppressed with the -n option) for the typeset version. You can write your nw source file intended for LATeX typesetting and still have the option of producing an HTML document by invoking noweave with the -html option and the LATeX-to-html filter (-filter l2h) included with the distribution.

The -x option enables cross-referencing and indexing of chunk names, as well as any identifiers which are automatically recognized by an “autodefs” filter. Using the -index option implies -x and provides cross-referencing and indexing for manually defined identifiers—those mentioned in @ %def statements in the nw source file.

Normally, noweave inserts additional information, such as the filename for use in page headers, as part of its wrapper. The -delay option causes noweave to suspend the insertion of this information until after the first documentation chunk. This is most useful when you wish to provide your own (LA)TeX wrapper to specify additional packages or to define your own special formatting commands. This implies a -n (omission of wrapper) option and requires that you make sure to include a \end{document} control sequence in a documentation chunk at the end of the file to complete the wrapper. Our example nw source file is written in this fashion.

Our typeset version (Figure 1) was produced by first extracting the autodefs.perl root chunk with notangle and making it executable using the chmod system command. We then placed this executable in the noweb library directory and invoked noweave as:

noweave -autodefs perl -delay -index
        autodefs.perl.nw > autodefs.tex

We can then run LATeX on the resulting file—twice, to resolve page references—to create the dvi file, then use dvips to create the postscript version for inclusion with this article.

Additional options allow you to have the index created from an external file, to expand tabs and to specify alternative formatting options provided by the included noweb.sty file. The latter includes options to omit chunk numbering in the left margins, change text size in code chunks and switch from the symbolic cross-referencing of code chunks occurring at the right margin to simple footnote style cross-referencing similar in style to the “Defines” and “Uses” footnotes.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix