Literate Programming Using Noweb
In essence, the purpose of literate programming (LP) can be found in the following quote:
“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.”—Donald E. Knuth, 1984.
Such an environment reverses the notion of including documentation, in the form of comments within the code, to one where the code is embedded within a program's description. In doing so, literate programming facilitates the development and presentation of computer programs that more closely follow the conceptual map from the problem space to the solution space. This, in turn, leads to programs that are easier to debug and maintain.
When writing literate programs, one specifies the program description and the program code in a single source file, in the order best suited to human understanding. The program code can be extracted and assembled from this file into a form which the compiler or interpreter can understand—a process called “tangling”. Documentation is produced by “weaving” the description and code into a form ready to be typeset (most often by TeX or LATeX).
Many different tools have been created for literate programming over the years. Most of the more popular are based, either directly or conceptually, on the WEB system created by D. E. Knuth (“Literate Programming”, The Computer Journal (27)2:97-111, 1984). This article focuses on Norman Ramsey's noweb—a simple to use, extensible literate programming tool that is independent of the target programming language.
When you write a literate program using noweb, you create a simple text file (which by convention has a .nw extension) in which you provide all of the technical documentation for the various parts of the program, along with the actual source code for each part of the program.
This file ( Listing 1 ), which we call the nw source file, is then processed by noweave to create the documentation in a form ready for typesetting (the typeset version of the program is shown in Figure 1), or by notangle to extract the code chunks and assemble them in their proper order for the compiler or interpreter (the executable version of the program is in Listing 2 ). These two processes are not stand-alone programs, but a set of filters through which the nw source file is piped. It is this pipeline system that makes noweb both flexible and extensible, since the pipelines can be modified and new filters can be created and inserted in the pipelines to change the behavior of noweb.
Like most literate programming tools, noweb depends on TeX or LATeX—(LA)TeX to refer to either—for typesetting the documentation (although it has options for producing HTML output as well). However, one need not be a (LA)TeX guru to produce good results. All of the hard work of cross-referencing, indexing and typesetting the code is handled automatically by noweave.
The best way to get a feel for the capabilities of noweb is by reference to the finished product: the typeset version of a program. Figure 1 represents the typeset version of a Perl script that actually extends noweb's functionality by providing a limited “autodefs” filter. This filter will recognize and mark package and subroutine names for automatic cross-referencing and indexing.
When looking at this example, one can quickly see how chunks of actual code are interspersed throughout the descriptive text. Each code chunk is uniquely identified by page number and an alphabetic sub-page reference. For example, in Figure 1, there are four code chunks on the first page labeled in the left margin as 1a, 1b, 1c and 1d.
Besides the marginal tag, the first line of each code chunk also has its name and a chunk reference enclosed in angle brackets at the left margin and perhaps cross-reference information at the right margin. Lets examine chunk 1b more closely—a reasonable facsimile of its first line is:
This line tells us that we are now in chunk 1b. The <Global variables 1a>+= construct tells us we are working on the chunk named Global variables whose definition begins in chunk 1a. The += indicates that we are adding to the definition of Global variables. At the right margin we encounter (1d) <1a 1c>, which means that the chunk we are defining is used in chunk 1d, and that the current chunk is continued from chunk 1a and will be further continued in chunk 1c. It should be noted that all of these visual cross-referencing clues—with the exception of the chunk name itself—are provided automatically by noweb.
At the end of any chunk there are two optional footnotes—a “Defines” footnote and a “Uses” footnote. A user can manually specify, in the nw source file, a list of identifiers (i.e., variables or subroutines) which are defined in the current chunk. In addition, some identifiers may be automatically recognized, if an “autodefs” filter for the programming language is used. There are autodefs filters available for many languages including C, Icon, TeX, yacc and Pascal).
These identifiers are listed in the “Defines” footnote below the chunk where their definition occurs, along with a reference to any chunks which use them. Any occurrence of a defined identifier is referenced in a “Uses” footnote below the chunk that uses that identifier.
For example, in Figure 1, we see that chunk 1c defines the term $index_prefix which is used in chunk 2b. A quick peek at chunk 2b verifies that, indeed, this term is used and appears in the “Uses” footnote for that chunk.
Chunk 1d, autodefs.perl, represents the top level description of our entire program. This chunk is referred to as a “root” chunk in noweb and is not used in any other chunk. Our example has but one root chunk, although as many as you wish can be defined in your nw source file, and notangle can extract each of them into separate files.
The first line of code in chunk 1d is the obligatory #!/usr/bin/perl line which must begin all Perl scripts intended to be invoked as an executable program. However, the next two lines are not lines of Perl code at all but instead are references to other named chunk definitions. The code from those referenced chunks will be inserted at this point in the executable program extracted by notangle. Thus, we have a broad overview of our program, uncluttered by the specific global variable initializations and subroutine definitions.
Looking at chunk 2a, which is included in our root chunk, we see that it also includes another chunk, chunk 2b. This demonstrates that the inclusion of chunks can be nested to practically any level and can occur in any order in the documentation (definitions need not precede uses).
Our documentation ends with two optional indices provided by noweb—an index of code chunks and an index of identifiers.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Client-Side Performance
- Tibbo Technology's Tibbo Project System
- Sony Settles in Linux Battle
- Peppermint 7 Released
- Libarchive Security Flaw Discovered
- Maru OS Brings Debian to Your Phone
- The Giant Zero, Part 0.x
- Profiles and RC Files
- Git 2.9 Released
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide