Writing HTML with m4

Ease your creation and maintenance of web pages using this handy pre-processor called m4.

It's amazing how easy it is to write simple HTML pages—and the availability of WYSIWYG (what you see is what you get) HTML editors like Netscape Gold lulls one into a mood of “don't worry, be happy”. However, managing multiple, inter-related pages of HTML rapidly gets very difficult. I recently had a slightly complex set of pages to put together, and I started thinking, “there has to be an easier way.”

I immediately turned to the WWW and looked up all sorts of tools, but quite honestly I was rather disappointed. Mostly, they were what I would call “typing aids”: instead of having to remember arcane incantations like <a href="link">text</a>, you are given a button or a magic keychord like alt-ctrl-j which remembers the syntax and does all the typing for you.

Linux to the rescue: since HTML files are ordinary text, the normal Linux text management tools can be used. These include revision control tools such as rcs and text manipulation tools like awk, Perl, etc. They offer significant help with version control and with managing development by multiple users, as well as automating the display of information from a database (the classic grep | sort | awk pipeline).

The use of these tools with HTML is documented elsewhere, e.g., Jim Weirich's article in Linux Journal Issue 36, April 1997, “Using Perl to Check Web Links”. I highly recommend this article as yet another way to really flex those Linux muscles when writing HTML.

What I will cover here is work I've done recently using the pre-processor m4 to maintain HTML. The ideas can very easily be extended to the more general SGML case.

Using m4

I decided to use m4 after looking at various other pre-processors including cpp, the C preprocessor, which is perhaps a little too C-specific to be useful with HTML. m4 is a generic and clean macro expansion program, and it's available under most Unices including Linux.

Instead of editing *.html files, I create *.m4 files with my favourite text editor. These files look something like the following:

m4_include(stdlib.m4)
_HEADER(`This is my header')
<P>This is some plain text<P>
_HEAD1(`This is a main heading')
<P>This is some more plain text<P>
_TRAILER

The format is just HTML code, but you can include files and add macros rather like in C. By convention, I write my new macros in capitals and start them with an _ character, to make them stand out from the surrounding HTML and to avoid name-space collisions.
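
As a minimal sketch, a stdlib.m4 that would make the page above expand could look something like the following. The macro bodies are assumptions made up for illustration (the real stdlib.m4 is not shown here), and the m4_ prefix on the builtins assumes GNU m4 run with the -P option.

```m4
m4_dnl Hypothetical stdlib.m4 for the example page; the bodies are assumptions.
m4_dnl Divert to -1 so the blank lines between definitions are thrown away.
m4_divert(-1)

m4_define(`_HEADER', `<HTML>
<HEAD><TITLE>$1</TITLE></HEAD>
<BODY>
<H1>$1</H1>')

m4_define(`_HEAD1', `<H2>$1</H2>')

m4_define(`_TRAILER', `</BODY>
</HTML>')

m4_dnl Switch back to the normal output stream for the including file.
m4_divert(0)m4_dnl
```

With definitions along these lines, _HEADER(`This is my header') would expand to the page preamble with the title filled in from $1, and _TRAILER would close the BODY and HTML elements.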

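The m4 file is then processed with the following command to create an .html file. The -P option of GNU m4 adds the m4_ prefix to all the builtin macro names, which keeps them from clashing with ordinary words in the text: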

m4 -P <file.m4 >file.html

This process is especially easy if you create a makefile to automate these steps in the usual way. For example:

.SUFFIXES: .m4 .html
.m4.html:
        m4 -P <$*.m4 >$*.html
default:        index.html
*.html: stdlib.m4
all:    default PROJECT1 PROJECT2
PROJECT1:
        (cd project1; make all)
PROJECT2:
        (cd project2; make all)

Some of the most useful commands in m4 are listed here with their cpp equivalents shown in parentheses:
  • m4_include: includes a common file into your HTML (#include)

  • m4_define: defines an m4 variable (#define)

  • m4_ifdef: a conditional (#ifdef)

  • m4_changecom: change the m4 comment character (normally #)

  • m4_debugmode: control error diagnostics

  • m4_traceon/off: turn tracing on and off

  • m4_dnl: discards the rest of the line, including the newline, so it serves as a comment

  • m4_incr, m4_decr: simple arithmetic

  • m4_eval: more general arithmetic

  • m4_esyscmd: execute a Linux command and use the output

  • m4_divert(i): This is a little complicated, so skip it on first reading. It is a way of storing text for output at the end of normal processing, and it will come in useful later, when we get to automatic numbering of headings. It sends output from m4 to a temporary file number i. At the end of processing, any text which was diverted is output in order of file number. File number -1 is the bit bucket and can be used to comment out chunks of text. File number 0 is the normal output stream. Thus, for example, you can use m4_divert to divert text to file 1, and it will only be output at the end.
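
To make the list above a little more concrete, here is a small sketch that exercises several of these commands in one file. It assumes GNU m4 run with -P, and the names _DRAFT, _COUNT and _NEXT are hypothetical, invented only for this example.

```m4
m4_dnl Sketch only: process with GNU m4 -P. All _NAMES are hypothetical.
m4_define(`_DRAFT')m4_dnl
m4_define(`_COUNT', 0)m4_dnl
m4_dnl _NEXT adds one to _COUNT and expands to the new value.
m4_define(`_NEXT', `m4_define(`_COUNT', m4_incr(_COUNT))_COUNT')m4_dnl
m4_dnl Diversion 1 is held back until the end of processing.
m4_divert(1)m4_dnl
<P>This footer was written first but is output last.</P>
m4_divert(0)m4_dnl
m4_ifdef(`_DRAFT', `<P>Draft copy.</P>', `<P>Final copy.</P>')
<P>Section _NEXT, then section _NEXT.</P>
<P>A week has m4_eval(7 * 24) hours.</P>
<P>Built on m4_esyscmd(`date')</P>
```

Because _DRAFT is defined, the m4_ifdef line produces the draft paragraph; the two uses of _NEXT produce 1 and 2; m4_eval produces 168; and the footer diverted to file 1 appears after everything else.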
