Industrializing Web Page Construction
December 1st, 1997 by Pieter Hintjens
When I started building my company's web site about a year ago, I looked for a good visual web editor and, having found one, quickly produced some nice web pages. A week later, I had thrown the web editor away and was working on a tool to solve some of the major difficulties I had found. In this article I'll look at the result, a free HTML preprocessor written in Perl that makes mass production of web pages a feasible and economical task.
htmlpp was one of the first Perl programs I wrote, and I've not regretted the choice of language. Perl allows me to add functions to the program as fast as I can think of them. The consequence is that htmlpp is a very rich tool, making the task of maintaining a web site with thousands of pages easy.
There are at least a dozen free HTML preprocessors available today; I know of three with the name htmlpp. Something is driving people to write these programs, but what? Some 95% of the web pages I produce are on-line documentation, and I dislike building these by hand. Each page needs a standard header, footer and appearance. When I change my mind, it takes a lot of mouse clicks to go through each web page again, and a lot of care to make sure that every page conforms to my preferred style.
Thus, I started htmlpp with the idea: “take a large text file and break it into smaller web pages, adding pretty headers and footers, building the table of contents, cross-references and hyperlinks.” It would also be nice to define symbols like $(version) and place them into the text. How about conditional blocks so that I can generate frame and non-frame web pages from the same document, a way to share definitions between projects, a for loop to build structured text, access to environment variables and Perl macros, some more hot coffee and a raisin bagel?
htmlpp uses the term “document” to refer to the text files it inputs. This is a “hello world” document:
.echo Hello, World.
Here's something more involved:
.define new-year 01-01
.if "&date("mm-dd")" eq "$(new-year)"
. echo Happy New Year!
.else
. echo Hello, World.
.endif
If you've used C or C++, htmlpp will look very much like the C preprocessor. You get commands like .define, .include and .if that work in a similar fashion to their C preprocessor equivalents. For instance, the .if command works at “compile time”, i.e., when you build the HTML pages, not when they are displayed by the browser. Some other htmlpp commands were borrowed from the Unix shells.
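To make the “compile time” idea concrete, here is a minimal Python sketch of a build-time directive processor. It is not htmlpp's actual Perl code. It assumes only the simple `"a" eq "b"` form of .if, and it takes the current date from a hypothetical `today` symbol rather than from a function like &date. The function name `preprocess` is likewise made up for illustration.

```python
import re

def preprocess(lines, symbols=None):
    """Process .define/.if/.else/.endif/.echo at build time.

    Returns (output_lines, echoed_messages). Illustrative only: the real
    htmlpp evaluates .if conditions as Perl expressions.
    """
    symbols = dict(symbols or {})
    output, messages = [], []
    stack = []  # one boolean per open .if: is its current branch active?

    def expand(text):
        # Replace each $(name) with its value (empty string if undefined).
        return re.sub(r"\$\(([\w-]+)\)",
                      lambda m: symbols.get(m.group(1), ""), text)

    for raw in lines:
        line = raw.strip()
        active = all(stack)
        match = re.match(r"\.\s*(\w+)\s*(.*)", line)
        command, arg = match.groups() if match else (None, line)
        if command == "if":
            # Assumed simplification: only  "left" eq "right"  is understood.
            left, right = re.match(r'"(.*)"\s+eq\s+"(.*)"', expand(arg)).groups()
            stack.append(left == right)
        elif command == "else":
            stack[-1] = not stack[-1]
        elif command == "endif":
            stack.pop()
        elif not active:
            continue  # skip everything inside an inactive branch
        elif command == "define":
            name, _, value = arg.partition(" ")
            symbols[name] = value
        elif command == "echo":
            messages.append(expand(arg))
        else:
            output.append(expand(raw))
    return output, messages

document = [
    ".define new-year 01-01",
    '.if "$(today)" eq "$(new-year)"',
    ".  echo Happy New Year!",
    ".else",
    ".  echo Hello, World.",
    ".endif",
]
_, on_new_year = preprocess(document, {"today": "01-01"})
_, on_other_day = preprocess(document, {"today": "07-14"})
```

The branch is chosen while the document is being processed, so only the selected text ever reaches the generated page.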
Note how I define a symbol, new-year, and then use it in the document as $(new-year). htmlpp provides many variations on this theme; for example, the $(*...) form creates a hyperlink:
.define lj http://www.ssc.com/lj/
$(*lj="Linux Journal") is the magazine of the Linux community.
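The two symbol forms can be pictured as a pair of text substitutions. The following Python sketch shows one way to do this. The function name `expand_symbols`, the exact anchor markup it emits and the sample `version` value are assumptions for illustration, not htmlpp's documented output.

```python
import re

def expand_symbols(text, symbols):
    """Expand $(*name="label") into a hyperlink and $(name) into its value."""

    def link(match):
        # The symbol's value supplies the URL, the quoted text the label.
        name, label = match.group(1), match.group(2)
        return '<A HREF="%s">%s</A>' % (symbols[name], label)

    def value(match):
        return symbols[match.group(1)]

    # Handle the hyperlink form first so the plain form cannot consume it.
    text = re.sub(r'\$\(\*([\w-]+)="([^"]*)"\)', link, text)
    return re.sub(r'\$\(([\w-]+)\)', value, text)

symbols = {"lj": "http://www.ssc.com/lj/", "version": "2.0"}  # version is hypothetical
linked = expand_symbols('$(*lj="Linux Journal") is the magazine of the Linux community.', symbols)
plain = expand_symbols("This is release $(version).", symbols)
```

Keeping the URL in one definition means a changed address is corrected in a single place, however many pages refer to it.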
To define a counter which runs from 0 upwards:
.define counter++ 0
A realistic htmlpp script uses the .page command to create HTML pages. Listing 11 shows the template file supplied by htmlpp for your new projects.
Each HTML page gets a header and a footer. htmlpp lets you construct very complex headers and footers. This footer, taken from the htmlpp documentation, builds hyperlinks to the first, previous, next and last pages in the document, plus an index that lets the user jump to any page in the document.
.block footer
<HR><P>
| $(*FIRST_PAGE=<<) | $(*PREV_PAGE=<) | $(*NEXT_PAGE=>) | $(*LAST_PAGE=>>)
.build index
<P><A HREF="/index.htm">
<IMG SRC="im0096c.gif" WIDTH=96 HEIGHT=36 ALT="iMatix"></A>
Designed by <.HREF "/html/pieter.htm" "Pieter Hintjens"> © 1997 iMatix
</BODY></HTML>
.endblock
The .build index command builds the index by making a list of all the pages in the document. With an .if command, we can show the current page in relationship to the other pages. This is how I define the index:
.block index_open
<BR>
.block index_entry
.if "$(INDEX_PAGE)" eq "$(PAGE)"
| <.EM $(INDEX_TITLE)>
.else
| $(*INDEX_PAGE="$(INDEX_TITLE)")
.endif
.endblock
This code is beginning to get a bit complex, but the results are well worth the effort. The symbols in capital letters (e.g., $(PAGE), the file name for the current HTML page) are supplied by htmlpp. Some of these symbols, such as $(NEXT_PAGE), require that htmlpp go over the document several times. In fact, htmlpp will run through the document three or more times, until all cross references have been resolved. This multi-pass approach can be a little slow, but it is powerful enough to handle the footer block shown above.
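A forward reference such as $(NEXT_PAGE) needs more than one pass because the next page's name is not known when the current page is read. The Python sketch below shows the general technique: each pass uses the page list learned on the previous pass, and the loop stops when two passes produce the same output. It is an assumption-laden illustration, not htmlpp's algorithm. The one-argument `.page <filename>` form, the empty value for the last page's $(NEXT_PAGE) and the function names are all invented for the example.

```python
def run_pass(lines, known_pages):
    """One pass: record page names in order and expand $(NEXT_PAGE)
    from the page list gathered on the previous pass."""
    seen, output = [], []
    for line in lines:
        if line.startswith(".page "):
            seen.append(line.split(None, 1)[1])
            continue
        if "$(NEXT_PAGE)" in line:
            index = len(seen)  # position of the page after the current one
            # Assumption: an unknown or missing next page expands to "".
            next_page = known_pages[index] if index < len(known_pages) else ""
            line = line.replace("$(NEXT_PAGE)", next_page)
        output.append(line)
    return seen, output

def build(lines, max_passes=5):
    """Repeat passes until the output stops changing (a fixed point)."""
    known, previous = [], None
    for passes in range(1, max_passes + 1):
        known, output = run_pass(lines, known)
        if output == previous:
            return output, passes
        previous = output
    raise RuntimeError("cross-references did not settle")

document = [
    ".page intro.htm",
    "Next: $(NEXT_PAGE)",
    ".page usage.htm",
    "Next: $(NEXT_PAGE)",
]
result, passes = build(document)
```

In this sketch the first pass learns the page names, the second resolves the references and the third confirms that nothing changed.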
The .build toc command builds a table of contents, a vital part of any large document. htmlpp comes with a small file, contents.def, that does this job. To build the table of contents, you do the following:
.include contents.def
The contents.def file first defines three blocks (toc_open, toc_entry and toc_close) and then does a .build toc:
.block toc_open
<MENU>
.block toc_entry
<LI><A HREF="$(TOC_HREF)">$(TOC_TITLE)</A></LI>
.block toc_close
</MENU>
.end
<P>
.build toc
<HR>
htmlpp uses such predefined blocks for headers, footers, indexes, tables of contents and other constructions. You can define your own blocks in order to pull standard chunks of HTML text into your pages. You can also use .include commands, but this practice can lead to the creation of many small files.
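The role of the three toc blocks can be sketched in a few lines of Python: the open block is emitted once, the entry block once per page with $(TOC_HREF) and $(TOC_TITLE) filled in, and the close block once at the end. The function name `build_toc` and the sample page list are hypothetical, and this shows only the idea, not how htmlpp implements .build toc.

```python
import re

def build_toc(blocks, entries):
    """Render toc_open once, toc_entry once per (href, title), then toc_close."""

    def expand(template, symbols):
        # Fill in $(NAME) placeholders from the per-entry symbol table.
        return re.sub(r"\$\((\w+)\)", lambda m: symbols[m.group(1)], template)

    parts = [blocks["toc_open"]]
    for href, title in entries:
        parts.append(expand(blocks["toc_entry"],
                            {"TOC_HREF": href, "TOC_TITLE": title}))
    parts.append(blocks["toc_close"])
    return "\n".join(parts)

blocks = {
    "toc_open": "<MENU>",
    "toc_entry": '<LI><A HREF="$(TOC_HREF)">$(TOC_TITLE)</A></LI>',
    "toc_close": "</MENU>",
}
# Hypothetical pages, standing in for the headers htmlpp collects itself.
entries = [("intro.htm", "Introduction"), ("usage.htm", "Usage")]
toc = build_toc(blocks, entries)
```

Because the markup lives in the blocks rather than in the builder, redefining toc_entry changes the look of every table of contents without touching the documents.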
The key to unlocking htmlpp's real power is learning a little Perl. When you use the .if command, for instance, you use Perl. So, I can write something like this:
.if $ENV{"RELEASE"} eq "test"
It's also possible to run Perl programs and pipe the output into your HTML pages or to extend htmlpp's syntax with your own functions. Finally, since htmlpp comes with source code under the GNU General Public License, you can change the tool in any way you wish.
At the other extreme, you can use htmlpp in “guru mode” to turn a simple text file into structured HTML pages. All you need to do is mark the section headers. htmlpp inserts a table of contents, breaks the document into pages, adds headers and footers, detects numbered and bulleted lists, paragraphs, tables and so on. This is a quick and lazy way to produce useful HTML pages without tagging every paragraph.
To use htmlpp, you have to be happy writing HTML by hand (unless you work in guru mode). In return, you get an economical way to maintain large web sites without losing any control over the quality of your work.
To install and use htmlpp, you need Perl version 4 or 5. Download htmlpp from http://www.imatix.com/ and unpack the .zip file. The package comes with HTML pages describing how to install and use it. If you have questions, comments or suggestions, don't hesitate to send me e-mail.