Writing Man Pages in HTML
The Web invasion of the Internet continues at a breakneck pace. In the space of just a few years HTML has become one of the most widely supported document formats. Currently, there's gold-fever driven effort by a cast of thousands to reformat older sources of information for presentation on the Web. The required technology is relatively simple to understand, and is inexpensive to assemble. Linux is an ideal development and delivery environment: low cost, reliable, and necessary software is included in several competing Linux distributions.
In this article I'll discuss a solution I've written for serving up old documents in the new medium—automated translation. The documents to be converted are the Unix man pages. Man pages are highly structured documents in which major headings and references to other documents, are easily recognized. The files making up the man page system are already organized into a rigid set of hierarchies using a formalized naming system; thus, it is easy to deliver the entire existing man system and document format via the Web without reorganization of content or overall structure. I've assembled some pieces of software that translate the old Unix manual format into HTML while preserving the old style and organization. The technologies present in the Web allowed further enhancements: documents can be cross-linked; alternate forms of indices can be automatically generated; and full text searches are possible.
I've called the package I've assembled vh-man2html. It is designed to be activated as a set of CGI (Common Gateway Interface) scripts from a web server—the same technology that drives HTML forms. This means that vh-man2html can be used to serve man pages to other hosts on your LAN or on the Internet. The principle component of vh-man2html is Richard Verhoeven's man2html translator, augmented by several scripts to generate indexes and facilitate searches. vh-man2html also has some supporting scripts that allow you to drive Netscape from the Unix command line, so that vh-man2html can actually replace “man” on systems that include Netscape.
If you have access to the Web, you can see vh-man2html in action at:
where the man pages for Caldera 1.0 are available on-line.
Figure 1 shows the main vh-man2html web page which provides direct access to individual man pages or to three different kinds of indices: name-description, name-only and full text search. In the main window you can enter the man page name, man page name and section number, or narrow things down even further by specifying the hierarchy or full path name.
Figure 2 shows a man page generated by the man2html converter. The converter has translated the man formatting tags to approximately equivalent HTML tags. It has also created an HTTP reference and links for any references to other man pages. The converter also generates a subject heading index, which is useful when reading larger man pages. Text highlighting and font changes are correctly translated, and if tables are present, they will be translated into HTML tables. At this time, man2html doesn't translate eqn described equations, but since very few man pages use eqn, this is not a major drawback.
Figure 3 shows part of a name-description index for section 1 of the man pages. The index includes an alphabetic sub-index and links into the other sections of the manual. The name-only index in Figure 4 is similar to the name-description index, except that it is more compact.
Figure 5 shows the result of a full text search for pages containing a reference to “cdrom”. Links are generated to each page matched.
These figures illustrate one of the key advantages of serving up the man pages via the Web: a variety of access paths can be presented in an integrated form—a one-stop-shop interface.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Google's Abacus Project: It's All about Trust
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Seeing Red and Getting Sleep
- Fancy Tricks for Changing Numeric Base
- Secure Desktops with Qubes: Introduction
- Working with Command Arguments
- Secure Desktops with Qubes: Installation
- CentOS 6.8 Released
- Linux Mint 18
- The Italian Army Switches to LibreOffice
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide