More Flexible Formatting with SGMLtools
In the October 1995 issue of LJ, Christian Schwarz presented a short overview of Linuxdoc-SGML as it stood then: a complete, out-of-the-box package that gave, and still gives, authors a chance to write once and present anywhere. From flat ASCII to typeset PostScript and hypertext HTML, it all rolls out from a single SGML source file. Since then, a series of small and large changes led to the package being renamed SGML-Tools (and then SGMLtools, because the hyphen caused confusion) to indicate that it wasn't just for Linux anymore. Still, we, the SGMLtools project authors, weren't satisfied, so we set out to build an even better package, SGMLtools 2, which is presented here. This article gives a brief overview of the changes to SGML-Tools 1 that led us to release it as SGMLtools 2; more extensive information can be found on the SGMLtools web site (see Resources).
An issue that came up again and again was that the shortcomings of the Linuxdoc document type definition were beginning to show. A document type definition (DTD) is the SGML term for the set of rules that fixes how an SGML document compliant with that DTD must look. It defines the structure of the document, from titles and subtitles to tables.
Maintaining a document type definition, as we found out, is quite difficult. Constant discussion took place over which features should be allowed, how to make existing features better, and whether to stick with pure structural markup or be a little bit pragmatic about things. The same debates kept resurfacing and began to interfere with progress. The Linuxdoc DTD was clearly too limited, but we didn't want to redesign it without finding out whether alternatives already existed.
We quickly came to the conclusion that the DocBook DTD, as developed by the Davenport Group, would be a good successor to the Linuxdoc DTD. DocBook, developed by professionals for professionals with an emphasis on technical documentation, fits the target audience for SGMLtools very well and solves a number of the problems of Linuxdoc. Furthermore, almost every SGML vendor supports DocBook, which makes users less dependent on us and gives them more ways to process SGML documentation. Recently, responsibility for maintaining DocBook has been transferred to the Organization for the Advancement of Structured Information Standards (http://www.oasis-open.org/), ensuring that DocBook will continue to be widely supported.
The acronym DSSSL may not say much to the average reader, but it stands for another significant change in SGMLtools. DSSSL (Document Style Semantics and Specification Language) is a language used to specify how SGML documents will look. It helps in translating structural markup such as “section” to a certain formatting style like “Helvetica Bold, 18 points”, building up tables of contents and more. It is much more powerful than the mapping files used previously, because it can act on context and allows you to define functions. As DSSSL is based on Scheme, you can do just about anything you wish.
We chose to use DSSSL not only because of its power, but also because it is an industry standard (unlike the old method and the alternatives we evaluated). It also helped us jump-start the project, because a complete set of DSSSL styles for the DocBook DTD is available.
SGMLtools 2 is a collection of tools based around three core elements:
the DocBook DTD
the standard DocBook DSSSL files
Jade, the SGML/DSSSL parser
When you hand your SGML source to SGMLtools (with the command sgmltools), it does little more than call Jade with the name of the SGML file, the name of the DSSSL file to apply to it and the requested output format. The following sections describe the process in some detail. It is not difficult to understand, and a basic knowledge of what happens during a run of SGMLtools helps a great deal when you want to make modifications.
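As a rough illustration of the call described above, the following Python sketch assembles a Jade command line from the three inputs. It relies on Jade's -t option (output backend) and -d option (DSSSL specification). The function name build_jade_command and the file names guide.sgml and ldp.dsl are hypothetical, and the real sgmltools wrapper may well pass additional options.

```python
from typing import List


def build_jade_command(sgml_file: str, dsssl_file: str, backend: str) -> List[str]:
    """Return the argument list for one Jade run; nothing is executed here."""
    # -t selects the output backend (for example "tex" or "rtf"),
    # -d names the DSSSL specification to apply to the document.
    return ["jade", "-t", backend, "-d", dsssl_file, sgml_file]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_jade_command("guide.sgml", "ldp.dsl", "tex")))
```

The list could be passed to subprocess.run to start Jade, provided Jade is installed and the DSSSL file can be found.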
Jade first reads the SGML file and tries to find the document type definition from the document type declaration at the beginning of the file. For example:
<!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN">
appears at the beginning of a DocBook-compliant document. (Note that article names the document's top-level element; it can be any element defined in the DocBook DTD, so even para can be used for a single-paragraph document.) From the PUBLIC identifier, Jade obtains the file name of the DTD (see the sidebar on Public and System Identifiers), and if all this succeeds, the SGML source is checked for compliance.
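The lookup from public identifier to file name can be pictured with a small Python sketch. It assumes the mapping is kept in a catalog file with entries of the form PUBLIC "identifier" "file", which is one common arrangement for SGML parsers. The file name docbook.dtd and the helper names are hypothetical, and a real catalog supports more entry types than the single one handled here.

```python
import re
import shlex
from typing import Dict, Tuple

# Hypothetical catalog contents; the system file name is an assumption.
CATALOG_TEXT = '''
PUBLIC "-//Davenport//DTD DocBook V3.0//EN" "docbook.dtd"
'''

# Captures the top-level element name and the quoted public identifier.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+(\S+)\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def parse_catalog(text: str) -> Dict[str, str]:
    """Collect PUBLIC entries as {public identifier: file name}."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        fields = shlex.split(line)  # honors the double-quoted fields
        if len(fields) == 3 and fields[0] == "PUBLIC":
            entries[fields[1]] = fields[2]
    return entries


def resolve_dtd(doctype: str, catalog: Dict[str, str]) -> Tuple[str, str]:
    """Return (top-level element, DTD file name) for a declaration."""
    match = DOCTYPE_RE.search(doctype)
    if match is None:
        raise ValueError("no PUBLIC document type declaration found")
    root_element, public_id = match.groups()
    return root_element, catalog[public_id]  # KeyError if unmapped


if __name__ == "__main__":
    declaration = '<!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN">'
    print(resolve_dtd(declaration, parse_catalog(CATALOG_TEXT)))
```

An unmapped identifier raises KeyError in this sketch, which corresponds to the case where the DTD cannot be located and validation cannot start.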
After the document has been validated (found to comply with the DTD), Jade reads the indicated DSSSL file and executes it against the parsed SGML document. The DSSSL “program” reads the SGML document from objects in memory and outputs another memory structure called a Flow Object Tree (FOT). The FOT resembles the SGML document structurally, but it also contains information on fonts, sizes and other formatting options. Finally, Jade hands the FOT to one of its backends, which converts the generic style information into the backend's specific file format.
As a short example to illustrate this process, start with an SGML document with the line:
<Sect1><Title>Introduction</Title> ...
This is a top-level section with “Introduction” as the title. Once Jade has determined that this is a valid DocBook document, it reads a DSSSL file, perhaps ldp.dsl, which gives instructions for Linux Documentation Project style formatting.
The following rule could appear in the DSSSL file:
(element (SECT1 TITLE)
  (make paragraph
    font-family-name: "Times New Roman"
    font-weight: 'bold
    font-size: 20pt))
This expression says “for TITLE elements within SECT1 elements, output a paragraph in 20pt bold Times New Roman”. Taking some shortcuts, we can say that this expression results in a flow object with the given properties and the text “Introduction” for content (the concept of making a paragraph out of everything, even headings, will be familiar to people who have worked with desktop publishing [DTP] software). When everything is done, Jade hands all the flow objects to the backend, for example, TeX. This backend, upon encountering the flow object for our introductory section title, will output something like:
{\setfontfam{Times-Roman-Bold}\setfontsize{20pt}Introduction}
which can then be processed by TeX and a special TeX package to generate DVI and PostScript.
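The way a rule picks up an element in context and turns it into a flow object can be modeled loosely in Python. This is only a sketch of the idea, not of Jade's internals: the FlowObject class, the RULES table and the style function are hypothetical names, and real DSSSL rules can match much richer contexts than a parent and child pair.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FlowObject:
    """A formatted unit: what to print and how, independent of any backend."""
    kind: str
    text: str
    font_family: str
    font_weight: str
    font_size_pt: int


# Rules are keyed by (parent element, element), mirroring "TITLE within SECT1".
RULES: Dict[Tuple[str, str], Dict[str, object]] = {
    ("SECT1", "TITLE"): {
        "font_family": "Times New Roman",
        "font_weight": "bold",
        "font_size_pt": 20,
    },
}


def style(parent: str, element: str, text: str) -> Optional[FlowObject]:
    """Return a paragraph flow object if a rule matches, otherwise None."""
    # Element names are compared case-insensitively, as in the example document.
    props = RULES.get((parent.upper(), element.upper()))
    if props is None:
        return None
    return FlowObject(kind="paragraph", text=text, **props)


if __name__ == "__main__":
    print(style("Sect1", "Title", "Introduction"))
```

Because the rule is keyed on context, a Title inside some other element would not match this entry and could be given a different style of its own.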
Note that the beauty of DSSSL is that you talk only about style, not about specific instructions for specific formats. Whether TeX, RTF or groff, you'll always get at least a close equivalent of a “20pt Times New Roman Bold” section header. If you need to tune this, you can easily override pieces of DSSSL specifications for specific backends. Often, you'll at least have different DSSSL files for hardcopy and HTML output.
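The separation between style and output format, including a backend-specific override, can be sketched in Python as follows. Everything here is a simplified model: the TeX-like string only imitates the sample output shown earlier, the RTF string uses the \b and \fs control words (font size in half-points) without a font table, and the override that shrinks the RTF heading to 18pt is a made-up example.

```python
from typing import Dict

# One style, stated once, without reference to any output format.
BASE_STYLE: Dict[str, object] = {
    "font_family": "Times New Roman",
    "font_weight": "bold",
    "font_size_pt": 20,
}

# Hypothetical per-backend tuning; only the listed properties change.
OVERRIDES: Dict[str, Dict[str, object]] = {
    "rtf": {"font_size_pt": 18},
}


def style_for(backend: str) -> Dict[str, object]:
    """Merge the base style with any overrides for this backend."""
    return {**BASE_STYLE, **OVERRIDES.get(backend, {})}


def render(backend: str, text: str) -> str:
    """Turn the generic style into a backend-specific string (toy formats)."""
    s = style_for(backend)
    size = int(s["font_size_pt"])
    if backend == "tex":
        return ("{\\setfontfam{" + str(s["font_family"]) + "}"
                + "\\setfontsize{" + str(size) + "pt}" + text + "}")
    if backend == "rtf":
        bold = "\\b" if s["font_weight"] == "bold" else ""
        return "{" + bold + "\\fs" + str(2 * size) + " " + text + "}"
    raise ValueError("unknown backend: " + backend)


if __name__ == "__main__":
    for name in ("tex", "rtf"):
        print(name, render(name, "Introduction"))
```

The style is written once and each backend decides how to express it; an override touches only the properties that need tuning for that backend.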