Structuring XML Documents
Author: David Megginson
Publisher: Prentice Hall
Price: $44.95 US
Reviewer: Terry Dawson
Take a close look at any of the various documentation projects operating within the Linux community, and you will find SGML. The Linux Documentation Project, the Debian Documentation Project and others are using SGML as the primary tool in producing consistently structured and styled documentation. The search for a more sophisticated replacement for HTML has led to the development of XML, which is based heavily on SGML. XML has nearly all of the power and features of SGML, but will probably be much better supported because of the web-driven market for browsers and editors. For this reason, XML will probably replace SGML in many applications.
SGML and XML both provide a means of describing the structure of a document. Both rely on definitions called DTDs (Document Type Definitions) to describe that structure.
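For readers who haven't seen one, a minimal, hypothetical DTD (all element and attribute names here are invented for illustration) shows how document structure is declared:

```xml
<!-- article.dtd: a hypothetical example of a Document Type Definition -->
<!ELEMENT article  (title, para+)>          <!-- an article is a title plus one or more paragraphs -->
<!ELEMENT title    (#PCDATA)>               <!-- character data only -->
<!ELEMENT para     (#PCDATA | emphasis)*>   <!-- mixed content: text and emphasis -->
<!ELEMENT emphasis (#PCDATA)>
<!ATTLIST article  id ID #IMPLIED>          <!-- optional unique identifier -->
```

A document written against such a DTD can then be checked by a validating parser, which reports any element that appears where the content models forbid it.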
In this book, David Megginson competently explains the process of good quality DTD design. While the title suggests it is XML-specific, it is not. SGML and XML have so many similarities that it is possible to describe both simultaneously, highlighting differences between the two where they arise. This book is another in the Charles F. Goldfarb series, and in it, Mr. Megginson describes document structuring using both SGML and XML, managing to avoid confusing the reader during the process. The book has four main parts, and includes a CD-ROM with software that implements XML parsers, and a selection of modern and popular DTDs.
Part One provides some background on XML, describes how it differs from SGML and examines five popular and useful DTDs. This part isn't aimed at readers with no prior SGML or XML experience and isn't designed to teach you either language, but if you are familiar with at least one of them, it will help you learn about the other. The chapter on DTD syntax clearly illustrates the differences and similarities between the two. DTDs examined in detail are:
Text-Encoding Initiative (TEI)
HyperText Markup Language (HTML 4.0)
The first four of these are in common use and have inspired many other DTD designs. The CALS table design, for example, has been borrowed many times and used in other DTDs.
Part Two covers the principles of DTD analysis. The core material of the book begins in these chapters. They describe how to critically analyse a DTD from three important perspectives: ease of learning, ease of use and ease of processing. The ease with which a particular DTD can be learned is critically important in having a DTD accepted by authors. If the DTD is difficult to learn, authors will tend not to use it, use only a small subset of it, or worse, misuse it by bending it to suit their needs. Mr. Megginson describes how to analyse the ease of learning of a DTD, with the aim of teaching you to design easy-to-learn DTDs.
The chapter entitled “Ease of Use” describes how to analyse a DTD to determine if it will be easy for authors to use when they are writing their documentation. Some of the issues explored are the naming of tags and attributes, when to use a new tag and when to add an attribute to an existing tag, and structural issues that can simplify or complicate an author's job.
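One of those trade-offs, a new tag versus an attribute on an existing tag, can be sketched with a hypothetical fragment (names invented); the same distinction can be modelled either way, with different consequences for authors:

```xml
<!-- Option 1: add an attribute to an existing element; the tag set stays small -->
<!ELEMENT note (#PCDATA)>
<!ATTLIST note type (plain | warning) "plain">

<!-- Option 2: introduce a new element; the distinction is visible in the markup itself -->
<!ELEMENT warning (#PCDATA)>
```

Authors typing `<note type="warning">` must remember an attribute value, while `<warning>` is self-describing but enlarges the tag set they have to learn.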
The chapter on ease of processing is of particular interest to those who publish and develop processing tools. A DTD may be easy for the author to learn and use, but this doesn't always translate into something that is easy to process into printed or published form. The lessons are mostly common sense applied to the specific task of DTD design.
The third part of the book covers a number of advanced DTD maintenance and design issues. It will be of interest mostly to people who intend to use SGML or XML for purposes other than publishing, such as database systems or other information management applications. The first topic covered is that of DTD compatibility. I mentioned earlier that the CALS table design had been borrowed for use in other DTDs. When DTDs are similar, it is fairly simple to translate a document from one DTD to another. This is very useful if you wish to exchange documentation with a group which has a different DTD. This chapter describes how to identify compatibility and the advantages of keeping compatibility in mind when designing a DTD.
The second topic extends this discussion to exchanging document fragments. A document fragment might be a single chapter or paragraph from a book. If you wish to share portions of a document with a group using a different DTD, you will find useful tips in this chapter on ways to simplify the task.
The final topic in this section is DTD customisation. DTD customisation is the process of taking an existing DTD and modifying it to suit your specific purposes. Designing a sophisticated DTD can be a complex task. Often, there is little reason to design a DTD from scratch; an existing DTD may provide 95% of what you need, requiring only a small amount of customisation to fully meet your needs. This can save a lot of time and provides advantages in terms of document exchange and compatibility. This chapter describes how to customise DTDs, and how to design DTDs that are easy to customise. The DocBook DTD, for example, was designed with hooks in place that allow for easy customisation.
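In SGML and XML DTDs, such hooks are commonly built from parameter entities, which a customisation layer can redefine because the first declaration of an entity is the one that takes effect. A sketch of the idiom, with invented names (in practice the customisation layer declares its entities and then includes the base DTD):

```xml
<!-- Customisation layer: declared first, so this definition wins -->
<!ENTITY % local.para.mix "| productname">
<!ELEMENT productname (#PCDATA)>

<!-- Base DTD: exposes its content model through the parameter entity,
     empty by default, so an unmodified para allows only text and emphasis -->
<!ENTITY % local.para.mix "">
<!ELEMENT para (#PCDATA | emphasis %local.para.mix;)*>
<!ELEMENT emphasis (#PCDATA)>
```

With the redefinition in place, `para` elements may also contain `productname`, without a single line of the base DTD being edited.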
The fourth and final part of the book covers DTD design using a technique called Architectural Forms. Architectural Forms allow DTD designers to specify the method by which their DTD should be translated into one or more other DTDs. Architectural Forms allow you to write documents which are simultaneously valid for a number of different DTDs. This section of the book describes the concepts and the implementation of Architectural Forms and offers useful hints and advice to designers wishing to use this facility. I found this part of the book a little difficult to comprehend, but that was almost certainly due to my limited exposure to applications requiring use of this advanced technique. I'm confident that anyone with an application for Architectural Forms will find the information presented to be a good introduction to the topic.
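Mechanically, an architectural form is typically expressed as a fixed attribute, named after the target architecture, that maps each element in the client DTD onto an element of the base architecture. A hypothetical sketch (element and architecture names invented for illustration):

```xml
<!-- A client DTD maps its chapter-title element onto h1 in an
     HTML-based architecture via a #FIXED attribute named "html" -->
<!ELEMENT chapter-title (#PCDATA)>
<!ATTLIST chapter-title
          html NMTOKEN #FIXED "h1">
```

An architecture-aware processor reads these fixed attributes and can treat the same document as conforming to either DTD.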
I am pleased to report that the CD-ROM includes Linux versions of the XML parsing software. Two XML parsers are provided. The first is a precompiled version of the popular “SP” parser. The second is a Java-based XML parser called “Aelfred”. Each DTD described in the book is included in its SGML form, as well as a number of links to useful resources on the Internet. The CD-ROM provides some HTML-based documentation, but is otherwise not well documented. I am left with the impression that the CD-ROM was a last-minute addition to the book; nevertheless, it does provide tools to allow the reader to experiment with the techniques described, and to that end it is adequate.
I found Structuring XML Documents to be an interesting and informative book that I will certainly be using as a reference in the future. David Megginson has done a nice job of concisely capturing a lot of material while keeping the pace slow enough to allow one to absorb the information fairly comfortably. The book is ideal for both SGML and XML designers, and SGML designers should not be misled by the title. I recommend that anyone with an interest in DTD design, especially those involved with Linux-related documentation projects, take a look at this book. It is certain to be of assistance in your efforts.