XML & DocBook: Structured Technical Documentation Authoring

An introduction to XML and DocBook: what are they and why should you learn yet another data format?
General Guidelines for Writing Content

If you are an author, it is okay to skim or skip the above technical explanations, although it helps if you have an idea about the process. If anything, the process ensures that you don't need to bother with formatting and layout, because all of that is done after you've written and submitted the content.

As the author of technical documentation, your concerns are limited to things such as choosing the subject and scope of the document, developing a good outline or plan to guide you through the writing process, and researching and checking the accuracy of your sources and statements. For easy maintenance, it also might be a good idea to use a version control system such as CVS. Version control systems let you keep track of changes and roll back or restore files in case of trouble. They also help when multiple people are working on the same set of files.

Once you know what to write and have installed all the tools and subsystems, the time has come to pour your content into an XML file. Whether you have chosen an XML editor or a plain-text editor, it helps to have an idea of the structure of an XML document and of the tags available in the standard DocBook vocabulary. Then, if you want to represent a particular kind of object, such as a screenshot or terminal input, you know which tags to select or, at least, where to look for an overview of the possibilities. The full list of tags can be consulted in the DocBook Element Reference. It also shows which tags may be included within other tags: parent tags are listed along with all possible child tags.

Writing Articles, Manuals and FAQs

All DocBook XML files start with an XML declaration and a document type declaration, which together specify the XML version, character encoding, document type and location of the DTD. This article you are reading now, for instance, begins with the following lines:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
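The declaration as reproduced above stops after the public identifier. As a rough sketch, a complete DocBook XML V4.1.2 prolog normally continues with a system identifier that gives the location of the DTD, followed by the closing angle bracket. The URL below is the commonly used OASIS location and is an assumption here, not something taken from this article's own source file; the title and paragraph are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: the system identifier (the URL) is the usual OASIS
     location. Point it at whichever copy of the DTD your tools use. -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<article>
  <!-- Placeholder content so the file is a complete document. -->
  <title>Placeholder title</title>
  <para>Placeholder text.</para>
</article>
```

The root element named after DOCTYPE (article here) has to match the outermost tag of the document.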

Documents that contain only a few pages' worth of data usually are put in an article structure, using the <article> </article> tags around the text that follows the declaration. A simple sample article is represented by the following code:

Listing 1. Example of a Simple Article

<article>
  <articleinfo>
    <title>Example article</title>
    <author><affiliation>
      <address><email>spam.me@my domain</email></address>
    </affiliation></author>
    <abstract>
      <para>This is an example article demonstrating simple DocBook XML.</para>
    </abstract>
  </articleinfo>

  <section><title>This is the first section</title>
    <para>It is a very short section containing only one paragraph, enclosed
by para tags. The section has a title enclosed by title tags and is in
turn enclosed by the section tags.</para>
  </section>

  <section><title>This is the second section</title>
    <para>It also has only one paragraph.</para>
  </section>
</article>

In Listing 1, the first line after the declaration contains the opening article tag. The line after that begins the information about the article, including the title, author, affiliation, publication date and abstract tags. The title tag can be given as a child to many other tags. Next comes the author information, which allows for specifications of name, e-mail address and company or organization. Apart from company or organization, you also can specify organizational divisions, job titles and so on. After the author information comes the publication date, a section that also can contain remarks, trademarks, links and more. Next comes the abstract, a short description of the document's content.

Once the introductory information is supplied and the document has moved on to the main content, divisions in the document can be marked by sections and subsections. Sections and subsections have titles and consist of paragraphs. Normal text usually is enclosed by paragraph tags.
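As a minimal sketch of how sections and subsections nest, the fragment below places one section inside another. The titles and text are made up for illustration. In DocBook 4.x, the paragraphs that belong directly to a section come before any of its nested sections.

```xml
<section>
  <title>Preparing to write</title>
  <!-- Paragraphs of the outer section come first. -->
  <para>This paragraph belongs to the outer section.</para>

  <!-- A section inside a section acts as a subsection. -->
  <section>
    <title>Choosing a subject</title>
    <para>This paragraph belongs to the subsection.</para>
  </section>

  <section>
    <title>Drawing up an outline</title>
    <para>A second subsection at the same level as the first.</para>
  </section>
</section>
```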

Other types of content can be added using a variety of tags. The DocBook DTD consists of over 300 tags altogether. Among the more commonly used are:

  • itemizedlist or orderedlist parents and listitem children, used to create lists such as the one you currently are reading

  • figure and informalfigure parents with mediaobject, imageobject and textobject children, used to include graphics

  • screen and programlisting tags, used to display terminal output and source code, possibly using another type or size of font in the converted files

  • command, option, parameter and application tags, used to specify the type of entity between these tags, which usually results in italic or bold rendering or a different size or type of font

  • qandaset parents with qandaentry, question and answer children, used in a FAQ list

  • sect1 parents with sect2, sect3 and sect4 children, used to specify subsections

  • table parent tags with row and entry descendants (wrapped in tgroup and tbody tags), used to specify table rows and cells
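Several of the tags in the list above can be sketched together in one small section. All titles, text and the image path are hypothetical and serve only to show the nesting.

```xml
<section>
  <title>Sample markup</title>

  <!-- A bulleted list; use orderedlist instead for numbered items. -->
  <itemizedlist>
    <listitem><para>First item</para></listitem>
    <listitem><para>Second item</para></listitem>
  </itemizedlist>

  <!-- A titled figure; fileref points at a hypothetical image file. -->
  <figure>
    <title>Editor window</title>
    <mediaobject>
      <imageobject>
        <imagedata fileref="images/editor.png" format="PNG"/>
      </imageobject>
      <textobject><phrase>Screenshot of the editor window</phrase></textobject>
    </mediaobject>
  </figure>

  <!-- Inline tags say what a word is, not how it should look. -->
  <para>Run <command>ls</command> with the <option>-l</option> option
  to get a long listing.</para>

  <!-- screen keeps line breaks and spacing exactly as typed. -->
  <screen>$ echo hello
hello</screen>

  <!-- Table rows sit inside tgroup and thead or tbody. -->
  <table>
    <title>Output formats</title>
    <tgroup cols="2">
      <thead>
        <row><entry>Format</entry><entry>Typical use</entry></row>
      </thead>
      <tbody>
        <row><entry>HTML</entry><entry>Online reading</entry></row>
        <row><entry>PDF</entry><entry>Printing</entry></row>
      </tbody>
    </tgroup>
  </table>
</section>
```

The textobject inside the figure gives converters a text fallback for output formats that cannot show the image.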

Examples of these tags and many more can be found in the DocBook Element Reference and in the LDP Author Guide, which contains templates for different types of documents.
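For the FAQ case, a minimal sketch might combine the numbered section tags with a question-and-answer set, as below. The titles, question and answer are invented for illustration and are not taken from any of the templates mentioned.

```xml
<article>
  <title>Frequently Asked Questions</title>

  <!-- sect1 is the top level; sect2 nests inside it. -->
  <sect1>
    <title>Writing</title>
    <sect2>
      <title>Editors</title>

      <!-- Each qandaentry pairs one question with its answer. -->
      <qandaset>
        <qandaentry>
          <question>
            <para>Do I need a special editor to write DocBook?</para>
          </question>
          <answer>
            <para>No. A plain-text editor is enough, although an XML
            editor can help with choosing tags.</para>
          </answer>
        </qandaentry>
      </qandaset>
    </sect2>
  </sect1>
</article>
```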



Thanks a trillion!!

Anonymous

Thanks a lot... This is indeed great documentation... helped me a lot!

Sample code won't compile! ;-)

thundt

The example in Listing 1 has an error: It is missing the </author> tag.


Red-3

I had to laugh (out loud and for a very long time) when I read this statement half-way through this article:

"General Guidelines for Writing Content

If you are an author, it is okay to skim or skip the above technical explanations..."

The author then goes on to give advice on how to write a good, well structured document. I would hate to be an author, having gone through all the technical explanations, only to read that it was okay to "skim or skip" the stuff I had already trawled through for half an hour!
Surely a statement like this would have been much more useful at the start of the document...

Oh the mindset of the developer - details first, usability second. ;)
Nice one!

Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

Does anyone know a good DocBook editor? As far as I am concerned, it is not easy to write a big document in vim :)

The ones I know:
Conglomerate: a GNOME XML (DocBook) editor
Butterfly: a Java XML editor
Jaxe: another Java XML editor

The first is the best...
Any others?

Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

XML Mind is a nice and easy WYSIWYG editor.

Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

XML is Extensible Markup Language, not Extended Markup Language.


Need for a stylesheet catalogue

Anonymous

Thanks for the great howto.

I have used DocBook a little and I always get annoyed with the rather plain results of the default stylesheets. I see DocBook-written material in books and on web sites that looks good, but I do not have the time to learn all the stylesheet stuff to set up my own.

Is there a catalogue of stylesheets/CSS for DocBook somewhere? If not, I think it would be a good idea. I think that more people (myself included) would make more use of DocBook if it were easier to get nice-looking final results.


Re: Need for a stylesheet catalogue

Anonymous

Whilst I generally don't get on with LaTeX too well, I do find that it produces great looking output. Therefore, I tend to convert DocBook files to LaTeX (you can get XSL files which will do this) and then use either latex itself or pdflatex to convert to a printable format. For the XSL files, start at http://db2latex.sourceforge.net/

Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

Does anyone know of an XML format for storing or creating exams? I am interested in online and paper tests, and I am beginning to believe that XML would be a natural format to store tests in. Is there already a standard defined for this usage?

Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

For assessment/test information in XML, you might be interested in the "IMS Question & Test Interoperability Specification" from the IMS Global Learning Consortium.


Re: XML & DocBook: Structured Technical Documentation Authoring

Anonymous

This might be one of those situations where creating your own DTD for a test would be applicable. Then write your exams based on that DTD. I am pretty sure you can then use the same xml/xsl tools to generate your .html, .ps, .pdf, etc. files.

send me a linux project

kuldeep

Please send me a Linux project about any new topic.