autoSql and autoXml: Code Generators from the Genome Project

These tools have saved us from the drudgery of writing tens of thousands of lines of repetitive code, and we hope you find them useful.

autoXml Extensions and Limits

autoXml extends the type field of ATTLIST to include INT or FLOAT for numerical rather than string values. Similarly, you can use #INT or #FLOAT in place of #PCDATA to put a numerical type in the text field. If you include these extensions, please use the .dtdx suffix rather than .dtd on your DTD file.
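
As a rough illustration of the two extensions just described, a .dtdx file might contain declarations like the following. This is only a sketch: the file name shapes.dtdx and the element and attribute names (POINT, COUNT, AREA, x, y) are invented for this example and are not taken from the article's listings.

```dtd
<!-- Hypothetical shapes.dtdx: all names here are illustrative only -->
<!ELEMENT POINT (#PCDATA)>
<!-- INT or FLOAT in the ATTLIST type field gives a numeric attribute -->
<!ATTLIST POINT x FLOAT #REQUIRED>
<!ATTLIST POINT y FLOAT #REQUIRED>
<!-- #INT or #FLOAT in place of #PCDATA gives numeric element text -->
<!ELEMENT COUNT (#INT)>
<!ELEMENT AREA (#FLOAT)>
```

Because INT, FLOAT, #INT and #FLOAT are not standard DTD keywords, a conventional validating parser would be expected to reject such a file, which is one reason to keep the separate .dtdx suffix.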

Currently, autoXml only copes with DTD comments if they start on a line by themselves. autoXml expects each ELEMENT and ATTLIST declaration to fit on a single line. It doesn't handle reference data types beyond saving the reference ID as a string.
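
If you want to catch these layout problems before running autoXml, a small pre-check can help. The following is a minimal sketch, not part of autoXml itself: the function names (declSpillsOver, commentNotAtLineStart, dtdxCountProblems) are hypothetical, and it assumes that leading white space before a comment is acceptable, which the limits above do not spell out.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if an <!ELEMENT or <!ATTLIST declaration opens on this line
 * but is not closed by '>' on the same line. */
int declSpillsOver(const char *line)
{
    const char *start = strstr(line, "<!ELEMENT");
    if (start == NULL)
        start = strstr(line, "<!ATTLIST");
    if (start == NULL)
        return 0;
    return strchr(start, '>') == NULL;
}

/* Return 1 if a comment opens on this line after non-blank text.
 * Assumption: leading white space before the comment is allowed. */
int commentNotAtLineStart(const char *line)
{
    const char *open = strstr(line, "<!--");
    const char *p;
    if (open == NULL)
        return 0;
    for (p = line; p < open; ++p)
        if (!isspace((unsigned char)*p))
            return 1;
    return 0;
}

/* Count the lines in a NUL-terminated DTD text that break either rule.
 * Lines longer than the local buffer are truncated before checking. */
int dtdxCountProblems(const char *text)
{
    int problems = 0;
    char line[1024];
    assert(text != NULL);
    while (*text != '\0') {
        size_t len = strcspn(text, "\n");
        size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        memcpy(line, text, copy);
        line[copy] = '\0';
        if (declSpillsOver(line) || commentNotAtLineStart(line))
            ++problems;
        text += len;
        if (*text == '\n')
            ++text;
    }
    return problems;
}
```

Read your .dtdx file into a string and pass it to dtdxCountProblems(); a nonzero result means at least one line has a declaration that runs onto the next line or a comment that follows other text.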

Listing 3. autoXml Code Generation

Refer to Listing 3 for a complete example of the source code autoXml generates. In addition to the .h file shown in Listing 3, autoXml generates a corresponding .c file. Each XML file must have a root object; in this case the root object is POLYGON (our DTD, as is, won't let us have more than one polygon per file). You can read an XML file that respects this DTD using the polyPolygonLoad() function and save it back out using the polyPolygonSave() function.

autoSql and autoXml work well on a range of data, as you've seen, anywhere from an address book to gene tracks. We hope you'll find these tools useful on your own projects.


Jim Kent, PhD, and his work on the Human Genome Project have been profiled in the New York Times, the San Francisco Chronicle, Software Development magazine and other publications. He is currently working on cross-species genomic comparisons and Parasol, a job controller for his kilocluster.

Heidi Brumbaugh has been a writer and editor in the computer publishing industry since the late eighties. Visit links to her projects and read some of her fiction at

