LaTeX Equations and Graphics in PHP
It's safe to say that the world of Weblogs and wiki Web sites are here to stay. Although such systems are great for journals, general text posting and even photography, their limitations become apparent when working in environments that require the use of features more advanced than simple text entry and images. In particular, technical Weblogs need support for graphs, mathematical expressions, diagrams and more. Such functionality is difficult, if not impossible, to implement with HTML alone.
Using external applications such as dia, xfig and Microsoft Equation Editor is equally difficult, as the poster first must create the figure or mathematical equation and then upload an image representation to a Web site. Moreover, if other posters in a collaborative Weblog want to modify the figure, they also must possess the application as well as the original file that created the image. Obviously, this sort of system has its share of complications, and it fragments the overall quality of figures and equations for a site.
In this article, I demonstrate the use of LaTeX, a typesetting tool and language designed specifically for technical document preparation, from within PHP to address these demands. I call LaTeX from within PHP when HTML is not sufficient to address these complex needs and then render the result uniformly as a PNG image, a format all modern browsers support. Because the software is available entirely on the server, all posters and users have access to the same set of tools and packages for publication.
Why Not MathML?
According to the W3C, MathML is a low-level XML specification for describing mathematics. Although MathML is human-readable, in all but the simplest cases, authors need to use equation editors or other utilities to generate XML code for them. Moreover, modern browsers support only a limited subset of the MathML language, and even then, many of these browsers require external plugins to support MathML. Although the future is quite promising for this language, as of now, it essentially is unsupported and unusable.
To complicate matters further, Leslie Lamport's LaTeX typesetting system has become the de facto standard for the production of technical and scientific documentation. Based on Donald Knuth's TeX document layout system from the early 1970s, LaTeX has been around since 1994 and is a mature and well-understood technical documentation preparation platform with a committed user base. That's not to say that learning LaTeX is a walk in the park. It certainly isn't, but as of now, MathML does not provide compelling evidence to warrant a transition from this already-established system.
Following the UNIX philosophy to “write programs to work together”, I use a composition of common tools available for the Linux platform and chain them together to produce a PNG-equivalent rendering of the LaTeX source. Specifically, you need a recent version of LaTeX with dvips and the ImageMagick toolkit. You are going to use the convert utility from the ImageMagick tools to convert your result into a PNG image. Luckily, most hosting providers that provide shell access already have these utilities available.
The rendering system takes a string of text and extracts segments enclosed in [tex] and [/tex] pairs for future substitution. These extracted segments are called thunks. If a thunk previously has been processed, meaning an image representation of the thunk code already is available, the thunk is replaced with a URL to that image. If the thunk is new, it is passed to the LaTeX typesetter, which outputs its result as a DVI file. The DVI file then is converted to a PNG image with ImageMagick and placed into the cache directory. A URL of the newly created image is substituted for the thunk in the original text. When all thunks have been processed, the resulting text is returned to the caller. The process for converting a single thunk is illustrated in Figure 1.
I think it is best to start top-down and first look at how to invoke the rendering process, without discussing implementation specifications. The driver is simply an HTML front end that provides a mechanism for testing the LaTeX rendering system. It allows you to see how the render class should be invoked. To get you started, I've provided the basic template shown in Listing 1.
Practical books for the most technical people on the planet. Newly available books include:
- Agile Product Development by Ted Schmidt
- Improve Business Processes with an Enterprise Job Scheduler by Mike Diehl
- Finding Your Way: Mapping Your Network to Improve Manageability by Bill Childers
- DIY Commerce Site by Reven Lerner
Plus many more.
- Non-Linux FOSS: Snk
- Building a Multisourced Infrastructure Using OpenVPN
- diff -u: What's New in Kernel Development
- Server Hardening
- 22 Years of Linux Journal on One DVD - Now Available
- Giving Silos Their Due
- Controversy at the Linux Foundation
- Don't Burn Your Android Yet
- What's New in 3D Printing, Part III: the Software