Embperl: Modern Templates

Mr. Lerner introduces us to a template system for Perl: what it is, how it works and how to use it.

Earlier this year, I described mod_perl, a module for the Apache web server that embeds a full version of Perl inside Apache. Not only does this let you write CGI-style programs that avoid CGI's performance bottlenecks, but it also gives you access to Apache's innards, letting you configure your server in many new ways. A number of developers have begun to take advantage of this flexibility, extending Apache in clever ways.

One such clever idea is Embperl, written by Gerald Richter (richter@dev.ecos.de). Embperl allows you to create hybrid pages of HTML and Perl. As we have seen in several previous columns, templates allow designers and programmers to modify their respective parts of a web site without getting in each other's way. If the programmer wants to modify the logic, he or she can do so by modifying the Perl parts of a template. By the same token, designers can modify the look and feel of a page without having to ask the programmer to change a few print statements in a CGI program.

Embperl is but one of several template systems available for mod_perl. Another contender is ePerl, about which I have read quite a bit but which I haven't yet had a chance to try. A further option, which uses Perl but doesn't depend on mod_perl or Apache, is Text::Template, a module I have used in previous columns when discussing templates. Finally, PHP is an embedded scripting language that resembles C and Perl in many ways and is designed to be interspersed with HTML inside documents. For more information about all of these, including URLs, see Resources.

How does Embperl work?

Before we can use Embperl, it's important to understand how HTTP requests and responses are formed, and how a web server performs its job. When you click on a web page link, your browser connects to the host name in the URL and sends a short request to the server. The request consists of a verb (typically GET or POST), the name of the document being requested, and the version of HTTP that the browser supports. For example, to request the root document from a web server, a browser will typically send

GET / HTTP/1.0

to the server. It is the server's responsibility to handle the request, responding with an error message or a document. Depending on which version of HTTP the browser is using, the server might return multiple documents over the same connection, demand some sort of user authentication before continuing, or redirect the user's browser to a different URL.
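The three parts of the request line described above can be pulled apart with a few lines of code. The following is a minimal sketch in Python rather than Perl, offered only as an illustration; the names RequestLine and parse_request_line are hypothetical, and a real server checks far more than this.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # the verb, typically GET or POST
    target: str   # the name of the document being requested
    version: str  # the version of HTTP the browser supports


def parse_request_line(line: str) -> RequestLine:
    """Split a request line such as 'GET / HTTP/1.0' into its three parts."""
    parts = line.strip().split()
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version!r}")
    return RequestLine(method, target, version)
```

Calling parse_request_line("GET / HTTP/1.0") yields a tuple whose fields are the verb GET, the document / and the version HTTP/1.0.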

In many cases, though, the server will not return a document at all. Instead, it will run a program, returning the program's output, rather than its contents. This is how CGI programs work: the server is configured such that all files in a certain directory are treated as programs, rather than documents to be retrieved verbatim. (Indeed, security concerns arise when users can retrieve programs' contents, rather than seeing their output.) As far as the browser is concerned, it requested a document and received one in response. The magic happens on the server side, where the program is executed and produces its output.
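The difference between returning a file's contents and returning a program's output can be made concrete with a small sketch. It is written in Python purely for illustration, and the function names serve_static, serve_cgi and demo are hypothetical. A real CGI implementation also passes details of the request to the program through environment variables, which this sketch leaves out.

```python
import os
import subprocess
import sys
import tempfile
from typing import Tuple


def serve_static(path: str) -> bytes:
    """Return the file's contents verbatim, as for an ordinary document."""
    with open(path, "rb") as f:
        return f.read()


def serve_cgi(path: str) -> bytes:
    """Run the file as a program in a new process and return its output."""
    result = subprocess.run(
        [sys.executable, path],   # start a fresh interpreter for this request
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def demo() -> Tuple[bytes, bytes]:
    """Write a tiny program to a temporary file and serve it both ways."""
    source = 'print("Content-type: text/html\\n")\nprint("<p>Hello</p>")\n'
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.py")
        with open(path, "w") as f:
            f.write(source)
        return serve_static(path), serve_cgi(path)
```

The first value returned by demo() is the program's source text, which is what a misconfigured server would leak; the second is what the browser is meant to see.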

A price is paid for CGI programs, above and beyond their execution times: because web servers fork a separate process for each CGI program, and Perl (and other popular scripting languages) can have a long start-up time, it often takes longer for the program to get started than for it to actually run.

For this reason, many web servers have developed their own native APIs that allow programs to bind more closely to the server's internal code than would be possible with CGI. Netscape's NSAPI and Microsoft's ISAPI are two examples of such proprietary systems, and Apache's mod_perl is an example of how similar functionality can be given to Perl programmers. With mod_perl installed in your server, operations speed up tremendously, because the server compiles the program once, rather than each time it is run. In addition, because running the program does not require creating a separate process, the overhead associated with executing such programs is relatively low.
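The idea of compiling a program once and then running it many times inside the same process can be sketched as follows. This is an analogy in Python, not a description of how mod_perl is implemented; the class name ScriptCache and its methods are hypothetical.

```python
from types import CodeType
from typing import Any, Dict


class ScriptCache:
    """Keep compiled programs in memory so each one is compiled only once."""

    def __init__(self) -> None:
        self.compiled: Dict[str, CodeType] = {}
        self.compile_count = 0  # how many times we actually had to compile

    def run(self, name: str, source: str, env: Dict[str, Any]) -> Dict[str, Any]:
        """Run the program called `name`, compiling it only on first use."""
        code = self.compiled.get(name)
        if code is None:
            code = compile(source, name, "exec")
            self.compiled[name] = code
            self.compile_count += 1
        namespace = dict(env)   # fresh variables for every request
        exec(code, namespace)   # runs in this process; nothing is forked
        return namespace
```

Running the same program twice with different inputs leaves compile_count at 1, since the second call reuses the stored code object instead of compiling the source again.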

The mod_perl module is perhaps best known for allowing programmers to write very fast CGI-like programs. However, since Apache's internals are available via mod_perl, it is possible to write Perl programs that change one or more steps in Apache's processing of outgoing documents. These can range from the mundane to the fancy; in Embperl's case, we are setting a special PerlHandler for particular documents. In the Apache world, a “handler” is a program that does something special with the files in a directory before returning them to the HTTP client. You can think of a handler as a middleman between Apache and the file; the handler grabs the file and modifies it as necessary, handing the finished product to Apache. Apache then takes this finished product and returns it to the user's browser in the HTTP response.
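The middleman role of a handler can be sketched in a few lines. The code below is an illustration in Python, not Apache's or Embperl's actual interface: the function names are hypothetical, and the {{ name }} placeholder syntax is invented for this example and is not Embperl's own notation.

```python
import os
import re
import tempfile
from typing import Dict

# Hypothetical placeholder syntax for this sketch: {{ name }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def template_handler(path: str, values: Dict[str, str]) -> str:
    """Grab the file, fill in its placeholders and return the finished page."""
    with open(path, encoding="utf-8") as f:
        raw = f.read()
    # Unknown placeholders are left untouched rather than removed.
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), raw)


def respond(path: str, values: Dict[str, str]) -> str:
    """Stand-in for the server: take the handler's result and send it on."""
    body = template_handler(path, values)
    return "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n" + body


def demo() -> str:
    """Write a small template to a temporary file and serve it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.html")
        with open(path, "w", encoding="utf-8") as f:
            f.write("<h1>Hello, {{ name }}!</h1>")
        return respond(path, {"name": "World"})
```

Here template_handler plays the handler and respond plays the server: the server never looks at the raw file, only at what the handler hands back.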