ATF Jubilee Edition
In the last few years, advanced web development has become a specialty of its own, requiring that programmers learn something about administering systems, networks and databases, while keeping in mind good programming and security practices.
Three current trends in the world of web development are beginning to improve things dramatically for users as well as for developers. A system that combines all three is often called an “application server”.
The first trend is architectural, moving away from single-shot CGI programs and toward programs that are cached within the web server or another environment. The only reason we use CGI programs for creating dynamic content is that the web server itself cannot create our custom HTML files on the fly. We could theoretically write a new module for Apache in C and compile that into our configuration, but that is far too much work in most cases, and the time saved is not worth the effort.
However, there is a middle ground between putting the custom code inside of the server and leaving it completely outside. What if we put an entire programming language inside of the server, making it possible for us to add new functionality in that language? If the language is interpreted, then we can modify and debug our new functionality without having to recompile or restart the server.
This is the idea behind mod_perl, which embeds a copy of Perl inside of Apache. It gives us a Perl-language interface to Apache's internals, making it possible to access and modify anything having to do with the request object. Everything that a C-language module can do for Apache can also be done inside of mod_perl, from creating custom response handlers to changing the way in which authentication is performed.
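The idea of hooking custom code into different stages of a request can be sketched in a few lines of Python. This is only a toy model: `ToyServer`, `Request`, the phase names and both handlers are hypothetical names invented for illustration, and none of them correspond to the real Apache or mod_perl API.

```python
class Request:
    """Hypothetical stand-in for the server's request object."""

    def __init__(self, path, user=None):
        self.path = path
        self.user = user
        self.status = 200
        self.headers = {}
        self.body = ""


class ToyServer:
    """Runs registered handlers phase by phase, loosely like an embedded-language module."""

    PHASES = ("auth", "response")  # assumed phase names, for illustration only

    def __init__(self):
        self.handlers = {phase: [] for phase in self.PHASES}

    def register(self, phase, handler):
        self.handlers[phase].append(handler)

    def handle(self, request):
        for phase in self.PHASES:
            for handler in self.handlers[phase]:
                # A handler returning False stops all further processing.
                if handler(request) is False:
                    return request
        return request


def require_login(request):
    """Custom authentication: reject requests that carry no user name."""
    if request.user is None:
        request.status = 401
        request.body = "Authentication required"
        return False


def greet(request):
    """Custom response handler: inspect and modify the request object."""
    request.headers["Content-Type"] = "text/plain"
    request.body = "Hello, %s, you asked for %s" % (request.user, request.path)
```

Registering `require_login` for the "auth" phase and `greet` for the "response" phase means that an anonymous request is stopped with status 401 before `greet` ever runs, while a request with a user name receives the greeting.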
In stark contrast with CGI programs, where Perl compiles the program once, executes it once and exits, mod_perl caches a compiled version of the program and then executes that repeatedly. (Because the interpreter and any global data persist from one request to the next, this can sometimes cause extreme memory growth and requires that programmers be especially careful.)
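The difference between the two execution models can be illustrated with a rough Python analogy. This is not how mod_perl is implemented; `SCRIPT_SOURCE`, `run_cgi_style`, `CachedRunner` and `stats` are all hypothetical names, and Python's built-in `compile` and `exec` stand in for the Perl interpreter.

```python
# A tiny "program" that records each request in a list and builds a response.
SCRIPT_SOURCE = '''
hits.append(request)
response = "Hello, %s (request #%d)" % (request, len(hits))
'''

stats = {"compiles": 0}  # counts how often the source is compiled


def compile_script(source):
    stats["compiles"] += 1
    return compile(source, "<script>", "exec")


def run_cgi_style(request):
    """CGI model: compile, run once in a fresh namespace, throw everything away."""
    code = compile_script(SCRIPT_SOURCE)
    namespace = {"request": request, "hits": []}
    exec(code, namespace)
    return namespace["response"]


class CachedRunner:
    """Cached model: compile once, then reuse the code and its namespace."""

    def __init__(self, source):
        self.code = compile_script(source)
        self.namespace = {"hits": []}  # survives between requests

    def run(self, request):
        self.namespace["request"] = request
        exec(self.code, self.namespace)
        return self.namespace["response"]
```

Each call to `run_cgi_style` compiles the source again and always reports request #1, because nothing survives the call. A `CachedRunner` compiles once, but its `hits` list keeps growing with every request, which is the kind of unintended persistence that leads to memory growth.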
While mod_perl was once the only embedded language module for Apache, others have come along recently. mod_snake does for Python what mod_perl does for Perl, making it possible to write custom Apache handlers in Python. There is even a mod_tcl, which provides embedded Tcl inside of Apache, although I am not aware of any sites that are using its capabilities.
Another open-source web server, AOLserver, has long contained an embedded Tcl interpreter. Tcl procedures can thus be used to create dynamic output, connect to a relational database and make code conditional, all within the server itself, without having to go to an external CGI program.
If you would prefer to use Python over Tcl, a beta version of PyWX (Python Web Extensions) recently became available. PyWX provides a Python API to all of the Tcl functions that AOLserver normally provides. While this makes PyWX incompatible with most of the Tcl code available for AOLserver, it does make it easier to perform certain functions, given the wealth of Python modules available on the Web.
The second trend involves embedding code inside of HTML. Microsoft's Active Server Pages are perhaps the best example of such a practice, but there are plenty of other ones as well. On Linux, we can choose from a variety of different systems, including Java Server Pages (JSPs), HTML::Mason (which works with mod_perl), PHP and ADP.
I have worked with Java off and on since it was first introduced, and was long ago convinced that it would be nice to spend some time working with the language. Like many other people, I was turned off by the idea of applets, which were slow, insecure and buggy. However, in recent years, server-side Java has become increasingly prevalent. Each Java “servlet” is a class that runs inside of a Java Virtual Machine (JVM). Servlets can accomplish all of the things that we might want to do when generating dynamic content—they can talk to databases with JDBC, they can retrieve and modify HTTP headers and they can produce responses whose content depends on the user's preferences.
JSPs make it easier to work with servlets by assuming that everything is literal HTML except for what is contained within <% and %>. When a JSP is invoked from a web browser, the JSP is compiled on the fly into a Java servlet, which is in turn compiled into a Java .class file. This .class file is loaded into the servlet engine, executed and kept around for future invocations. JSPs and servlets can use Java “beans”, objects that can be used to model persistent behavior and to implement the “business logic” that sits in the middle of most modern three-tiered web applications.
mod_perl is a very powerful tool for creating Apache handlers, but it can sometimes force you to work at too low a level. For this reason, a fair number of Perl modules exist that allow you to mix Perl code and HTML in some way or another. HTML::Mason, which I profiled in a series of articles earlier this year, is the system that I prefer because of its simple syntax and the way it allows templates to incorporate one another. While at YAPC::Europe in London this fall, I saw a demonstration of the Template Toolkit, which seems to be similar to HTML::Mason in its philosophy, except that it adds the notion of “plug-ins”.
While Java and Perl are general-purpose programming languages that are well-equipped for server-side web programming, PHP is a language designed expressly for creating dynamic web pages. PHP includes a large number of functions for working with a variety of different kinds of files, databases and Internet standards. Recent versions of PHP even allow you to work with Java objects, and a CORBA adaptor is expected to be released in the near future. At the same time, PHP requires recompilation every time you change the included feature set; there is no notion of dynamically adding or deleting modules from the system. If you install PHP before deciding that you want to work with PDF files, you may find yourself recompiling it simply to add such features.
Users of AOLserver have a similar system at their disposal, known as ADP (“AOLserver Dynamic Pages”). An ADP page allows you to mix Tcl with HTML, where the Tcl can use any of a number of special procedures that are defined within AOLserver. You can thus create an ADP page that retrieves information from a database, interprets the contents of an HTML page returned from another server or simply performs calculations based on a user's HTML form inputs.