FastCGI: Persistent Applications for Your Web Server

FastCGI allows Apache to run and manage persistent CGI-like scripts, overcoming CGI's worst shortcomings.

Writing Scripts

In many ways, writing FastCGI scripts is not very different from traditional CGI programming. You must specify a Content-type (typically, “text/html”) if you're providing content. You can use Location and Status to specify redirects or other HTTP messages. Also, you have normal access to the %ENV hash.

From within scripts, STDIN and STDOUT can be accessed, but only in standard ways. The FastCGI library manipulates those data streams quite heavily; you can print without trouble, but more advanced operations will fail. You can't, for example, send a reference to a typeglob (a symbol table entry) of STDOUT (\*STDOUT) to a forked process. In fact, FastCGI is fairly scornful of forking, and I haven't heard any reports at all from anyone trying to run it on a thread-enabled version of Perl 5.005.

The main difference, structurally speaking, between CGI and FastCGI scripts is that the main body of code is placed within a while loop, one which hopefully never ends. The basic structure of a FastCGI script is pretty much the same regardless of its task:

  1. Initialize variables and connections to databases, daemons, etc.

  2. Do the loop.

  3. Provide for cleanup so you can exit gracefully when needed.
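The three steps above might be sketched as follows. This is a minimal sketch that deliberately does not load FCGI.pm, so it runs on its own: the $accept code reference and the @pending queue are hypothetical stand-ins for the library's accept call. With FCGI.pm's classic interface, the loop condition would typically read FCGI::accept() >= 0 instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# 1. Initialize: anything set up here persists across requests.
my $served = 0;
my @log;

# Hypothetical stand-in for the FastCGI accept call: it "accepts"
# three requests, then reports failure with a negative value.
my @pending = ('page1.html', 'page2.html', 'page3.html');
my $current;
my $accept = sub {
    return -1 unless @pending;
    $current = shift @pending;
    return 0;
};

# 2. The loop: one iteration per request.
REQUEST:
while ($accept->() >= 0) {
    $served++;
    push @log, "request $served: $current";
}

# 3. Cleanup: reached only once accept fails.
print "$_\n" for @log;
print "served $served requests, exiting\n";
```

Note that $served keeps counting across iterations; in a real FastCGI process, the same would be true of any variable declared before the loop.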

Although FastCGI will force few substantive changes in your code, it will likely change your perspective on what makes a good script. Some of the lessons I've learned while developing FastCGI applications are:

  • Think clean. Typical CGI scripts don't need to be excessively concerned with memory leaks or sloppy variable scoping. FastCGI scripts, since they're persistent, have to keep a tighter rein on things.

  • Think big. We're used to thinking of CGI scripts as fast one-timers that should define the fewest functions necessary to get the job done. With FastCGI, it's usually better to have lots of functionality in one script; you have easier access to shared data and fewer PIDs littering your process table. I try to use the main script (the one specified in httpd.conf) as a distribution center, jobbing out all the real work to modules. Doing so makes it easy to extend the main script's functionality with just an extra line or two of code; all your tweaking can be done on the module.

  • Think long-term. You want your process to keep running, so it's wise to not let your script die() or croak(). Catch the return value of any statement whose failure might prove fatal (such as open()) and rely on error messages and flow control to keep the loop running.
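As a rough illustration of the last point, the sketch below catches the return value of open() and hands back an error message instead of dying. The read_document helper and the file paths are hypothetical and are not taken from the listings.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return a reference to the file's lines, or
# undef plus an error message, instead of calling die().
sub read_document {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "cannot open $path: $!");
    my @lines = <$fh>;
    close($fh);
    return (\@lines, undef);
}

# A temporary file stands in for a real document.
my ($tmp, $good_path) = tempfile(UNLINK => 1);
print $tmp "<html>\n</html>\n";
close($tmp);

for my $path ('/nonexistent/page.html', $good_path) {
    my ($lines, $error) = read_document($path);
    if (!defined $lines) {
        print "error: $error\n";   # report and carry on with the next request
        next;
    }
    print "read ", scalar(@$lines), " lines\n";
}
```

Inside a request loop, the same pattern lets a bad request produce an error page while the process stays alive for the next one.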

Ad Rotation Made Easy

Webmasters of commercial sites hate to admit it, but getting advertisements on-line is an increasingly unavoidable fact of the job. If you have multiple sponsors in a rotation, or if your sponsors each have multiple ads, there's no way to hardcode the ad into a page stored on disk. Of course, this is true for any information likely to be presented on a rotating basis: news, current specials or random links.

The rotate.fcg script shown in Listing 1 provides a bare-bones approach to meeting that need. It provides a persistent array of ad information that can be inserted wherever you choose on any disk-based document. It also allows the ad array to be updated without having to re-start the script (although this technique won't work if you're running multiple instances of the script).

Based on the Apache configuration shown in the Apache sidebar, the URL to invoke the script is http://www.yoursite.com/fastcgi-bin/rotate.fcg?page.html, where “page.html” is the name of a document into which you'd like to insert an ad. page.html can contain one or more instances of an HTML comment that serves as a placeholder for the ad:

<!-- Ad Here -->

Using an HTML comment in this capacity means that the document will display correctly, even if you have no ad to put there yet.

The script's opening section scopes and initializes all variables to be used for the life of the process. Three things are worthy of note in this section. First, since we initialize @ads outside the loop, it persists for the life of the script. Second, we need to initialize the %ENV hash ourselves, lest we find it empty later on down the line. Third, we set $| to a non-zero value, which makes Perl flush STDOUT after every print instead of buffering output, so each response goes out as soon as it is written.

Right before the script enters the main loop, it initializes the array of ads by calling the initialize routine. This routine reads a text file of the sort shown in Listing 2. The data for each sponsor are temporarily put into the %sponsor hash, formatted into HTML and pushed into the @ads array. If the text file can't be opened, the routine returns an empty array, allowing the script to run anyway.
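A routine along those lines might look like the sketch below. Listing 2 is not reproduced here, so the record format is an assumption made purely for illustration: "key: value" lines, with a blank line between sponsors. The field names (name, url, image) and the generated HTML are likewise hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumed record format (not the article's Listing 2):
# "key: value" lines, one blank line between sponsors.
sub initialize {
    my ($file) = @_;
    my @ads;
    open(my $fh, '<', $file) or return @ads;   # empty list keeps the script running
    local $/ = '';                              # paragraph mode: one record per read
    while (my $record = <$fh>) {
        # Collect the key/value pairs of this record into a hash.
        my %sponsor = $record =~ /^(\w+):[ \t]*(.*?)[ \t]*$/mg;
        next unless $sponsor{url} && $sponsor{image};
        my $alt = defined $sponsor{name} ? $sponsor{name} : '';
        push @ads, qq{<a href="$sponsor{url}"><img src="$sponsor{image}" alt="$alt"></a>};
    }
    close($fh);
    return @ads;
}

# Demonstration with a temporary data file holding two sponsors.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "name: Acme\nurl: http://example.com/acme\nimage: /ads/acme.gif\n\n",
           "name: Widgets\nurl: http://example.com/widgets\nimage: /ads/widgets.gif\n";
close($tmp);

my @ads = initialize($path);
print "$_\n" for @ads;

my @none = initialize('/nonexistent/ads.txt');
print scalar(@none), " ads loaded from a missing file\n";
```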

The main action takes place in the loop labeled REQUEST. The while statement is the only place the script interacts explicitly with FCGI.pm. It's also the only substantive difference between a FastCGI script and a traditional one. Regardless of the language you use for FastCGI programming, a loop like this one will be the structure in which you frame the script's main process.

Once in the loop, the first task is to allow the webmaster to re-initialize the ad array on the fly. In the example script, this is accomplished by placing a request to http://www.yoursite.com/fastcgi-bin/rotate.fcg?reload. To provide a little security, the script allows re-initialization only when the request comes from the web server itself. If you're running multiple instances of a script, you'll have to accomplish this by some other means: restarting Apache with kill -USR1, reloading the data file if its timestamp has changed, etc.
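The timestamp approach mentioned above could be sketched roughly as follows. The reload_if_changed helper, the $last_mtime variable and the one-ad-per-line file are all assumptions for illustration; each instance of a script would call such a helper at the top of its own loop.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $last_mtime = 0;   # modification time seen at the last load
my @ads;

# Hypothetical helper: reload @ads only when the file's mtime has changed.
# Returns 1 if a reload happened, 0 otherwise.
sub reload_if_changed {
    my ($file) = @_;
    my $mtime = (stat($file))[9];
    return 0 unless defined $mtime;       # missing file: keep the old ads
    return 0 if $mtime == $last_mtime;    # unchanged since the last load
    open(my $fh, '<', $file) or return 0;
    chomp(@ads = <$fh>);
    close($fh);
    $last_mtime = $mtime;
    return 1;
}

# Demonstration with a temporary file, one ad per line.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "ad one\nad two\n";
close($tmp);

print reload_if_changed($path), "\n";   # first call loads the file
print reload_if_changed($path), "\n";   # nothing has changed since then
my $later = $last_mtime + 60;
utime($later, $later, $path);           # simulate an edit by bumping the mtime
print reload_if_changed($path), "\n";   # the change is picked up
```

Since stat() is cheap compared with re-reading the file, checking on every request is usually tolerable, although that depends on your traffic.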

If you used a script like this to run current news headlines, it would be easy to post new updates to your site several times each day by adding them to the text file and re-initializing the array.

The loop's second task is to make sure that the requested file can be opened. If it can't, the script calls a routine (not included in my example) that would send off a “File Not Found” message. By providing its own error message, the script can recover gracefully from a bad request without having to die off. If it is available, the requested document is assigned to the @doc array.

Third, an ad is pulled off the front of the @ads array, assigned to $ad, then pushed to the back of the array. The script retains a copy of the ad in $ad, even though it's been put back in the array.
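The rotation itself comes down to a shift followed by a push. Here is a minimal sketch, with placeholder ad strings and a hypothetical next_ad wrapper that also copes with an empty array:

```perl
use strict;
use warnings;

my @ads = ('ad A', 'ad B', 'ad C');

# Take the first ad, then put it back at the end of the queue.
sub next_ad {
    return '' unless @ads;   # nothing to rotate
    my $ad = shift @ads;
    push @ads, $ad;
    return $ad;
}

# Four requests in a row wrap around to the first ad again.
my @shown = map { next_ad() } 1 .. 4;
print join(', ', @shown), "\n";   # prints "ad A, ad B, ad C, ad A"
```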

Fourth, the script cycles through the document looking for any instances of <!-- Ad Here -->. When it finds one, it substitutes the $ad for it. If the text file containing the ads is empty or unopenable, or if the requested page has no place for an ad, no substitutions are made.
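The substitution step can be sketched as below. The contents of @doc and $ad are placeholders invented for the example; only the <!-- Ad Here --> marker comes from the article.

```perl
use strict;
use warnings;

my $ad  = '<a href="http://example.com/"><img src="/ads/example.gif" alt="Example"></a>';
my @doc = (
    "<html><body>\n",
    "<!-- Ad Here -->\n",
    "<p>Page text.</p>\n",
    "<!-- Ad Here -->\n",
    "</body></html>\n",
);

my $count = 0;
for my $line (@doc) {
    # With no ad available, leave the placeholder comment untouched.
    next if $ad eq '';
    # s///g returns the number of substitutions made on this line.
    my $n = ($line =~ s/<!-- Ad Here -->/$ad/g);
    $count += $n if $n;
}

print @doc;
print "replaced $count placeholders\n";
```

Because the loop variable aliases each element of @doc, the substitution modifies the array in place, and the document is ready to print as soon as the loop ends.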

Finally, the script prints the appropriate HTTP header, sends off the document and heads back to the front of the loop to wait for the next request.
