Keeping Programs Trim with CGI_Lite

This article is an introduction to a second CGI module, CGI_Lite, that is not as feature rich as CGI but is much smaller in size and therefore more efficient for use in small-scale projects.
Getting the Query String

While GET can be used to send name,value pairs, it is often used to send simple text strings. For example, it might be nice to send a name without assigning any value, as in:

http://www.fictional.edu/cgi-bin/hello.pl?J%20Edgar

This technique is often useful when CGI programs have to receive a user's unique ID in a relational database running on the web server. If we send the identifier as part of the query string, the program can grab that value and use it as the index into a table in the database, producing a personalized home page or otherwise unique output customized for the user.

Several on-line booksellers use this method. When I go to Amazon.com to check the status of my latest order, I go to a URL that looks like:

https://www.mybookstore.com/cgi-bin/order.pl?1234-5678-9012

What I would like is a simple way of retrieving this string. CGI.pm allows you to get the string by pretending that the contents of the query string are assigned to the variable named keywords, so if we are using CGI.pm, we can type:

my $id_number = $query->param("keywords");
Now, the variable $id_number contains the value “1234-5678-9012”.

If we are using CGI_Lite.pm, things get a bit more complicated, because the module expects the query string to only be used for sending name,value pairs, not individual text strings. Thus, when we send the above query string, CGI_Lite.pm assumes that what we are actually setting the form element named “1234-5678-9012” to a null value—not quite what we might expect, but something which we can manage.

One possible method is to load parse_form_data to turn the received name,value pairs into a hash. The hash contains a single key, corresponding to the data that was passed in the query string, which CGI_Lite.pm thinks is a variable name. We can then retrieve that key by getting the list of keys in our hash. Listing 4 is code that accomplishes that feat.

Listing 4. Using Hash Keys

This is not the most efficient way to get the information, but it does do the trick. We could simply read the information from the QUERY_STRING environment variable—but that would introduce another problem, namely, the translation of characters sent in percent-hex encoding. By using the built-in facilities of CGI_Lite.pm, we ensure that the translation is done correctly.

If you find this somewhat confusing, you're not alone. Many of my own programs take advantage of the query string, and having to pretend that my data is really a variable name strikes me as a bit odd. Perhaps a future version of CGI_Lite.pm will handle this, although adding too many features would eventually turn it into CGI_NoLongerLite.pm.

Debugging

Debugging CGI programs is often difficult because the execution takes place behind the scenes. In contrast with more typical programs, which allow us to interact with them while they are running, CGI programs are invoked by Web servers, with input coming from the user's Web browser (via the Web server, which hands that data to the program), and with output returning to the user's browser (again, via the Web server).

CGI.pm offers two good aids to debugging CGI programs. A dump method that prints out the contents of all HTML form variables as they are received, and a command-line interface that allows programmers to enter variable assignments without invoking the program from an HTML form.

In keeping with its light-weight philosophy, CGI_Lite.pm does not offer the command-line debugging interface, which might make debugging large programs difficult. However, it does offer a way to check the data that was received from the user's web browser. The print_form_data method sends all of the known name-value pairs to stdout. If your program does not work correctly and you want to check the values of the input data, you can add the following line to your program:

$query->print_form_data;
Which Should You Use?

With the above discussion in mind, which module should you use when writing your CGI programs? In most cases, I would tend to stick with CGI.pm, for a variety of reasons.

First of all, I tend to use CGI.pm's command-line debugging interface quite a bit, and the fact that CGI_Lite.pm lacks such an ability is a major hindrance for me. It is certainly possible to get around this problem, since I wrote CGI programs for a while before CGI.pm appeared on the scene, but it never hurts to have another debugging tool in your arsenal, particularly when writing large, complicated programs.

A second reason why I would tend to favor CGI.pm is because I often have to work with other people on projects, and using two different interfaces to the CGI standard might make life difficult for them. (We have enough problems as is; we don't also need to try to remember whether we should be using param or parse_form_data in order to retrieve information.)

Third, I find it useful to have extra functions that take care of the small parts of producing HTML. I used to constantly forget to put two newline (\n) characters at the end of MIME headers; with CGI.pm, I no longer have to remember.

At the same time, I find it somewhat irresponsible to write small CGI programs that use over 1MB of RAM before they even begin to perform any calculations or allocate any data structures. For small projects in which I want to use Perl (rather than a compiled language, such as C) but in which I want to maximize efficiency, I use CGI_Lite.pm. I also like the use of hashes, which strikes me as a natural way to store and retrieve form elements. Also, the fact that CGI_Lite.pm does almost everything I wish, including such advanced items as HTTP cookies and the uploading of files, makes it rather attractive for small-scale projects.

In an era of software bloat and programs that try to do more and more, it is refreshing to find a module that tries to do less and does it well. CGI_Lite.pm is not appropriate in all cases, but it is useful, well documented and efficient. If you are trying to squeeze the last few ounces of memory and CPU time from your web server, consider using CGI_Lite.pm in your next program—and enjoy the extra RAM for other projects.

Reuven M. Lerner is an Internet and Web consultant living in Haifa, Israel, who has been using the Web since early 1993. In his spare time, he cooks, reads and volunteers with educational projects in his community. You can reach him at reuven@netvision.net.il.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix