CGI Programming

So you're gathering information from your surfers; what now?

This time, we are going to look at one of the most common things that people want their CGI programs to do: save data to files on disk. By the end of the column, we will have accumulated enough tools to produce a simple but functional guest-book program that allows visitors to your site to save comments that others can read.

For starters, let's look at a simple HTML form that will allow users to send data to our CGI program, which we will call “entryform.pl”:

<HTML>
<Head>
<Title>Data entry form</Title>
</Head>
<Body>
<H1>Data entry form</H1>
<Form action="/cgi-bin/entryform.pl"
method=POST>
<P>Name: <input type=text name="name"
value=""></P>
<P>E-mail address: <input type=text
name="email" value=""></P>
<P>Address: <input type=text
name="address" value=""></P>
<P>Country: <input type=text
name="country" value=""></P>
<P>Male <input type=radio name="sex"
value="male">
Female <input type=radio name="sex"
value="female"></P>
<input type=submit>
</Form>
</Body>
</HTML>

Of course, an HTML form won't do anything on its own; it needs a CGI program to accept and process its input. Below is a Perl5 program that, if named “entryform.pl” and placed in the main “/cgi-bin” directory on a web server, should print out the name-value pairs that arrive from the above form:

0    #!/usr/local/bin/perl5
1    # We want to use the CGI module
2    use CGI;
3    # Create a new CGI object
4    my $query = new CGI;
5    # Print an appropriate MIME header
6    print $query->header("text/html");
7    # Print a title for the page
8    print $query->start_html(-title=>"Form contents");
9    # Print all of the name-value pairs
10   print $query->dump();
11   # Finish up the HTML
12   print $query->end_html;

Here's a quick run-down of what each line of code does:

Line 0 tells a Unix box where to find the Perl interpreter. If your copy of Perl is called something else, you need to modify this line.

Without explicitly importing the CGI module in line 2, Perl wouldn't know how to create and use CGI objects. (Trying to use code from a module you haven't imported is guaranteed to confuse Perl and generate error messages.) We then declare $query to be an instance of CGI (line 4).

We then tell the user's browser that our response will be HTML-formatted text, and we do that by using a MIME header. The lack of a MIME header is the most common reason for a 500 error; whenever one of your CGI programs produces one of these, make sure that you aren't trying to print HTML before the header! Note that line 6 is equivalent to saying:

print "Content-type: text/html\n\n";

which also tells the browser to expect text data formatted in HTML. In general, though, I prefer to use the CGI object for readability reasons.

Line 8 creates the basic HTML necessary to begin the document, including giving it the title, “Form contents”.

Line 10 uses the CGI object's built-in facility for “dumping” an HTML form's contents in an easy-to-read format. This allows us to see what value was assigned to each of the elements of the HTML form, which can be invaluable in debugging problematic programs. For now, though, we are just using the CGI “dump” method to get ourselves started and confirm that the program works.
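To make "name-value pairs" a bit more concrete, here is a rough sketch of how a submitted form might look before it is decoded, and how it breaks down into pairs. This is only an illustration and not how the CGI module works internally; the helper name parse_pairs and the sample data are made up for this example, and in a real program you should let the CGI object do this work.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode one URL-encoded form submission into
# a list of [name, value] pairs, preserving the order of arrival.
sub parse_pairs {
    my ($raw) = @_;
    my @pairs;
    foreach my $chunk (split /[&;]/, $raw) {
        next unless length $chunk;
        my ($name, $value) = split /=/, $chunk, 2;
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                              # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # "%40" becomes "@"
        }
        push @pairs, [$name, $value];
    }
    return @pairs;
}

# Made-up sample input, shaped like a submission of the form above
my $sample = "name=Jane+Doe&email=jane%40example.com&country=Israel&sex=female";

foreach my $pair (parse_pairs($sample)) {
    print "$pair->[0] = $pair->[1]\n";
}
```

Each line of output shows one form element's name followed by the value the user supplied, which is the same information that the “dump” method presents as HTML.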

Saving the Data to a File

Now that we have proven that our HTML form is sending data to our CGI program, and that our program can send its output back to the user's web browser, let's see what we can do with that data. For starters, let's try to save the data from the form to a file on disk. (This is one of the most common tasks that clients ask me to implement, usually because they want to collect data about their visitors.)

#!/usr/local/bin/perl5
# We want to use the CGI module
use CGI;
# Set the filename to which we want the elements
# saved
my $filename = "/tmp/formcontents";
# Set the character that will separate fields in
# the file
my $separation_character = "\t";
# Create a new CGI object
my $query = new CGI;
# ----------------------------------------------
# Open the file for appending
open (FILE, ">>$filename") ||
        die "Cannot open \"$filename\"!\n";
# Grab the elements of the HTML form
@names = $query->param;
# Iterate through each element from the form,
# writing each element to $filename. Separate
# elements with $separation_character defined
# above.
foreach $index (0 .. $#names)
{
        # Get the input from the appropriate
        # HTML form element
        $input = $query->param($names[$index]);
        # Remove any instances of
        # $separation_character
        $input =~ s/$separation_character//g;
        # Now add the input to the file
        print FILE $input;
        # Don't print the separation character
        # after the final element
        print FILE $separation_character if
                ($index < $#names);
}
# Print a newline after this user's entry
print FILE "\n";
# Close the file
close (FILE);
# -----------------------------------------------
# Now thank the user for submitting his
# information
# Print an appropriate MIME header
print $query->header("text/html");
# Print a title for the page
print $query->start_html(-title=>"Thank you");
# Print a thank-you message
print "<P>Thank you for submitting the ";
print "form.</P>\n";
print "<P>Your information has been ";
print "saved to disk.</P>\n";
# Finish up the HTML
print $query->end_html;

The above program is virtually identical to the previous one, except that we have added a section that takes each of the HTML form elements and saves them to a file. Each line in the resulting file corresponds to a single press of the HTML form's “submit” button.

The above program separates fields with a TAB character, but we could just as easily have used commas, tildes or the letter “a”. Remember, though, that someone is eventually going to want to use this data, either by importing it into a database or by splitting it apart with Perl or another programming language. To ensure that the user doesn't mess up our database format, we remove any instances of the separation character in the user's input with Perl's substitution (s///) operator. A bit Draconian, but effective!
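As a minimal sketch of why stripping the separator matters, the following stand-alone program cleans a few values, joins them into one line and then splits that line apart again, much as a later program reading the data file might. The helper names clean_value and split_record and the sample values are hypothetical; the \Q and \E around the separator are an extra precaution, so that the code would still work if the separator were a character with a special meaning in regular expressions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $separation_character = "\t";

# Hypothetical helper: strip the separator from one value, as the
# program above does, so a value can never split into two fields.
sub clean_value {
    my ($input) = @_;
    $input = "" unless defined $input;
    $input =~ s/\Q$separation_character\E//g;
    return $input;
}

# Hypothetical helper: break one saved line back into its fields.
sub split_record {
    my ($line) = @_;
    chomp $line;
    # A limit of -1 keeps empty fields at the end of the line
    return split /\Q$separation_character\E/, $line, -1;
}

# Made-up sample values; the second one contains a stray TAB
my @values = ("Jane Doe", "12 Main\tStreet", "");
my $line = join($separation_character,
                map { clean_value($_) } @values) . "\n";

my @back = split_record($line);
print scalar(@back), " fields\n";
print "[$_]\n" foreach @back;
```

Because the stray TAB was removed before the line was written, the line splits back into exactly as many fields as went in.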

One of the biggest problems with the above program is that it depends on the HTML form elements always coming in the same order. That is, if you have elements X, Y and Z on an HTML form, will they be placed in @names in the same order as they appear in the form? In alphabetical order? In random order? To be honest, there isn't any way to be sure, since the CGI specifications are silent on the matter. It's possible, then, that one user's form will be submitted in the order (X, Y, Z), while another's will be submitted as (Y, Z, X)—which could cause problems with our data file, in which fields are identified by their position.

A simple fix is to maintain a list of the fields that we expect to receive from the HTML form. This requires a bit more coordination between the program and the form, but given that the same person often works on both, that's a minor concern.

First, we define a list, @fields, near the top of the program. This list contains the names of all of the fields that we expect to receive, in the order in which we want them written to the file:

my @fields = ("name",
              "email",
              "address",
              "country",
              "sex");

Next, we change the “foreach” loop (which places the field elements in the output file) such that it iterates through the elements of @fields, rather than @names.

foreach $index (0 .. $#fields)
{
  # Get the input from the appropriate HTML form
  # element
  $input = $query->param($fields[$index]);
  # Remove any instances of $separation_character
  $input =~ s/$separation_character//g;
  # Now add the input to the file
  print FILE $input;
  # Don't print the separation character after the
  # final element
  print FILE $separation_character if
        ($index < $#fields);
}
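To see the effect of a fixed field list without involving a web server, here is a small stand-alone sketch. It assumes the submitted values are already in a hash; the helper name format_record and the sample submissions are made up for this example. It also uses join rather than an index loop, which is just another way to put the separator between fields but not after the last one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $separation_character = "\t";
my @fields = ("name", "email", "address", "country", "sex");

# Hypothetical helper: build one line of the data file from a hash
# of submitted values, always in the order given by @fields.
sub format_record {
    my (%submitted) = @_;
    my @cleaned = map {
        my $input = defined $submitted{$_} ? $submitted{$_} : "";
        $input =~ s/\Q$separation_character\E//g;
        $input;
    } @fields;
    return join($separation_character, @cleaned) . "\n";
}

# Two made-up submissions whose elements arrive in different orders
my $first  = format_record(name => "Jane", email => 'jane@example.com',
                           sex  => "female");
my $second = format_record(sex  => "female", email => 'jane@example.com',
                           name => "Jane");

print "Same line either way\n" if $first eq $second;
```

A field that is missing from the submission becomes an empty string, so every line in the file still has the same number of fields in the same positions.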