Missing CGI.pm and Other Mysteries

CGI.pm, for all of its useful and amazing features, is just one of the many terrific Perl 5 modules that aren't included with the standard Perl distribution.

Guestbook Problems

I also received several notes from readers alerting us to two mistakes in the guestbook program in the January issue. Guestbooks, as we all know, generally contain more than one greeting from a user on a site. Thus, if we open the file using the Perl command:

   open (FILE, ">$filename") || &error_opening_file($filename);

we are asking for trouble, since the single > operator not only opens the file for writing, but destroys any information the file might have contained previously. The code should really have read:

   open (FILE, ">>$filename") || &error_opening_file($filename);

which means that we want to open the file named in $filename for writing, appending our new data to whatever might have been there before. Note that the >> operator creates a file if none existed before, so you should feel free to use >> for file creation and appending.
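The append behavior can be checked with a short sketch. The helper names append_entry and count_lines are hypothetical and do not appear in the guestbook program. The sketch also assumes the three-argument form of open with a lexical filehandle, which is a stylistic choice rather than what the January listing used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the guestbook program): append one
# line to the named file, creating the file if it does not exist yet.
sub append_entry {
    my ($filename, $entry) = @_;
    open(my $fh, '>>', $filename)
        or die "Cannot open $filename for appending: $!";
    print $fh "$entry\n";
    close($fh) or die "Cannot close $filename: $!";
}

# Hypothetical helper: return the number of lines currently in the file.
sub count_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename)
        or die "Cannot open $filename for reading: $!";
    my @lines = <$fh>;
    close($fh);
    return scalar @lines;
}
```

Calling append_entry twice on a file that did not exist beforehand should leave two lines in it. Opening the same file with a single > between the two calls would leave only the second.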

The other problem in that program, which was noticed by reader Bill Holloway, had to do with this section of code:

   @names = $query->param;
   # Iterate through each element from the form,
   # writing each element to $filename. Separate
   # elements with $separation_character defined
   # above.
   foreach $index (0 .. $#fields)
   {
      # Get the input from the appropriate HTML
      # form element
      $input = $query->param($fields[$index]);
      # Remove any instances of
      # $separation_character
      $input =~ s/$separation_character//g;
      # Now add the input to the file
      print FILE $input;
      # Don't print the separation character after
      # the final element
      print FILE $separation_character
         if ($index < $#fields);
   }

Of course, since we have imported the HTML form elements into the @names array, we have to read them out of @names, not out of @fields as the above code does. Thus the line:

   $input = $query->param($fields[$index]);

should be replaced with:

   $input = $query->param($names[$index]);

as you can see in the corrected version of the program, which appears in Listing 1.
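As an aside, the same record can be built without an index test by cleaning each value and joining the results. This is a minimal sketch, not the code in Listing 1. The name format_record is hypothetical, and the use of \Q...\E to treat the separator literally is an assumption added here so that a separator such as | is not read as a regular-expression operator.

```perl
use strict;
use warnings;

# Hypothetical helper: build one guestbook record from a list of form
# values. Each value has the separator stripped out, and join places
# the separator between values but not after the last one.
sub format_record {
    my ($separator, @values) = @_;
    my @clean = map {
        my $value = defined $_ ? $_ : '';
        $value =~ s/\Q$separator\E//g;   # remove literal separator
        $value;
    } @values;
    return join($separator, @clean);
}
```

With CGI.pm, the values could be collected with something like map { $query->param($_) } @names before being passed to this helper.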

Individual Users and CGI Directories

Another reader, Maro Shim (writing from Korea), noticed something concerning what I said in the February issue about having to add a ScriptAlias or Exec directive to the HTTP server's configuration file each time a new CGI directory needed to be added. Maro points out that this means an administrator has to modify the files for each individual user.

Let's get into the pros and cons of letting individual users have their own CGI directories, using Apache as an example. Then we'll discuss why this might not be the best thing to do. Finally, we'll discuss giving each user CGI access, but not giving them the run of the system.

Maro's suggestion is that administrators can create a symbolic link inside the cgi-bin directory (which is /home/httpd/cgi-bin by default for the copy of Apache running on my Red Hat Linux box), and that this link can point to a directory inside each user's public_html directory, which typically contains the user's HTML files.

For example, here is a listing of my personal home directory at the time of this writing:

   [1068] ~% ls  -F
   800omni.pdf    News/          public_html/
   Consulting/    Text/          response1.txt
   Development/   cgicyrcode.pl  test.dgs
   Mail/          chap4de.doc

Because I have used the -F option to ls, directory names end with slashes, which makes them easier to identify. You can also identify directories by color or boldface text if you use the --color option, but I'm too old-fashioned for that. The public_html directory is where my personal HTML files reside. They are available via a URL ending with ~reuven/, since my username is reuven and the web server is configured to look in a user's public_html directory. Thus, if there were a file index.html, it would be accessible via the URL:

(substituting an appropriate hostname for localhost, of course).

Personal HTML files are nice, and greatly reduce the amount of work that a system administrator must do in order to run a web server on which dozens, or perhaps hundreds, of users might want to put their own home pages. But what about CGI programs? That's where Maro's letter comes in: Inside the public_html directory we can create a subdirectory named cgi-bin, as follows:

   [1071] ~% cd public_html/
   [1072] ~/public_html% mkdir cgi-bin
   [1073] ~/public_html% ls -F
   cgi-bin/   test.html

Now the personal HTML directory contains two items: a file, test.html, which I can access as ~reuven/test.html, and a directory named cgi-bin, the contents of which I can access as ~reuven/cgi-bin/. Remember, there isn't any magic in the name cgi-bin; at this point, it acts just like any other subdirectory. Indeed, if I were to place the CGI program elephant.pl inside ~reuven/public_html/cgi-bin, I could access it by going to:

But rather than seeing the results of executing elephant.pl, we will see its source code. This happens because we haven't told our server that it should execute the program; we need to explicitly install ~reuven/cgi-bin as a CGI directory. The most common way to create personal CGI directories is to include (under Apache) a ScriptAlias directive in the file srm.conf for each user on the system. Thus, if we were interested in turning ~reuven/cgi-bin into a CGI directory, we could use the line:
   ScriptAlias /~reuven/cgi-bin/ \
which would have the desired effect. However, this means that every time we wish to give a user a CGI directory, we need to modify srm.conf and restart our HTTP server.
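To give a sense of the bookkeeping involved, here is a minimal sketch that prints one directive per user. It assumes that every home directory lives under /home and that each user's CGI directory is public_html/cgi-bin, matching the layout described in this column; the name script_alias_for is hypothetical. The output would still have to be added to srm.conf by hand, followed by a server restart.

```perl
use strict;
use warnings;

# Hypothetical helper: build the ScriptAlias line for one user.
# Assumes home directories live under $home_root (default: /home)
# and that the CGI directory is public_html/cgi-bin.
sub script_alias_for {
    my ($user, $home_root) = @_;
    $home_root = '/home' unless defined $home_root;
    return "ScriptAlias /~$user/cgi-bin/ "
         . "$home_root/$user/public_html/cgi-bin/";
}

# Print one directive for each user name given on the command line.
print script_alias_for($_), "\n" for @ARGV;
```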

Maro's alternative saves us this work by taking a different approach: Rather than add new ScriptAlias directives to srm.conf, we simply tell our HTTP server that it should follow symbolic links within the CGI directories that already exist, using the commands:

   <Directory /home/httpd/cgi-bin>
   AllowOverride None
   Options FollowSymLinks
   </Directory>

Once we have done that, we can create symlinks to any directories that we want to turn into CGI directories. For example, to turn /home/reuven/public_html/cgi-bin/ into a CGI directory, we (as root, or another user with appropriate permissions) would only have to create the symbolic link:

   ln -s /home/httpd/cgi-bin/reuven \
which would then let us use:
which physically exists in my own personal directory, but which logically exists (as far as the HTTP server is concerned) in the /cgi-bin directory, which forces the server to execute it.
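The link can also be created from Perl with the built-in symlink function. The sketch below is only an illustration; the name link_user_cgi and its two sanity checks are assumptions, not part of Maro's suggestion. Note that symlink takes the existing directory first and the name of the new link second.

```perl
use strict;
use warnings;

# Hypothetical helper: make $link_path a symbolic link that points at
# $user_cgi_dir. Returns the target that the new link resolves to.
sub link_user_cgi {
    my ($user_cgi_dir, $link_path) = @_;
    die "$user_cgi_dir is not a directory\n" unless -d $user_cgi_dir;
    die "$link_path already exists\n" if -e $link_path or -l $link_path;
    symlink($user_cgi_dir, $link_path)
        or die "Cannot create $link_path: $!";
    return readlink($link_path);
}
```

Run as root (or another user with write access to the server's CGI directory), a call such as link_user_cgi('/home/reuven/public_html/cgi-bin', '/home/httpd/cgi-bin/reuven') would set up the link described above.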

Before you turn on CGI directories for individual users, consider the ramifications: CGI programs are potentially an opening from the outside world into your server. If even one CGI program is written with malice aforethought, an attacker could gain access to your system—gathering information about your users, for example, or using that information to alter or damage files. It might seem convenient to give all users access to CGI programs, and it will certainly save you time in the short run, but the security implications are too serious to ignore.

If you cannot restrict CGI to a small subset of the users on your system, then you should consider installing a CGI wrapper program that performs safety checks before executing these programs. A CGI wrapper is a program which takes a CGI program as its argument. After the wrapper performs several security checks, it executes the CGI program under the owner's ID, rather than the ID normally reserved for web programs. This prevents one CGI program from reading or changing another program's data, an increasingly likely problem as large numbers of unrelated sites are hosted on the same system.
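To make the idea of "security checks" more concrete, here is a minimal sketch of the sort of test a wrapper might apply before running a program. The particular checks and the name looks_safe_to_run are assumptions chosen for illustration; they are not the actual list used by any specific wrapper.

```perl
use strict;
use warnings;

# Hypothetical pre-execution check. Returns 1 if the program at $path
# looks safe to run as the user with ID $expected_uid, 0 otherwise.
sub looks_safe_to_run {
    my ($path, $expected_uid) = @_;
    my @info = lstat($path) or return 0;   # file must exist
    return 0 unless -f _;                  # plain file, not a symlink
    my ($mode, $uid) = @info[2, 4];
    return 0 if $uid != $expected_uid;     # must belong to that user
    return 0 if $mode & 0022;              # not group- or world-writable
    return 0 if $mode & 06000;             # no setuid or setgid bits
    return 0 unless $mode & 0100;          # owner must have execute bit
    return 1;
}
```

A real wrapper would go on to change its user ID and execute the program only after every check had passed.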

One such wrapper, known as suEXEC, comes with Apache 1.2. Configuring and compiling this program is relatively easy and is described in detail in the Apache documentation. Simply put, you compile suEXEC and set it to be SUID root, so it can change to the user ID of the program's owner, regardless of who that owner might be. Finally, you will have to install the suexec program outside the normal CGI directory, in a location defined in the httpd.h file in the Apache source code.

Another popular CGI wrapper is CGIwrap, which works in a similar way without being tied to a particular HTTP server. You can read more about CGIwrap at:


It is a good idea for these wrappers to run CGI programs under a user ID other than your HTTP server's default. Even so, if you let individual users write and install programs of their choosing, the possibility that someone will send those programs data that overflows buffers, or that passes malicious arguments to programs using the Unix shell, is too great to ignore, particularly with the security holes for which Unix is famous. You might want to insist that any CGI program on your server written in Perl use the -T argument, which turns on Perl's taint checking and prevents user data from being passed to the shell without first going through some sort of filter. Of course, such checks can be ignored, and not all CGI programs are written in Perl.
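The "filter" that taint checking asks for is usually a regular expression with a capturing group: under -T, Perl treats the captured text as clean. The following is a minimal sketch of such a filter. The name untaint_username and the choice of allowed characters are assumptions for illustration, and the routine behaves the same way with or without -T.

```perl
use strict;
use warnings;

# Hypothetical validator: return the name if it consists only of
# letters, digits, and underscores (1 to 32 characters); otherwise
# return undef. Under -T, the captured $1 is no longer tainted.
sub untaint_username {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /\A([A-Za-z0-9_]{1,32})\z/ ? $1 : undef;
}
```

A program would call this on each form value it intends to pass to the shell or use in a filename, and refuse to continue if the result is undefined.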

In short, there isn't any perfect solution, which means that at some point you will have to decide whether to make your system safer (but with angry users), or more exposed to possible damage (but with users satisfied with their ability to run CGI programs of their choosing).


