Learning to use the httpd error log to debug CGI programs
As discussed in my last column (LJ, Issue 34, WWW section), debugging CGI programs is often more difficult than debugging their non-CGI counterparts, if only because our programs aren't interacting directly with the user's terminal. We have only one chance to get input from the user—when the program is invoked—and only one chance to send a response back to the user—just before the program terminates, in the HTML that our program creates.
This difficulty is complicated further by the fact that running CGI programs correctly requires that a number of other items be in place. For example, you must set the permission bits correctly, the web server must be configured to serve CGI programs from your directory, and the correct version of Perl (if you are using Perl for your CGI tasks) must be installed. None of this is very difficult, and it's particularly uncomplicated when compared with the full administration of a Unix system, but it's not quite as simple as compiling and executing other, non-CGI programs.
In the last article in this series, we looked at techniques that you can use when things go wrong. But what happens when all of the permissions and directories are set correctly and you still end up getting a mysterious error on your browser?
The answer is that while web servers send a generic error message back to the user's web browser, they also keep an “error log,” a file into which information about each error is placed. This month, we will examine the error log to see what it contains, look through several errors and how they appear in the error log, and finally consider some ways in which we can use the error log to make debugging easier.
As Unix programmers are well aware, programs typically have at their disposal three basic file descriptors: standard input (stdin), standard output (stdout), and standard error (stderr).
Standard input is typically associated with the user's keyboard; when you are entering text in Emacs or similar program, the program is probably collecting your keystrokes via stdin. Standard output, by contrast, typically goes back to your console—either the window into which you are typing or the entire screen.
Standard error is normally sent to the user's console, and thus is often confused with stdout. But the two are completely separate and can be redirected to different places. You can see this in action by trying the command:
ls *zz* > zz-files
which, on my system, produces the following output on the screen:
ls: *zz*: No such file or directorySince there aren't any files containing zz in their names, the file zz-files is empty. This error message is sent to stderr, which is kept on the screen, while the (empty) list of files is sent to stdout, which we redirected to the file named zz-files.
A CGI program still has access to stdin, stdout and stderr, but these files are used somewhat differently as the program isn't running on the user's console. For a CGI program, stdin reflects the contents of an HTML form as submitted via the POST method. Stdout, not surprisingly, is used to send a response, which is typically not in HTML format, to the user. Finally, stderr is redirected to the HTTP server's error log, which is usually kept far away from visitors to the web site.
Therefore, if our program dies unexpectedly, error messages are directed to the httpd error log rather than to our browser. On the one hand, this is probably a good thing—after all, you probably don't want the world at large to know the number and kind of bugs found in your code. On the other hand, this means that you need to look at a separate file, and that you need to log on to the server computer before you can know exactly what has happened to your program.
If you are renting space on a web server run by a web space provider or Internet service provider—or more importantly, if you are looking to rent space from such a provider—make sure that you have access to these logs. You can certainly write and debug CGI programs without access to the error log, but your life will be infinitely more pleasant if you can get to it.
Each HTTP server logs information in a slightly different way. In this column, I will describe the logs produced by Apache, a server derived from NCSA httpd, which was written and distributed by the National Center for Supercomputing Applications (NCSA). You can get a copy of Apache, as well as documentation relating to this excellent server, at http://www.apache.org/.
In my copy of Apache, the error messages are placed in the file with one line per error. Error messages are in the format:
[DayOfWeek Month Date Time Year] Message
Thus one possible entry in the error log would be:
[Mon Dec 2 16:05:50 1996] Coke machine needs refillingAs you might have guessed, this means that the server recorded the error at 4:05:50 P.M. on Monday, December 2, 1996.
Your httpd probably records all sorts of things in the error log that you probably wouldn't classify as errors. For example, my system's error log contains the following error message:
[Tue Nov 5 17:03:48 1996] access to \ /home/httpd/cgi-bin failed for ahad-haam, reason: \ script not found or unable to stat
The above message means that the server wasn't able to find a particular file on my web server. That's right—every time someone points his web browser to your system and requests a document that doesn't exist, an error message is recorded. This is actually a useful thing to know, both for security reasons (since it might be nice to learn if people are trying to grab sensitive files via your web server) and in order to help users who often but mistakenly request particular files. For example, if a particular file is often requested under an incorrectly spelled name, you can set up an alias or symbolic link on your system.
As interesting and useful as an error log might be for system administrators, network administrators and web masters, we are going to concentrate on the error log's role when trying to debug CGI programs. For example, try the following:
#!/usr/local/bin/perl5 -w
use strict; # Check our syntax strictly
use diagnostics; # Tell us how to fix mistakes
use CGI; # Import the CGI module
# Create an instance of CGI
my $query = new CGI;
# Notice that we are *not* sending a MIME
# header, which we should always remember
# to do.
# Begin the HTML
print $query->start_html(-title =>
"Hello, world!)";
# Print something fairly uninteresting
print "<P>Hello, World Wide Web!</P>\n";
# End the HTML
print $query->end_html();
The above program will work just fine if run from the command line, since the only error is a missing MIME header, which is necessary only for CGI programs. But if you try to run it from a browser, you get the dreaded message:
Server Error. The server encountered an internal error or misconfiguration and was unable to complete your request.Not very useful for debugging your program, particularly since running it from the command line revealed no obvious errors. How can you discover what has gone wrong? By looking at the error log.
On my system, the error log contains the following entry:
[Mon Dec 2 17:32:04 1996] access to /home/reuven/Text/Websmith/test-8.pl failed for ahad-haam, reason: malformed header from script
Not only does Apache tell us that there was an error, and that the program died, but it even presents the reason, which is listed as “malformed header from script”. Not the most obvious message for newcomers, but it tells you what you need to know, namely that the header sent wasn't in the proper format and thus caused an error.
We can now fix the program and remove the error message by sending a MIME header before anything else:
#!/usr/local/bin/perl5 -w
use strict; # Check our syntax strictly
use diagnostics; # Tell us how to fix mistakes
use CGI; # Import the CGI module
# Create an instance of CGI
my $query = new CGI;
# Send the MIME header
print $query->header("text/html");
# Begin the HTML
print $query->start_html(-title =>
"Hello, world!");
# Print something fairly uninteresting
print "<P>Hello, World Wide Web!</P>\n";
# End the HTML
print $query->end_html();
Sure enough, the fixed program doesn't produce any new messages in the error log.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- New Products
- Linux Systems Administrator
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Web & UI Developer (JavaScript & j Query)
- Designing Electronics with Linux
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development




29 min 57 sec ago
11 hours 10 min ago
16 hours 56 min ago
17 hours 13 min ago
19 hours 6 min ago
21 hours 15 sec ago
1 day 3 hours ago
1 day 4 hours ago
1 day 6 hours ago
1 day 11 hours ago