Configuring, Tuning and Debugging Apache

RPMs, mod_perl, apachectl, telnet, mod_status, DSO, apxs, Apache::Status...Scared yet? Mr. Lerner tells us about all of these so we can better manage our web servers.

As longtime users know, Linux, Apache and other free software packages are both more stable and more easily configured than their closed-source counterparts. However, this obviously does not mean that free software advocates are immune from software bugs and configuration problems. And indeed, the complexity of much open-source software can sometimes be frustrating; it's often unclear which option to modify, or even where to begin.

This month, we will look at some of the tools and techniques that webmasters can use when trying to configure, tune and debug their Apache configurations. This cannot be a comprehensive list, simply because so many things can go wrong. However, being able to identify the cause of a problem often makes it relatively easy to come up with a solution, or at least to begin working toward one.

Removing RPMs

I have been using Red Hat Linux for about five years, back from the days in which Red Hat was a small company in Connecticut, rather than a well-known start-up whose name even my mother recognizes. During that time, I have become a big fan of RPM, the Red Hat Package Manager. I still remember when I had to modify makefiles by hand in order to compile software downloaded from the Internet; the fact that I can download a binary version of a program, install it with a single command, and then remove it just as easily, continues to amaze me. (I'm told that Debian's packaging system is even more sophisticated, but I have not yet had the opportunity to try it out.)

System administrators who learn to love RPMs are often reluctant to begin compiling programs from source. Not only does this take more time and more knowledge, but compiling and installing a program from its source code makes removal difficult, months or years later. However, there are certain programs that I insist on compiling from source, despite the inherent disadvantages. One of these is Apache, which is normally installed as part of the Red Hat installation.

Apache itself is a relatively small program. The bulk of the functionality resides in individual modules which are normally incorporated into Apache at compile time. So if you want Apache to automatically correct misspelled URLs, you can include mod_speling (and no, that isn't a typo) in your compilation. If you will be running a server without any CGI programs, you can remove mod_cgi. And the list continues, making it possible to customize Apache according to your exact needs.

Thus, I strongly recommend that everyone compile Apache from scratch. This process is normally quite straightforward, and should take no more than a few minutes on a typical modern computer. The first step toward compiling Apache is to remove any RPMs that might already be on the system, to avoid confusing either the RPM database or yourself regarding a file's exact origins.

To remove the Apache RPM from a Red Hat system, use the following command: rpm -e apache.

To get a better idea of what is going on as your disk drive crunches away, you may turn on the “very verbose” option: rpm -evv apache.

On most systems, this command will not be sufficient by itself. RPM keeps track of not only which files are installed, but also of the dependencies between packages. Since a typical Red Hat installation includes RPMs for mod_perl (which is often broken) and mod_php, you may need to erase these as well:

rpm -evv apache mod_perl mod_php

If removing a package will break a dependency, RPM will exit with a fatal error, indicating which dependencies will not be satisfied if you remove the package in question. You should then consider whether the package (e.g., mod_perl) is only useful with another package (e.g., Apache) or if removing it would cause too many other problems.

Once you have removed any existing Apache RPMs, you can begin to compile and install Apache on your system. (Of course, you can always compile Apache while the RPM exists, removing the RPM just before you install the compiled version.) Download the latest version, which at the time of this writing was 1.3.12, from a local mirror of http://www.apache.org/. Once you have done that, you can unpack it with:

tar -zxvvf apache_1.3.12.tar.gz

The z option uncompresses the tar archive with gunzip before proceeding, and the vv flags ask for “very verbose” output. Both of these options work only with GNU tar, which is the standard tar version on Linux systems.

Once you have unpacked the source code, you can compile Apache with the following commands:

cd apache_1.3.12  # Switch into the Apache directory
./configure           # Get a default configuration
make                  # Compile the source code
make install          # Install Apache under /usr/local/apache/

The above four steps will place the Apache-related programs in /usr/local/apache/bin, logfiles in /usr/local/apache/logs, HTML files in /usr/local/apache/htdocs, and CGI programs in /usr/local/apache/cgi-bin. The “make install” command must be performed while logged in as root.

To run Apache, use the apachectl program which is installed by default in /usr/local/apache/bin. apachectl is a shell script which makes it relatively easy to start or stop Apache, ensuring that only one Apache process will run on a system at a given time. To start Apache, use:

/usr/local/bin/apachectl start
You may need to insert this line in one of your startup files, such as
/etc/rc.d/rc.local, to ensure that Apache starts
up when your
computer is booted. This is especially true if you removed the RPM
version of Apache using the above instructions, since the RPM comes
with a startup file that is automatically placed (and invoked) inside of
/etc/rc.d/init.d. apachectl can also be used to shut down the server:
/usr/local/bin/apachectl stop

One of the first things that Apache does when it starts up is to look through its configuration file, traditionally called httpd.conf and located (by default) in /usr/local/apache/conf. This file contains a number of commands, known as “directives”, followed by one or more values. We can thus set the name of the server explicitly with the directive:

ServerName www.lerner.co.il
Each Apache module is allowed to define its own directives which control the program's behavior when invoking that module. For example, mod_userdir installs the UserDir directive, indicating which directory name should be treated specially inside of each user's home directory, for the purposes of creating sites on the Web.

But what happens if mod_userdir is not installed? Then Apache will not know what to do with the UserDir directive, and will exit with a fatal error. The solution is to run apache configtest, which reads httpd.conf and checks it to ensure that all of the directives are known and have legal values. If all of the directives have legal definitions, apachectl will respond with “OK”.

Another partial solution to this issue is to put all module-specific directives inside of an <IfModule> section. <IfModule> tells Apache that it should pay attention to one or more directives only if a particular module is actually loaded. Thus, we can always define our UserDir directive, assuming it is wrapped in the following section:

<IfModule mod_userdir.c>
    UserDir public_html
</IfModule>

<IfModule> is particularly useful when working with modules compiled as DSOs, because of the flexibility it adds.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix