Configuring, Tuning and Debugging Apache
Even moderately popular web sites tend to run more than one copy of Apache at a time. Older servers would wait until a new connection came in before deciding whether to “fork” off a new copy of the existing server process. The authors of Apache abandoned this method in favor of “pre-forking”, meaning that Apache creates a large number of child processes immediately upon startup.
Each child process handles only one HTTP connection at a given time, meaning that there must always be at least as many Apache processes running as the number of simultaneous connections a site should handle. This maximum number is set with the MaxClients directive, which defaults to 150. If MaxClients is set too low, then one or more users may end up waiting until an Apache process finishes handling the previous connection, and is available to service a new one.
Apache dynamically changes the number of servers available depending on the number of requests that it gets, based on hints in httpd.conf. The MinSpareServers and MaxSpareServers directives tell Apache how many extra servers to keep around in preparation for incoming requests. If the number of spare servers ever goes below MinSpareServers, Apache spawns several new servers. By contrast, if there are more unused servers than defined in MaxSpareServers, Apache will kill off the extra ones.
If the server starts and responds successfully, but is taking a long time to accept connections, it could be that you have not told Apache to start enough servers. Try increasing either MaxSpareServers or MaxClients so that fewer people will have to wait for a free server to handle their request.
Of course, adding new servers is a potentially large drain on the computer's resources, consuming more CPU time and memory. mod_perl is a particularly large user of memory; so you should be more conservative when adding new Apache processes that include mod_perl. Use the standard Linux free command, which displays the amount of available physical and virtual memory, to get a better understanding of where the memory is going. I also like to use top, which displays, among other things, the amount of CPU and memory each process is consuming.
Because web servers need to respond to requests as quickly as possible, and because virtual memory is far slower than physical RAM, you should pay particularly close attention to virtual memory usage and minimize its use.
Small web sites that run one or more database servers (such as MySQL or PostgreSQL) on the same computer as a web server can find themselves in an unenviable bind. As the number of visitors to a dynamically generated site increases, the number of Apache processes must also increase. But in order to service all of these visitors, the number of database connections must also increase. At a certain point, a site becomes a victim of its own success, with the database and web site competing for system resources. For this reason, most popular database-backed sites separate the two functions onto at least two computers, with one or more database servers connected to one or more HTTP servers.
One nice way to get a snapshot of the current Apache status is with the mod_status module. mod_status describes the current state of every HTTP server, whether it is waiting for a new connection, reading the request, handling the request or writing a response.
mod_status is compiled into Apache by default, meaning that all you need to do in order to activate it is to set the appropriate directives, and set the default handler, or request-handling subroutine, to be “server-status”. Any URL defined to have a handler of “server-status” then produces a status listing, ignoring the rest of the user's request.
It is thus most common for mod_status to be activated for only one URL. For example, we can create the virtual “/server-status” URL on our web server, such that anyone visiting /server-status will be shown the output from mod_status. We also indicate that Apache should always produce a full status listing, rather than the simple version. Here is one such simple configuration:
<Location /server-status> SetHandler server-status </Location> ExtendedStatus On
Once I put those four lines inside of httpd.conf and restart Apache—or send it a HUP signal—I get the following output from the /server-status URL:
Server Version: Apache/1.3.12 (UNIX) mod_perl/1.24 Server Built: Mar 29 2000 12:25:42 Current Time: Friday, 21-Jul-2000 16:02:51 IDT Restart Time: Friday, 21-Jul-2000 16:02:48 IDT Parent Server Generation: 2 Server uptime: 3 seconds Total accesses: 0 - Total Traffic: 0 kB CPU Usage: u0 s0 cu0 cs0 0 requests/sec - 0 B/second - 1 requests currently being processed, 4 idle serversThe status information begins with a fair amount of text indicating how long the server has been running, and how many times people have accessed the server. It also indicates just how many bytes are being served by this web process and how many servers are sitting idle. mod_status thus provides a nice window into the world of the Apache server, allowing us to see whether we have defined MaxSpareServers in the most resource-efficient manner.
mod_status then produces output in the following format, which can seem cryptic at first:
W____............................................. ................................................. ................................................. .................................................
Each “.” character represents a potential Apache server process which is currently not running. Those that are waiting for a new connection are represented by “_”; those that are reading input from the user's HTTP request are represented by “R”; and writing their output to the user's browser are represented by “W”. Not all letters may be visible at a given time; the current Apache status changes dynamically, and the output you see from mod_status will change to reflect that.
Following this display, we get a play-by-play view of what each active process is doing. We can see which connections are taking a long time to be processed, which connections are the most popular each month, and nearly any other facet.
Of course, it is normally a bad idea to open up your status information to the entire world. Luckily, you can use the “Order”, “Deny” and “Allow” directives to restrict access to a set of IP addresses or to an entire domain. For example:
<Location /server-status> SetHandler server-status Order deny,allow Deny from all Allow from .lerner.co.il </Location>
With the above configuration, mod_status will display results only for IP addresses in my domain. Requests coming from another domain will get an HTTP response indicating that access is forbidden to them.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Roll your own dynamic dns
5 hours 38 sec ago
- Please correct the URL for Salt Stack's web site
8 hours 12 min ago
- Android is Linux -- why no better inter-operation
10 hours 27 min ago
- Connecting Android device to desktop Linux via USB
10 hours 55 min ago
- Find new cell phone and tablet pc
11 hours 54 min ago
13 hours 22 min ago
- Automatically updating Guest Additions
14 hours 31 min ago
- I like your topic on android
15 hours 17 min ago
- This is the easiest tutorial
21 hours 53 min ago
- Ahh, the Koolaid.
1 day 3 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?