Apache 2.0: The Internals of the New, Improved Apache
Apache's developers have always emphasized the security, correctness and flexibility of the server. Through version 1.3, however, efforts to bring performance up to par with other high-end web servers were minimal. With the continuing explosive growth of web traffic, Apache 2.0 tries to improve throughput without compromising the server's other qualities.
Web servers have several key performance determinants, including the amount of memory available to hold the document tree, disk bandwidth, network bandwidth and CPU cycles. In most cases, people add to or upgrade hardware to improve the performance of their web servers. However, with the explosive growth of the Internet and its increasingly important role in our lives, traffic on the Internet is growing at over 100% every six months. The workload on servers is thus increasing rapidly, and these servers are easily overloaded. Several options besides hardware upgrades or additions exist to overcome this problem.
For very busy web servers, the kernel overhead of switching tasks and doing I/O becomes a problem. Apache addresses this high-traffic problem with the mod_mmap_static module, which maps files into the server's virtual memory space and avoids the overhead of the open and read system calls needed to pull them in from the filesystem. This can yield a good increase in speed when the server has enough memory to cache the whole document tree.
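As a sketch, mod_mmap_static is driven by MMapFile directives in the server configuration; the directive is real, but the file paths below are illustrative only:

```
# httpd.conf fragment: map frequently served files into memory at startup
# (paths are examples, not defaults)
MMapFile /usr/local/apache/htdocs/index.html
MMapFile /usr/local/apache/htdocs/images/logo.gif
```

Each listed file is mapped once when the server starts, so subsequent requests are served straight from memory.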
Furthermore, to improve the performance and to serve more requests per second, administrators can run a specialized web server that handles simple requests and passes everything else on to Apache. Another approach that cuts the operating system overhead is to have a small HTTP server built into the kernel itself. These two approaches are discussed later (see HTTPD Accelerators).
Apache modules through version 1.3 wrote directly to the TCP connection back to the client. This arrangement was simple and efficient, but it lacked flexibility.
An example of this inflexibility would be secured transactions over SSL. To perform encrypted communications, the SSL module must intercept all traffic between the client and the handler module. With no abstraction layer in place, this was a difficult task made even more difficult by the cryptography laws of the 1990s that prohibited adding convenient hooks. Administrators wanting to run secure sites had the choice of applying inelegant patch sets to the Apache source or using a proprietary and perhaps incompatible binary distribution.
In Apache 2.0 (with APR), all I/O is done through abstract I/O layer objects. This arrangement allows modules to hook into each other's streams. It will then be possible for SSL to be implemented through the normal module interface rather than requiring special hooks. I/O layers also help internationalized sites by providing a standard place to do character set translation.
In addition, with later Apache releases, I/O layers may support a much-requested feature that lets one module filter the output of another. However, this may not appear until Apache 2.1.
The original reason for creating Apache 2.0 was to solve scalability problems. The first proposed solution was to have a hybrid web server, one that has both processes and threads. This solution provides the reliability that comes with not having everything in one process, combined with the scalability that threads provide. However, this approach has no perfect way to map requests to either a thread or a process.
On Linux, for instance, it is best to have multiple processes, each with multiple threads serving the requests. If a single thread dies, the rest of the server will continue to serve more requests and the server will not be affected. On platforms that do not handle multiple processes well, such as Windows, one process with multiple threads is required. On the other hand, platforms with no thread support had to be taken into account, and therefore it was necessary to continue with the Apache 1.3 method of preforking processes to handle requests.
The mapping issue can be handled in multiple ways, but the most desirable way is to enhance the module features of Apache. This was the reasoning behind introduction of multiple-processing modules (MPMs). MPMs determine how requests are mapped to threads or processes. The majority of users will never write an MPM or even know they exist. Each server uses a single MPM, and the correct MPM for a given platform is determined at compile time.
Currently, six options are available for MPMs. All of them, except possibly the OS/2 MPM, retain the parent/child relationship from Apache 1.3, which means the parent process monitors the children and makes sure an adequate number is running.
MPMs offer two important benefits:
1. Apache can support a wide variety of operating systems more cleanly and efficiently. In particular, the Windows version of Apache is now much more efficient, since mpm_winnt can use native networking features in place of the POSIX layer used in Apache 1.3. This benefit also extends to other operating systems that implement specialized MPMs.
2. The server can be customized better for the needs of the particular site. For example, sites that need a great deal of scalability can choose to use a threaded MPM, while sites requiring stability or compatibility with older software can use a “preforking” MPM. Additionally, special features like serving different hosts under different user IDs (perchild) can be provided.
The prefork MPM implements a non-threaded, preforking web server that handles requests in a manner similar to the default behavior of Apache 1.3 on UNIX. A single control process is responsible for launching child processes that listen for connections and serve them as they arrive.
Apache always tries to maintain several spare or idle server processes, which are ready to serve incoming requests. In this way, clients do not need to wait for a new child process to be forked before their requests can be served.
The StartServers, MinSpareServers, MaxSpareServers and MaxClients directives (set in httpd.conf) regulate how the parent process creates children to serve requests. In general, Apache is self-regulating, so most sites do not need to adjust these directives from their default values. Sites that need to serve more than 256 simultaneous requests may need to increase MaxClients, while sites with limited memory may need to decrease MaxClients to keep the server from thrashing.
While the parent process is usually started as root under UNIX in order to bind to port 80, the child processes are launched by Apache as less-privileged users. The User and Group directives are used to set the privileges of the Apache child processes. The child processes must be able to read all the content that will be served but should have as few privileges as possible beyond that.
MaxRequestsPerChild controls how frequently the server recycles processes by killing old ones and launching new ones.
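Putting these prefork directives together, a tuning block might look like the following; the directive names are Apache's, but the values are illustrative, not recommendations:

```
# Process management for the prefork MPM (illustrative values)
StartServers          5     # children forked at startup
MinSpareServers       5     # keep at least this many idle children
MaxSpareServers      10     # kill idle children above this count
MaxClients          150     # hard limit on simultaneous children
MaxRequestsPerChild 1000    # recycle a child after this many requests
User  nobody                # run children as an unprivileged user
Group nobody
```

Setting MaxRequestsPerChild to a finite value bounds the damage from any slow memory leak, since each child is eventually replaced by a fresh process.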
The PTHREAD MPM is the default for most UNIX-like operating systems. It implements a hybrid multi-process multi-threaded server. Each process has a fixed number of threads. The server adjusts to handle load by increasing or decreasing the number of processes.
A single control process is responsible for launching child processes. Each child process creates a fixed number of threads as specified in the ThreadsPerChild directive. The individual threads then listen for connections and serve them when they arrive. The PTHREAD MPM should be used on platforms that support threads and that possibly have memory leaks in their implementation. This may also be the proper MPM for platforms with user-land threads, although testing at this point is insufficient to prove this hypothesis.
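A hedged sketch of the relevant directives for this hybrid model follows; the directive names are real, the numbers illustrative:

```
# Hybrid MPM: each child runs a fixed number of threads (illustrative values)
StartServers      4      # initial child processes
ThreadsPerChild  25      # threads created by each child, fixed for its life
MaxClients      150      # upper bound on simultaneous worker threads
```

Because ThreadsPerChild is fixed, the server adapts to load only by forking or reaping whole child processes.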
When compiled with the DEXTER MPM, the server starts by forking a static number of processes that will not change during the life of the server. Each process will create a specific number of threads. When a request comes in, a thread will accept it and answer it. When a child process sees that too many of its threads are serving requests, it will create more threads and make them available to serve more requests (see Figure 2).
Figure 2. Dexter MPM Model
The DEXTER MPM should be used on most modern platforms capable of supporting threads. It will create a light load on the CPU while serving the most requests possible.
The WINNT MPM is the default for the Windows NT operating system. It uses a single control process, which launches a single child process that in turn creates threads to handle requests.
The PERCHILD MPM implements a hybrid multiprocess, multithreaded web server. A fixed number of processes create threads to handle requests. Fluctuations in load are handled by increasing or decreasing the number of threads in each process.
A single control process launches the number of child processes indicated by the NumServers directive at server startup. Each child process creates threads as specified in the StartThreads directive. The individual threads then listen for connections and serve them when they arrive.
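A minimal perchild configuration sketch, using the two directives named above with illustrative values:

```
# perchild MPM: fixed process count, variable threads per process
NumServers    5      # child processes, fixed for the life of the server
StartThreads  5      # threads each child creates at startup
```

Since the process count never changes, all load adaptation happens inside each child as it grows or shrinks its thread pool.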
An MPM must be chosen during the configuration phase and compiled into the server. Compilers are capable of optimizing many functions if threads are used, but only if they know that threads are being used. Because some MPMs use threads on UNIX and others don't, Apache will always perform better if the MPM is chosen at configuration time and built into Apache.
To choose the desired MPM, use the argument --with-mpm=NAME with the ./configure script, where NAME is the name of the desired MPM (dexter, mpmt_beos, mpmt_pthread, prefork, spmt_os2, perchild).
Once the server has been compiled, one can determine which MPM was chosen by using % httpd -l. This command will list every module that is compiled into the server, including the MPM.
The following list identifies the default MPM for every platform:
- BeOS: mpmt_beos
- OS/2: spmt_os2
- UNIX: threaded
- Windows: winnt