Stress Testing an Apache Application Server in a Real World Environment
We've all had an experience in which the software is installed on the servers, the network is connected and the application is running. Naturally, the next step is to think, "I wonder how much traffic this system can support?" Sometimes the question lingers and sometimes it passes, but it always presents itself. So, how do we figure out how much traffic our server and application can handle? Can it handle only a few active clients or can it withstand a proper Slashdotting? To appreciate fully the challenges one faces in trying to answer these questions, we must first understand the dynamic application and how it works.
A traditional dynamic application has five main components: the application server, the database server, the application, the database and the network. In the open-source world, the application server usually is Apache. And, often, Apache is running on Linux.
The database server can be almost anything that can do the job; for most smaller applications, this tends to be MySQL. In this article, I highlight the open-source PostgreSQL server, which also runs on Linux.
The application itself can be almost anything that fits the project requirements. Sometimes it makes sense to use Perl, sometimes PHP, sometimes Java. It is beyond the scope of this article to determine the benefits or liability of a particular platform, but a firm understanding of the best tool for the job is necessary to plan properly for adequate performance in a running application.
The database itself can mean the difference between a maximum load of one user and 5,000 users. A bad schema can be the death of an application, while a good schema can make up for a multitude of other shortcomings.
The network tends to be the forgotten part of the equation, but it can be as detrimental as bad application code or a bad schema. A noisy network can slow intra-server communications dramatically. It also can introduce errors and other unknowns into communications that, in turn, have unknown results on the running code.
As you have probably guessed, finding where our optimal performance lies and pushing those limits is more than a minor challenge. Like a Formula One race car that runs with almost absolute technical efficiency, a Web-based application depends on all five of its main components to handle its load optimally. By looking at those components and measuring how they react under certain circumstances, we can use that data to better tune the system as a whole.
To begin the testing, we need to create an environment that facilitates micro-management of the five components. Because most enterprise-class applications are based on large proprietary hardware configurations, setting up a testing configuration often is prohibitive in cost. One of the advantages of the open-source model, however, is that many of the configurations are based on commodity hardware. The commodity hardware configuration, therefore, is the basic assumption used throughout the testing setup. This is not to say that a setup based on large proprietary hardware is not as valid or that the methods outlined are not compatible; it simply is more expensive.
We first need to set up a testing network. For this we use three computers on a private network segment. The systems should be exact replicas of the servers going into production or ones that already exist in the production environment. This, in a simple sense, accounts for the application/Web server and the database server, with the third system being a traffic generator and monitor. These three computers are connected through a hub for testing, because the shared nature of the hub facilitates monitoring network traffic. A better but more expensive solution would replace the hub with a switch and introduce an Ethernet tap into the configuration. The testing network we use, though, is a fairly accurate representation of the network topology that exists in the DMZ or behind the firewall of a live network.
Accurately monitoring the activity of the network and the systems involved in serving the applications requires some software, the first of which is the operating system. In this article, I use Red Hat 7.3, although there are few Red Hat-isms that are specific to these setups and tests. To get the best performance from the server machines, it is a good idea to make sure only the most necessary services are running. On the application server, this list includes Apache and SSH (if necessary); on the database server the list normally includes PostgreSQL and SSH (again, if necessary). As a general preference, I like to make sure all critical services, including Apache, PostgreSQL and the kernel itself, are compiled from source. The benefit of doing this is ensuring only the necessary options are activated and nothing extraneous is active and taking up critical memory or processor time.
On the application and database servers, a necessary component that should be included is the sysstat package. This package normally is installed on the default Red Hat installation. For other distributions, the sysstat package is available as source and can be compiled. Sysstat is a good monitoring tool for most activities, as it can display at a glance almost all of the relevant information about a running system, including network activity, system loads and much more. This package works by polling data at specified intervals and is useful for general system monitoring. For our tests, we run sysstat in a more active mode, from the command line--a topic discussed in more depth later in this article.
It is a good idea to be familiar with the tools collected in the sysstat package, especially the sar and sadc programs. The man pages for both of these programs provide a wealth of details. One of the limitations of the sysstat package is that it has a minimum data sampling duration of one second. In my experience with this type of testing, a one-second sample is adequate for assessing where problems begin to creep into the configuration.
As we move to a different testing tool, we also are moving to a different portion of our testing network, the network itself. One of the best tools for this task is tcpdump. Tcpdump is a general purpose network data collection tool and, like sysstat, is available in binary form for most distributions, as well as in source code from www.tcpdump.org.
About now you may be asking why we are looking at raw network data. On occasion, I have seen errors introduced into the communications between servers. For instance, sometimes data packets can become mangled in transit. Raw network data, then, is a great resource to refer back to in the event of a problem that cannot be diagnosed easily.
Tcpdump could be an article unto itself due to the depth and complexity of the subject of networking as well as the program itself. Specific usage examples follow in the next section, in which the actual testing procedure is explained. For now, tcpdump should be installed on our traffic generator system.
The last major component we need for our testing is a piece of software named flood, which is written by the Apache Group and available at www.apache.org. Flood still is considered alpha software and, therefore, is not well documented. On-line support also is limited, as few people seem to use it.
To begin, we need to download the flood source from the Apache site. A nice and simple document on how to build the flood source is available from the same location. If the Web application to be tested runs over https, reading this document is a must.
In its simplest form, the method to build the software is:
tar -zxvf flood-0.x.tar.gz
cd flood-0.x/
./buildconf
./configure --disable-shared
make all
Flood is executed and run from its source directory using the newly created ./flood executable.
The "./flood" syntax is quite simple. It generally follows the format:
./flood configuration-file > output.file
The configuration file is where the real work and power of flood is revealed, and several example files are provided in the ./examples directory in the flood source. It is a good idea to have a working knowledge of their construction, as well as some knowledge of XML. See Listing 1 for an example configuration file.
The general form of the configuration file is:
<flood>
  <urllist></urllist>
  <profile></profile>
  <farmer></farmer>
  <farm></farm>
  <seed></seed>
</flood>
The <urllist> is where the specific URLs are placed that flood uses to step through and access the application. Due to the way flood processes these URLs under certain configurations, it is possible to simulate a complete session a visitor may make to the Web application.
The <profile> section is where specifics are set about how the file should be processed as well as which URLs should be used. This section uses several tags to define the behavior of the flood process. They are:
<name>
<description>
<useurllist>
<profiletype>
<socket>
<report>
<verify_resp>
These seven tags are relatively well defined in the configuration file examples. The other main sections--farmer, farm and seed--set the parameters of how many times to run through the list, how often and the seed number for easy test duplication.
A real world note about flood from my own experience: if the application has rigidly defined URLs that reference individual pages, the stock flood report is useful with little modification. If, however, the Web application uses a few pages that refresh depending on variables and change accordingly, as is the case with most dynamic Web applications, flood results can be difficult to use. In the latter case, flood's primary usefulness comes in the scripting of traffic to a test environment for the purpose of simulating traffic. It is important to understand the benefits and the shortcomings of any applications being used; testing a Web application is no different.
The actual testing of the systems is similar to a ballet in terms of the level of choreography necessary to make everything run in concert. The most essential step in making this possible is time synchronization. Having all machine times set as close as possible to one another is imperative; without this simple step, it is impossible to correlate actions with events. Setting accurate time across our testing network should be our first task in beginning testing.
The second task for testing should be to create our flood configuration file. There are many ways to create the flood configuration file, but one method that creates some usable results is to parse the production Web server's access logs. A simple Perl script can be created to parse the log file and output the correct format, an XML configuration file. This method also is one of the easiest ways to create scripted sessions that reflect actual system usage the way a real visitor would use it.
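The log-parsing step described above can be sketched as follows. This is only an illustration, written in Python rather than Perl, and it rests on several assumptions that are not from the article: the access log is in Common Log Format, only GET requests answered with status 200 are worth replaying, the base URL http://testserver is a placeholder for the test application server, and the <name> and <url> element names follow the shape of the example files shipped in flood's ./examples directory as I understand them. Check the output against those examples before using it.

```python
import re
from xml.sax.saxutils import escape

# Assumption: Common Log Format, where the request appears as
# "GET /path?query HTTP/1.x" followed by the three-digit status code.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def urls_from_log(lines, base_url):
    """Return the URLs of successful GET requests, in log order."""
    urls = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a GET request, or not a parsable line
        path, status = match.groups()
        if status == "200":
            urls.append(base_url.rstrip("/") + path)
    return urls


def build_urllist(urls, name="Logged Session"):
    """Render the URLs as a <urllist> section (element names assumed)."""
    parts = ["<urllist>", "  <name>%s</name>" % escape(name)]
    # escape() turns & into &amp; so query strings stay valid XML.
    parts.extend("  <url>%s</url>" % escape(url) for url in urls)
    parts.append("</urllist>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical log lines, for demonstration only.
    sample = [
        '10.0.0.5 - - [12/Mar/2003:10:01:02 -0500] "GET /index.php HTTP/1.0" 200 512',
        '10.0.0.5 - - [12/Mar/2003:10:01:09 -0500] "GET /cart.php?id=7&qty=2 HTTP/1.0" 200 830',
    ]
    print(build_urllist(urls_from_log(sample, "http://testserver")))
```

Keeping the URLs in log order is what lets the list approximate a real visitor's session. Filtering the log by client address before parsing would be one way to isolate a single visitor.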
The third task we need to perform is setting up the system monitoring on the application and database servers. As described above, we use the sysstat package to monitor these systems. The sadc program is the back-end process that collects the data, and the simplest way to start the sysstat monitoring is:
/usr/lib/sa/sadc 1 [# of seconds to report] outfile
As is probably obvious, it is important to capture enough data to encompass the entire duration of the testing. The above command should be started on both testing systems used to serve data, for example, the Web and database servers.
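As a rough worked example of sizing the capture, the sketch below computes how many samples to request so that sadc outlasts the test. The five-minute padding and the sar.out file name are my own assumptions, and the sadc path is the one shown above, which may differ on other distributions.

```python
import math


def sadc_command(test_seconds, interval=1, padding_seconds=300, outfile="sar.out"):
    """Build the sadc argument list for a test of the given length.

    padding_seconds (an assumed safety margin) covers the time spent
    starting and stopping the other tools around the test itself.
    """
    count = math.ceil((test_seconds + padding_seconds) / interval)
    return ["/usr/lib/sa/sadc", str(interval), str(count), outfile]


if __name__ == "__main__":
    # A 30-minute test sampled once per second, plus five minutes of padding.
    print(" ".join(sadc_command(30 * 60)))
```

For a 30-minute test at a one-second interval this asks for 2,100 samples: 1,800 for the test plus 300 for the padding.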
The fourth step should be to start up tcpdump monitoring on the traffic generation/monitoring system. The easiest way to do this is to issue the command tcpdump -w outfile. This command outputs all network data to the outfile specified in a format easily loadable into an analysis tool, such as Ethereal.
Now that all of our monitoring is set up and running on the appropriate systems, the last part is to begin actual traffic generation by starting flood. In this stage, it is a good idea to start slow with little traffic and increment the volume up at a consistent pace until the limits of the server are reached.
In the previous two sections, we looked at the setup and the actual testing on our test network, but we have not looked at the data the software we use generates. For sample data, please see Listing 2.
To utilize the generated data, we go back to our old friend, Perl. For both tcpdump and flood, the individual data is measured in utime and easily can be compared and analyzed based on the reported times. The raw output of the flood report is:
Absolute time started
Relative time (to first column) to open the socket
Relative time to write to the socket
Relative time to read the socket
Relative time to close the socket
OK or FAIL notification
The thread or PID of the farmer making the request
The URL of the target without query strings
Some of the example report processing scripts included in the flood source turn the raw data into simple yet readable output. Either by using these scripts as they are or by using them as a starting point to build a different report, it is possible to glean some essential data from the flood report. One method I have used to identify trends in the data quickly is to run the raw flood output through a Perl script that translates the utime values to a more "readable" number by dividing them by one million. This modified output then is passed to a GNUplot script (see Listing 3), which creates a nice graphic where trends can be seen at a glance. It then is trivial to match up which offending activity happened at what time and to see across the entire network what was going on with all systems at that moment. Once the offending activity is determined, it is quite possible to adjust the systems to correct the problem and then retest using the same method.
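The conversion step can be sketched roughly as follows. This is a Python illustration, not the Perl script described above, and it assumes each report line holds the eight whitespace-separated fields listed earlier, with the first five in microseconds. The per-second grouping is my own addition, meant to line the flood timings up with the one-second sadc samples, and the sample lines are invented.

```python
from collections import defaultdict

MICROSECONDS_PER_SECOND = 1_000_000


def parse_flood_line(line):
    """Turn one raw report line into a dict, or None if it does not fit."""
    fields = line.split()
    if len(fields) < 8:
        return None
    try:
        start_us, open_us, write_us, read_us, close_us = (
            int(field) for field in fields[:5]
        )
    except ValueError:
        return None  # header or other non-numeric line
    return {
        # Whole second in which the request started (integer math, no rounding).
        "second": start_us // MICROSECONDS_PER_SECOND,
        # Relative times converted from microseconds to seconds.
        "open": open_us / MICROSECONDS_PER_SECOND,
        "write": write_us / MICROSECONDS_PER_SECOND,
        "read": read_us / MICROSECONDS_PER_SECOND,
        "close": close_us / MICROSECONDS_PER_SECOND,
        "status": fields[5],
        "worker": fields[6],
        "url": fields[7],
    }


def slowest_per_second(records):
    """Map each start second to the longest time-to-close seen in it."""
    slowest = defaultdict(float)
    for rec in records:
        slowest[rec["second"]] = max(slowest[rec["second"]], rec["close"])
    return dict(slowest)


if __name__ == "__main__":
    # Hypothetical report lines, for demonstration only.
    raw = [
        "1059438473328450 7000 7500 250000 1500000 OK 1234 http://testserver/index.php",
        "1059438474100000 6000 6500 200000 500000 OK 1235 http://testserver/cart.php",
    ]
    records = [r for r in map(parse_flood_line, raw) if r is not None]
    # Two columns (second, seconds to close) that a plotting tool can read.
    for second, close in sorted(slowest_per_second(records).items()):
        print(second, close)
```

Because the clocks were synchronized before the test, a second that stands out here can be looked up directly in the sadc data from the Web and database servers.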
The last item I want to address is the tcpdump data. The easiest method of working with tcpdump files is to use Ethereal. Ethereal is a graphical interface that loads all of the tcpdump data into an easy-to-read format. Its best feature, however, is its ability to trace or follow an individual connection--very handy in tracking down problematic connections.
Every Web-based application is different from every other one, and no two pieces of hardware are exactly the same when running these types of applications. It is difficult to say exactly where problems might arise or where things might break. Stress testing requires an intimate knowledge of the software, the systems and the network that encompass the operating environment. These are the truisms of this type of activity, and although the challenges and learning curve are daunting, the work is well worth the effort.
Stress testing requires a degree of patience, as rushing the testing can result in collecting bad data and/or ambiguous results. Always take the time to understand fully the results of the previous test before continuing on to the next round.
Drawing on my own experience in these types of tests and the resulting system tuning, I have reached these conclusions about dynamic Web-based application performance. Whenever possible:
Separate the application from the db.
Use as many diverse data channels as possible (i.e., separate drives for data and system on separate channels or controllers).
Use as good a machine as is practical.
Databases are memory hungry--feed them.
Understand relational database theory and the five normal forms.
Understand good development practices and follow them.
RAID 5 sounds like a great idea until a database lives on it and that database liberally uses INSERTS and UPDATES. If you need hardware redundancy, there are more database-friendly ways to accomplish it.
Just because it sounds like a great idea to put lots of XML into a db and let the front end parse it out does not mean it is one. Think again.
Remember that your servers can communicate only as fast as the network goes. Use good networking components and cables.
I have given you a brief overview of how to stress test Web application systems, as well as some of the tools to use. Now it's your turn to set up everything and use what you have learned. Remember to be creative and don't be afraid to hunt down new or better tools to do the job. The better your information, the better you can understand how to answer the questions listed at the beginning of this article.
Brad Bartram is a network administrator for Dyrect Media Group in Buffalo, New York. In his spare time he plays and teaches guitar.