Stress Testing an Apache Application Server in a Real World Environment
The actual testing of the systems is similar to a ballet in terms of the level of choreography necessary to make everything run in concert. The single most essential step is time synchronization. Having all machine clocks set as close as possible to one another is imperative; without this simple step, it is impossible to correlate actions with events. Setting accurate time across our testing network should be our first task in beginning testing.
The second task for testing should be to create our flood configuration file. There are many ways to create the flood configuration file, but one method that produces usable results is to parse the production Web server's access logs. A simple Perl script can parse the log file and write out the XML configuration file that flood expects. This method also is one of the easiest ways to create scripted sessions that reflect how real visitors actually use the system.
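As an illustration of the log-parsing idea, here is a minimal sketch in Python rather than Perl. It assumes the access log is in Common or Combined Log Format and that the flood configuration lists targets as url elements inside a urllist element, as in the example configurations shipped with flood; check those examples before relying on the element names. The function name, the sample log lines and the host testserver.example are invented for the illustration, and the sketch produces only the URL list, not a complete configuration.

```python
import re
from xml.sax.saxutils import escape

# Matches the request and status portion of a Common/Combined Log Format
# line, e.g.  "GET /index.html HTTP/1.1" 200
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')

def log_to_urllist(lines, base_url, name="production replay"):
    """Build a <urllist> fragment (assumed flood layout) from log lines."""
    out = ["<urllist>", f"  <name>{escape(name)}</name>"]
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # malformed line, or not a GET request
        path, status = match.groups()
        if not status.startswith("2"):
            continue  # replay only requests that succeeded in production
        out.append(f"  <url>{escape(base_url + path)}</url>")
    out.append("</urllist>")
    return "\n".join(out)

# Invented sample lines, for illustration only.
sample = [
    '10.0.0.5 - - [12/Mar/2004:10:01:02 -0500] "GET /index.php?id=3&x=1 HTTP/1.1" 200 5120',
    '10.0.0.5 - - [12/Mar/2004:10:01:03 -0500] "POST /login.php HTTP/1.1" 302 0',
    '10.0.0.6 - - [12/Mar/2004:10:01:04 -0500] "GET /missing HTTP/1.0" 404 210',
]
print(log_to_urllist(sample, "http://testserver.example"))
```

The sketch keeps only successful GET requests, because replaying POSTs or failed requests against a test system usually needs extra handling that depends on the application.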
The third task we need to perform is setting up the system monitoring on the application and database servers. As described above, we use the sysstat program to monitor the system's production environments. The program sadc is the back-end process for collecting the data, and the simplest form for setting up the sysstat monitoring is:
/usr/lib/sa/sadc 1 [# of seconds to report] outfile
As is probably obvious, it is important to capture enough data to encompass the entire duration of the testing. The above command should be started on both testing systems used to serve data, that is, the Web and database servers.
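To make sure the capture outlasts the test, the sample count can be worked out from the planned test length. The following is a small sketch under stated assumptions: the helper name, the 25% safety margin and the output filename are invented here, and the path to sadc is the one shown above, which may differ on other distributions.

```python
import math

def sadc_command(test_minutes, interval_seconds=1, margin=0.25,
                 outfile="sadc.out", sadc_path="/usr/lib/sa/sadc"):
    """Return the sadc argument list for a run that outlasts the test.

    sadc takes an interval in seconds, a sample count and an output file,
    so the count is the padded test length divided by the interval.
    """
    total_seconds = test_minutes * 60 * (1 + margin)
    count = math.ceil(total_seconds / interval_seconds)
    return [sadc_path, str(interval_seconds), str(count), outfile]

# A 30-minute test sampled once per second, padded by the assumed margin.
print(" ".join(sadc_command(30)))
```

The returned list can be passed to a process launcher or simply printed and pasted into a shell on each monitored server.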
The fourth step should be to start up tcpdump monitoring on the traffic generation/monitoring system. The easiest way to do this is to issue the command tcpdump -w outfile. This command outputs all network data to the outfile specified in a format easily loadable into an analysis tool, such as Ethereal.
Now that all of our monitoring is set up and running on the appropriate systems, the last part is to begin actual traffic generation by starting flood. In this stage, it is a good idea to start slowly with little traffic and increase the volume at a consistent pace until the limits of the server are reached.
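The ramp-up can be planned ahead of time so that every round adds the same amount of load. Here is a minimal sketch of such a schedule; the function name and the example numbers are invented, and how each level maps onto flood's configuration is left to the reader.

```python
def ramp_schedule(start, step, maximum):
    """Yield concurrent-client counts from start to maximum in equal steps."""
    if start < 1 or step < 1 or maximum < start:
        raise ValueError("need start >= 1, step >= 1 and maximum >= start")
    level = start
    while level < maximum:
        yield level
        level += step
    yield maximum  # always finish exactly at the ceiling

# Start with 5 simulated clients and add 10 per round, up to 40.
for clients in ramp_schedule(5, 10, 40):
    print(f"run a round with {clients} clients")
```

In practice the loop would stop early, at the first round where failures or unacceptable response times appear, because that round marks the limit being looked for.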
In the previous two sections, we looked at the setup and the actual testing on our test network, but we have not yet looked at the data our software generates. For sample data, please see Listing 2.
To utilize the generated data, we go back to our old friend, Perl. For both tcpdump and flood, the individual data is measured in utime (microseconds) and easily can be compared and analyzed based on the reported times. Each line of the raw flood report contains these fields:
Absolute time started.
Relative time (to first column) to open the socket.
Relative time to write to the socket.
Relative time to read the socket.
Relative time to close the socket.
OK or FAIL notification.
The thread or PID of the farmer making the request.
The URL of the target without query strings.
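The field list above maps naturally onto a small record type. The sketch below assumes the fields appear in the order listed and are separated by whitespace, which should be checked against real flood output; the class, the function and the sample line are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class FloodRecord:
    start_us: int   # absolute start time
    open_us: int    # relative time to open the socket
    write_us: int   # relative time to write to the socket
    read_us: int    # relative time to read the socket
    close_us: int   # relative time to close the socket
    ok: bool        # True for OK, False for FAIL
    worker: str     # thread or PID of the farmer
    url: str        # target URL without query string

def parse_flood_line(line):
    """Split one raw report line into a FloodRecord (assumed layout)."""
    parts = line.split()
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    start, opened, written, read, closed = (int(p) for p in parts[:5])
    return FloodRecord(start, opened, written, read, closed,
                       parts[5] == "OK", parts[6], parts[7])

# Invented sample line, not taken from a real report.
sample = "1079107262123456 1200 1350 48200 48310 OK 16384 http://testserver.example/index.php"
print(parse_flood_line(sample))
```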
Some of the example report-processing scripts included in the flood source turn the raw data into simple yet readable output. Either by using these scripts as they are or by using them as a starting point to build a different report, it is possible to glean some essential data from the flood report. One method I have used to identify trends in the data quickly is to run the raw flood output through a Perl script that translates the utime values to a more "readable" number by dividing them by one million. This modified output then is passed to a GNUplot script (see Listing 3), which creates a nice graphic where trends can be seen at a glance. It then is trivial to match up which offending activity happened at what time and to see across the entire network what was going on with all systems at that moment. Once the offending activity is determined, it is quite possible to adjust the systems to correct the problem and then retest using the same method.
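The divide-by-one-million step might look like the following, written here in Python instead of Perl. This is a sketch, not the script used for the article: it assumes whitespace-separated report lines with eight fields, and it treats the relative close time as the total duration of the request, which is an interpretation of the report rather than something the report states. The function name and the sample line are invented.

```python
def to_plot_rows(raw_lines):
    """Convert raw flood lines to 'epoch_seconds duration_seconds status'."""
    rows = []
    for line in raw_lines:
        parts = line.split()
        if len(parts) != 8:
            continue  # skip blank or unexpected lines
        start_us, close_us = int(parts[0]), int(parts[4])
        # Integer arithmetic keeps all six fractional digits exact.
        start_s, start_frac = divmod(start_us, 1_000_000)
        dur_s, dur_frac = divmod(close_us, 1_000_000)
        rows.append(f"{start_s}.{start_frac:06d} {dur_s}.{dur_frac:06d} {parts[5]}")
    return rows

# Invented sample line, not taken from a real report.
raw = ["1079107262123456 1200 1350 48200 48310 OK 16384 http://testserver.example/index.php"]
for row in to_plot_rows(raw):
    print(row)
```

Keeping the start time as absolute seconds, rather than as an offset from the first request, makes it easier to line the rows up against timestamps from the other systems on the test network.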
The last item I want to address is the tcpdump data. The easiest method of working with tcpdump files is to use Ethereal. Ethereal is a graphical interface that loads all of the tcpdump data into an easy-to-read format. Its best feature, however, is its ability to trace or follow an individual connection--very handy in tracking down problematic connections.
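When only the packet times are needed, for example to line them up against flood's microsecond timestamps, the capture file can also be read directly. The sketch below assumes the classic libpcap file format with microsecond timestamps, which is what tcpdump -w traditionally writes; the nanosecond variant and the newer pcapng format are not handled. The function name and the tiny in-memory capture are invented for the demonstration.

```python
import io
import struct

def pcap_timestamps(stream):
    """Yield each packet's capture time in microseconds since the epoch."""
    header = stream.read(24)  # fixed-size global file header
    if len(header) < 24:
        raise ValueError("file too short to be a pcap capture")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            return
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured packet bytes
        yield ts_sec * 1_000_000 + ts_usec

# Build a one-packet capture in memory to exercise the reader.
demo = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
demo += struct.pack("<IIII", 1079107262, 123456, 3, 3) + b"abc"
print(list(pcap_timestamps(io.BytesIO(demo))))
```

For a real capture, open the file in binary mode and pass the file object to the function.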
Every Web-based application is different from every other one, and no two pieces of hardware are exactly the same when running these types of applications. It is difficult to say exactly where problems might arise or where things might break. Stress testing requires an intimate knowledge of the software, the systems and the network that encompass the operating environment. These are the truisms of this type of activity, and although the challenges and learning curve are daunting, it is well worth the effort.
Stress testing requires a degree of patience, as rushing the testing can result in collecting bad data and/or ambiguous results. Always take the time to understand fully the results of the previous test before continuing on to the next round.
Drawing on my own experience in these types of tests and the resulting system tuning, I have reached these conclusions about dynamic Web-based application performance. Whenever possible:
Separate the application from the db.
Use as many diverse data channels as possible (i.e., separate drives for data and system on separate channels or controllers).
Use as good a machine as is practical.
Databases are memory hungry--feed them.
Understand relational database theory and the five normal forms.
Understand good development practices and follow them.
RAID 5 sounds like a great idea until a database lives on it and that database liberally uses INSERTS and UPDATES. If you need hardware redundancy, there are more database-friendly ways to accomplish it.
It may sound like a great idea to put lots of XML into a db and let the front end parse it out, but think again.
Remember that your servers can communicate only as fast as the network goes. Use good networking components and cables.
I have given you a brief overview of how to stress test Web application systems, as well as some of the tools to use. Now it's your turn to set up everything and use what you have learned. Remember to be creative and don't be afraid to hunt down new or better tools to do the job. The better your information, the better you can understand how to answer the questions listed at the beginning of this article.
Brad Bartram is a network administrator for Dyrect Media Group in Buffalo, New York. In his spare time he plays and teaches guitar.