Stress Testing an Apache Application Server in a Real World Environment

Testing procedures and hints so you can find out how much traffic your Web application system can support.
Testing

The actual testing of the systems resembles a ballet in the level of choreography needed to make everything run in concert. The single most essential prerequisite is time synchronization. Having all machine clocks set as close as possible to one another is imperative; without this simple step, it is impossible to correlate actions with events. Setting accurate time across our testing network should be our first task in beginning testing.
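A minimal way to accomplish this is to step every clock from the same NTP server immediately before the run begins. Running the ntpdate utility on each machine in the test network is enough; the server name below is a placeholder, so substitute your own:

     ntpdate ntp.example.com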

The second task is to create our flood configuration file. There are many ways to create it, but one method that produces usable results is to parse the production Web server's access logs. A simple Perl script can parse the log file and output the correct format, an XML configuration file. This also is one of the easiest ways to create scripted sessions that reflect the way real visitors actually use the system.
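As a rough sketch of the idea, the following script reads a combined-format access log on standard input and emits a <urllist> block that can be pasted into one of the sample configurations shipped in the flood source. The element names follow flood's bundled round-robin example and the hostname is a placeholder, so adjust both to match your flood version and test server:

     #!/usr/bin/perl -w
     # access2urllist.pl -- build a flood <urllist> from an Apache access log.
     # Usage: access2urllist.pl < access_log > urllist.xml
     use strict;

     my $host = "http://www.example.com";   # placeholder test server

     print qq{<urllist>\n};
     print qq{  <name>fromlog</name>\n};
     print qq{  <description>URLs taken from the access log</description>\n};

     while (<STDIN>) {
         # The request is the first quoted field of a combined-format line,
         # e.g. "GET /index.html HTTP/1.0"
         next unless m{"(?:GET|HEAD)\s+(\S+)\s+HTTP};
         print qq{  <url>$host$1</url>\n};
     }

     print qq{</urllist>\n};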

The third task is to set up system monitoring on the application and database servers. As described above, we use the sysstat package, the same tool that monitors our production environments. The sadc program is the back-end process that collects the data, and the simplest form for setting up sysstat monitoring is:

     /usr/lib/sa/sadc 1 [# of seconds to report] outfile

As is probably obvious, it is important to capture enough data to encompass the entire duration of the testing. The above command should be started on both testing systems used to serve data, for example, the Web and database servers.
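For example, to record one sample per second for an hour-long run, the invocation might look like the line below; the duration and output path are only an illustration:

     /usr/lib/sa/sadc 1 3600 /tmp/webserver.sa &

When the test is over, the same package's sar command reads the file back: sar -u -f /tmp/webserver.sa reports the CPU utilization captured during the run, and other sar options summarize memory, I/O and network activity from the same file.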

The fourth step should be to start up tcpdump monitoring on the traffic generation/monitoring system. The easiest way to do this is to issue the command tcpdump -w outfile. This command outputs all network data to the outfile specified in a format easily loadable into an analysis tool, such as Ethereal.
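If the capture file grows too quickly, the capture can be narrowed. A variation such as the following, which assumes the test traffic is plain HTTP arriving on interface eth0, records complete packets for port 80 only; the interface name is a placeholder:

     tcpdump -i eth0 -s 0 -w outfile port 80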

Now that all of our monitoring is set up and running on the appropriate systems, the last step is to begin actual traffic generation by starting flood. At this stage, it is a good idea to start slowly with a little traffic and increase the volume at a consistent pace until the limits of the server are reached.
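One low-tech way to pace the ramp-up is to prepare several copies of the configuration file that differ only in how many farmers are started (assuming your configuration controls this with the count attribute of usefarmer, as flood's bundled examples do) and run them back to back, saving each report; the filenames here are placeholders:

     ./flood config-05.xml > run-05.out
     ./flood config-10.xml > run-10.out
     ./flood config-25.xml > run-25.out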

In the previous two sections, we looked at the setup and the actual testing on our test network, but we have not yet looked at the data the software generates. For sample data, please see Listing 2.

Listing 2. Sample Data from Testing Software

To utilize the generated data, we go back to our old friend, Perl. For both tcpdump and flood, individual measurements are recorded in utime, or microseconds, and easily can be compared and analyzed based on the reported times. The raw output of the flood report is:

     Absolute time started
     relative time (to first column) to open the socket
     relative time to write to the socket
     relative time to read the socket
     relative time to close the socket
     OK or FAIL notification
     the thread or PID of the farmer making the request
     the URL of the target without query strings

Some of the example report-processing scripts included in the flood source turn the raw data into simple but readable output. Either by using these scripts as they are or by using them as a starting point for a different report, it is possible to glean some essential data from the flood report. One method I have used to identify trends in the data quickly is to run the raw flood output through a Perl script that translates the utime values into more readable numbers by dividing them by one million; a sketch of such a filter follows this paragraph. The modified output then is passed to a GNUplot script (see Listing 3), which creates a nice graphic where trends can be seen at a glance. It then is trivial to match up which offending activity happened at what time and to see what was going on with every system on the network at that moment. Once the offending activity is determined, it is quite possible to adjust the systems to correct the problem and retest using the same method.
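The conversion step can be as small as the following Perl filter. It is only a sketch and assumes the eight whitespace-separated fields described above:

     #!/usr/bin/perl -w
     # utime2sec.pl -- scale flood's microsecond timings down to seconds.
     # Usage: utime2sec.pl < flood-report > plot.dat
     use strict;

     while (<STDIN>) {
         my @f = split;
         next unless @f >= 8;    # skip anything that is not a report line
         # The first five fields are the absolute start time and the
         # relative open, write, read and close times, in microseconds.
         $f[$_] /= 1_000_000 for 0 .. 4;
         printf "%.6f %.6f %.6f %.6f %.6f %s %s %s\n", @f[0 .. 7];
     }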

Listing 3. Modified Output Sent to GNUplot Script

The last item I want to address is the tcpdump data. The easiest method of working with tcpdump files is to use Ethereal. Ethereal is a graphical interface that loads all of the tcpdump data into an easy-to-read format. Its best feature, however, is its ability to trace or follow an individual connection--very handy in tracking down problematic connections.
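When a raw capture is too large to load comfortably, tcpdump itself can carve out the interesting traffic before the file ever reaches Ethereal. For example, to extract only the packets exchanged with the Web server under test, where the address is a placeholder:

     tcpdump -r outfile -w webonly.pcap host 192.168.1.10 and port 80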

Conclusions and Recommendations

Every Web-based application is different from every other one, and no two pieces of hardware are exactly the same when running these types of applications. It is difficult to say exactly where problems might arise or where things might break. Stress testing requires an intimate knowledge of the software, the systems and the network that make up the operating environment. These are the truisms of this type of activity, and although the challenges and the learning curve are daunting, it is well worth the effort.

Stress testing requires a degree of patience, as rushing the tests can yield bad data or ambiguous results. Always take the time to understand fully the results of the previous test before continuing on to the next round.

Drawing on my own experience in these types of tests and the resulting system tuning, I have reached these conclusions about dynamic Web-based application performance. Whenever possible:

  • Separate the application from the db.

  • Use as many diverse data channels as possible (i.e., separate drives for data and system on separate channels or controllers).

  • Use as good a machine as is practical.

  • Databases are memory hungry--feed them.

  • Understand relational database theory and the five normal forms.

  • Understand good development practices and follow them.

  • RAID 5 sounds like a great idea until a database lives on it and that database liberally uses INSERTS and UPDATES. If you need hardware redundancy, there are more database-friendly ways to accomplish it.

  • Putting lots of XML into a db and letting the front end parse it out sounds like a great idea; think again.

  • Remember that your servers can communicate only as fast as the network goes. Use good networking components and cables.

I have given you a brief overview of how to stress test Web application systems, as well as some of the tools to use. Now it's your turn to set up everything and use what you have learned. Remember to be creative and don't be afraid to hunt down new or better tools to do the job. The better your information, the better you can understand how to answer the questions listed at the beginning of this article.

Brad Bartram is a network administrator for Dyrect Media Group in Buffalo, New York. In his spare time he plays and teaches guitar.

______________________

Comments



Re: Stress Testing an Apache Application Server in a Real World


Please, I cannot run this command:

/usr/lib/sa/sadc 1 [# of seconds to report] outfile

Where can I get sadc? Please, anybody.

Thanks.

Re: Stress Testing an Apache Application Server in a Real World


You can get the download links at the author's site: http://perso.wanadoo.fr/sebastien.godard/


Re: Stress Testing an Apache Application Server in a Real World


Hi,

I'm trying to GET a URL protected by a user name and password, and I can't find any information about the syntax I need. I tried to simulate a direct telnet session to a Web server, but it doesn't work:

telnet [host] 80
GET http://host/index.html HTTP/1.1
Authorization: Basic [code]
Host: [host]

Any ideas?
Thanks.

Re: Stress Testing an Apache Application Server in a Real World


If it's standard Basic auth that we're talking about, then the Authorization header is just the user name and password concatenated with a colon in between and then Base64-encoded, so:

perl -MMIME::Base64 -e 'print encode_base64("Aladdin:open sesame") . "\n";'

will print the value QWxhZGRpbjpvcGVuIHNlc2FtZQ==, which you would then put in your Authorization header:

telnet [host] 80
GET /protected_document.html HTTP/1.0
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

[server response should come here]

Re: Stress Testing an Apache Application Server in a Real World


Thank you for your answer, but maybe I didn't express myself very well. I want the flood XML syntax to simulate these commands, that is, how to pass user and password information during the stress testing.

Thanks.

Typo


./flood configuration-file *gt; output.file

That should have been > which would have given >, i.e.:

./flood configuration-file > output.file

Re: Typo


I guess you meant

"That should have been > which would have given > ..."

Isn't not previewing a *****? ;-)

you mean: this should be


you mean:
this should be &gt; which would give > ...
