Why Application Servers Crash and How to Avoid It
Too often, the success of a web site brings its doom. After a good publicity blitz, or a healthy growth of its popularity, the number of users visiting a site becomes very large. Then, at certain point, these users witness painfully slow response time, then one crash, then another, etc. The site administrators add some RAM, more machines, faster processors, and the crashes keep coming. At some point, rumors spread, and the users stop visiting.
Interesting web sites all use dynamic content: an application server drives the show and is helped by an RDBMS (relational database management system) to store the data. Web servers don't use a lot of static HTML pages anymore, which is too bad because when they did, they did not crash so much. Why? And, why does the application server crash? We will look at a little bit of queuing theory to analyze various scenarios that lead to web site crashes.
Queuing theory is about using math (statistics) to model the behavior of a server system. The server is not necessarily a computer. Queuing theory can be used to model the waiting queue at a bank teller or the flow of cars at the entrance of a bridge. It is also a good tool to predict the performance of telecom and computing systems.
To simplify things, we will use the single-server queue model with exponential service times. The words single server basically mean that the systems we will model have only one database server (the number of processors in the machine is not relevant here). Also, the phrase exponential service times means that the response time is random but follows a known mean average with a standard deviation equal to that average. We have to use even more queuing theory buzzwords now and say that we will use the M/M/1 model. We have already said that here we have a single-server (the 1 in M/M/1) and that we assume exponential service times (the second M). The first M indicates that we expect random arrivals following a Poisson distribution. With that all said, we can then use a few neat and simple formulas. First, we need to name a few variables:
[Ed. note: so that everyone may see them, the Greek letters below are presented in text form.]
<lambda> = average number of requests per second
s = average service time for each request
<rho> = utilization: fraction of time that the server is busy
q = average number of requests in the server (waiting and being processed)
tq = average time a request spends in the server (response time as perceived by the user)
w = average number of requests waiting in the server
The first formula is:
<rho> = <lambda> * s
With this formula, we can find the maximum number of requests that the server could process per second:
<lambda>max = 1/s
We will see that peculiar things happen when the number of requests per second gets close to this maximum, <lambda>max.
A few more formulas:
q = <rho> / (1 - <rho>)
tq = s/(1 - <rho>)
w = <rho> 2 / (1 - <rho>)
If we use these formulas, we can see why a web site can become very slow. Let's use queuing theory to model a system with two tiers on the server: the web server and the database server. Also, let's ignore the overhead of the HTTP server as the database server is orders of magnitude slower. Let's say that the average service time of the system is equal to that of the database server.
In Figure 1, we suppose that the database server has an average service time for each request of 50ms (s = 0.05 second). We vary the number of requests per second <lambda> and put it on the X axis. The Y axis has the perceived response time (tq).
We see in Figure 1 that the curve is almost flat and then rises suddenly. This happens when the utilization <rho> approaches 1. But, there is more to it than just slow response time when a server becomes more and more busy. RAM becomes a problem as soon there is an important peak in the number of requests.
For the sake of simplicity within queuing theory, we usually assume that the waiting queue has an infinite size. In practice, that is where ugly things can happen: queues do not have an infinite size because memory is not infinite. Whether you have 1GB or 8GB of RAM, you eventually run out of it.
The next graph (Figure 2) is similar to the first one, but on the Y axis we put the number of requests still in the server (q).
What happens is that when the number of requests coming in reaches the point where the utilization <rho> is almost 1, the number of requests waiting in RAM grows toward infinity. If you have 1,000 requests in RAM and each takes only 100KB (that is roughly 100MB), you are probably okay. With s = 0.05 second (50ms), q = 1000 when you have 71,928 requests per hour. When you have 71,993 requests per hour, you will have 10,284 requests waiting in RAM. That's about 1GB RAM. Let's say you have 2GB--with six more requests per hour, 71,998, you get q = 35,999. That's about 3.5GB. Theoretically, your users would have received a response after 30 minutes (1,800 seconds). But very few will get any response because the machine is swapping to get some virtual memory. Since this increases the service time by a factor of 100 or 1,000, your server appears to the world to have died.
There is a human factor that makes such response time crises even worse. The users will not wait patiently for 30 seconds. They will click Stop or press Esc and resubmit their requests. This will only add more requests to the system.
Table 1 shows the increase of q and tq for a server with a 50ms service time under increasing loads.
But, what would have happened if the server had used 1MB or 2MB per concurrent request? Table 2 shows also the usage of RAM if the per request footprint is 100KB, 500KB, 1MB or 2MB. If all requests really had a service near the average of 50ms (if the distribution was truly exponential), nothing much would happen. The server would start swapping a few seconds earlier perhaps. That makes little difference. The curve only becomes steep near the saturation point when the utilization <rho> approaches 1. But, there are scenarios (see the section Why the Server Crashes, Even at Moderate Loads) where reality does not follow the theoretical model. In those cases, RAM usage does make a difference.
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Roll your own dynamic dns
4 hours 41 min ago
- Please correct the URL for Salt Stack's web site
7 hours 52 min ago
- Android is Linux -- why no better inter-operation
10 hours 7 min ago
- Connecting Android device to desktop Linux via USB
10 hours 36 min ago
- Find new cell phone and tablet pc
11 hours 34 min ago
13 hours 3 min ago
- Automatically updating Guest Additions
14 hours 11 min ago
- I like your topic on android
14 hours 58 min ago
- This is the easiest tutorial
21 hours 34 min ago
- Ahh, the Koolaid.
1 day 3 hours ago