Serializing Web Application Requests
Web application servers are an extremely useful extension of the basic web server concept. Instead of presenting fairly simple static pages or the results of database queries, a complex application can be made available for access across the network. One problem with serving applications is that processing on the back end may take a significant amount of time and server resources—leading to slow response times or failures due to memory limitations when multiple users submit requests simultaneously.
There are essentially three basic strategies for handling web requests which cannot be satisfied immediately: ignore the issue, use unbuffered no-parsed-header (NPH) CGI code to emit “Processing” while the back end completes, or issue an immediate response which refers the user to a result page created upon job completion. In my experience, the first option is not effective. Without feedback, users invariably resubmit their requests thinking there was a failure in the submission. The redundant requests will exacerbate the problem if they aren't eliminated. To make matters worse, the number of these redundant requests will peak precisely at peak usage times. NPH CGI is most useful when the processing times are short and the server can handle many simultaneous instances of the application. It has the drawback that users must sit and wait for the processing to complete and cannot quickly refer back to the page. My preferred method is referral to a dynamic page, combined with a reliable method of serializing requests.
As an example, I will describe my use of Generic NQS (GNQS) (see http://www.shef.ac.uk/~nqs/ and http://www.gnqs.org) to perform serialization and duplicate job elimination in a robust fashion for a set of web application servers at the University of Washington Genome Center. GNQS is an Open Source queueing package available for Linux as well as a large number of other UNIX platforms. It was written primarily to optimize utilization of supercomputers and large server farms, but it is also useful on single machines as well. It is currently maintained by Stuart Herbert (S.Herbert@Sheffield.ac.uk).
At the genome center, we have developed a number of algorithms for the analysis of DNA sequence. Some of these algorithms are CPU- and memory-intensive and require access to large sequence databases. In addition to distributing the code, we have made several of these programs available via a web and e-mail server for scientists worldwide. Anyone with access to a browser can easily analyze their sequence without the need to have UNIX expertise on-site, and most importantly for our application, without maintaining a local copy of the database. Since the sequence databases are large and under continuing revision, maintaining copies can be a significant expense for small research institutions.
The site was initially implemented on a 200MHz Pentium pro with 128MB of memory, running Red Hat 4.2 and Apache, which was more than adequate for the bulk of the processing requests. Most submissions to our site could be processed in a few seconds, but when several large requests were made concurrently, response times became unacceptable. As the number of requests and data sizes increased, the server was frequently being overwhelmed. We considered reducing the maximum size problem that we would accept, but we knew that, as the Human Genome Project advanced, larger data sets would become increasingly common. After analyzing the usage logs, it became apparent that, during peak periods, people were submitting multiple copies of requests when the server didn't return results quickly. I was faced with this performance problem shortly after our web site went on-line.
Listing 1. Sample GNQS Commands
Instead of increasing the size of the web server, I felt that robust serialization would solve the problem. I installed GNQS version 3.50.2 on the server and wrote small extensions to the CGI scripts to queue the larger requests, instead of running them immediately. Instead of resorting to NPH CGI scripts which would lock up a user's web page for several minutes while the web server processed, I could write a temporary page containing a message that the server was still processing and instructions to reload the page later. By creating a name for the dynamic page from an md5 sum of the request parameters and data, I was able to completely eliminate the problem of multiple identical requests. Finally, all web requests were serialized in a single job queue, and an additional low priority queue was used for e-mail requests. It was a minor enhancement to allow requests submitted to the web server for responses via e-mail to simply be queued into the low priority e-mail queue. Consequently, processor utilization was increased and job contention was reduced.
While this proved quite effective from a machine utilization standpoint, the job queue would get so long during peak periods that users grew impatient. An additional enhancement was made which reported the queue length when the request was initially queued. This gave users a more accurate expectation about completion time. Additionally, when a queued job was resubmitted, the current position in the queue would now be displayed. These changes completely eliminated erroneous inquiries regarding the status of the web server.
After over a year of operation, we had an additional application to release and decided to migrate the server to a Linux/Alpha system running Red Hat 5.0. The switch to glibc exposed a bug in GNQS that was initially difficult to find. However, since the source code was available, I was able to find and fix the problem myself. I have since submitted the patch to Stuart for inclusion in the next release of GNQS and contributed a source RPM (ftp://ftp.redhat.com/pub/contrib/SRPMS/Generic-NQS-3.50.4-1.src.rpm) to the Red Hat FTP site.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- What's the tweeting protocol?
- New Products
- RSS Feeds
- Readers' Choice Awards
- Dart: a New Web Programming Experience
Enter to Win an Adafruit Prototyping Pi Plate Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Prototyping Pi Plate Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- Next winner announced on 5-21-13!
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.




13 hours 39 min ago
16 hours 11 min ago
17 hours 28 min ago
18 hours 3 min ago
18 hours 26 min ago
23 hours 14 min ago
1 day 1 min ago
1 day 1 hour ago
1 day 3 hours ago
1 day 5 hours ago