Three Ways to Web Server Concurrency
Multiprocessing, multithreading and evented I/O: the trade-offs in Web servers.
A Web server needs to support concurrency. The server should service clients in a timely, fair manner to ensure that no client starves because some other client causes the server to hang. Multiprocessing and multithreading, and hybrids of these, are traditional ways to achieve concurrency. Node.js represents another way, one based on system libraries for asynchronous I/O, such as epoll (Linux) and kqueue (FreeBSD). To highlight the trade-offs among the approaches, I have three echo servers written in close-to-the-metal C: a forking_server, a threading_server and a polling_server.
Shared Code
The Web servers use utils.c (Listing 1). The function error_msg prints messages and optionally terminates the server; announce_client dumps information about a connection; and generate_echo_response creates a syntactically correct HTTP response.
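The listings include utils.h, which the article does not reproduce. The following is a minimal sketch of what such a header might contain, not the author's actual file. BACKLOG is 12 according to a comment in the article; the BUFF_SIZE value is an assumption made for illustration, and the prototypes are inferred from utils.c.

```c
/* utils.h: hypothetical reconstruction; the original header is not shown. */
#ifndef UTILS_H
#define UTILS_H

#include <stdbool.h>     /* bool, true, false */
#include <netinet/in.h>  /* struct in_addr */

#define BACKLOG   12     /* pending-connection queue length (from the article) */
#define BUFF_SIZE 1024   /* assumed value; the article does not state it */

void error_msg(const char* msg, bool halt_flag);
int create_server_socket(bool non_blocking);
void announce_client(struct in_addr* addr);
void generate_echo_response(char request[ ], char response[ ]);

#endif /* UTILS_H */
```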
Listing 1. utils.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>     /* bzero */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>   /* inet_ntop */
#include <fcntl.h>
#include "utils.h"
void error_msg(const char* msg, bool halt_flag) {
perror(msg);
if (halt_flag) exit(-1);
}
/* listening socket */
int create_server_socket(bool non_blocking) {
/* Modify as needed. */
const int port = 3000;
struct sockaddr_in server_addr;
/* create, bind, listen */
int sock = socket(AF_INET, /* family */
SOCK_STREAM, /* TCP */
0);
if (sock < 0) error_msg("Problem with socket call", true);
/* non-blocking? */
if (non_blocking) fcntl(sock, F_SETFL, O_NONBLOCK);
/* bind */
bzero(&server_addr, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_addr.s_addr = INADDR_ANY;
server_addr.sin_port = htons(port); /* host to network endian */
if (bind(sock, (struct sockaddr*) &server_addr,
sizeof(server_addr)) < 0)
error_msg("Problem with bind call", true);
/* listen */
fprintf(stderr, "Listening for requests on port %i...\n", port);
if (listen(sock, BACKLOG) < 0)
error_msg("Problem with listen call", true);
return sock;
}
void announce_client(struct in_addr* addr) {
char buffer[BUFF_SIZE + 1];
inet_ntop(AF_INET, addr, buffer, sizeof(buffer));
fprintf(stderr, "Client connected from %s...\n", buffer);
}
void generate_echo_response(char request[ ], char response[ ]) {
/* HTTP header lines end with CRLF; a blank line ends the header block */
strcpy(response, "HTTP/1.1 200 OK\r\n");
strcat(response, "Content-Type: text/plain\r\n");
strcat(response, "Accept-Ranges: bytes\r\n");
strcat(response, "Connection: close\r\n\r\n");
strcat(response, request);
}
The central function is create_server_socket, which creates a blocking or a nonblocking listening socket. This function invokes three system functions:
- socket: create socket.
- bind: set port.
- listen: await connections.
The first call creates a TCP-based socket, and the bind call then specifies the port number at which the Web server awaits connections. The listen call marks the socket as a listening socket and asks the kernel to queue up to BACKLOG pending connections:
if (listen(sock, BACKLOG) < 0) /* BACKLOG == 12 */
error_msg("Problem with listen call", true);
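The role of the backlog can be seen in a small experiment: once listen has been called, the kernel completes TCP handshakes and queues the connections even though the program has not yet called accept. The following is a minimal sketch, assuming a loopback interface is available. The names queue_without_accept, DEMO_BACKLOG and MAX_CLIENTS are invented for this illustration, and how strictly the queue limit is enforced varies by operating system.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum { DEMO_BACKLOG = 12, MAX_CLIENTS = 8 };

/* Connect `how_many` loopback clients to a listening socket that never
   calls accept; return how many connect calls succeeded, or -1 on error. */
int queue_without_accept(int how_many) {
  assert(how_many >= 0 && how_many <= MAX_CLIENTS);

  struct sockaddr_in addr;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;                     /* let the kernel pick a free port */

  int listener = socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0) return -1;

  /* bind, listen, then ask which port the kernel assigned */
  socklen_t len = sizeof(addr);
  if (bind(listener, (struct sockaddr*) &addr, sizeof(addr)) < 0 ||
      listen(listener, DEMO_BACKLOG) < 0 ||
      getsockname(listener, (struct sockaddr*) &addr, &len) < 0) {
    close(listener);
    return -1;
  }

  /* The listener never calls accept, yet each connect can complete
     because the kernel queues the connection. */
  int clients[MAX_CLIENTS];
  int connected = 0;
  for (int i = 0; i < how_many; i++) {
    clients[i] = socket(AF_INET, SOCK_STREAM, 0);
    if (clients[i] >= 0 &&
        connect(clients[i], (struct sockaddr*) &addr, sizeof(addr)) == 0)
      connected++;
  }

  for (int i = 0; i < how_many; i++)
    if (clients[i] >= 0) close(clients[i]);
  close(listener);
  return connected;
}
```

Calling queue_without_accept(3) connects three clients that are never accepted; with a backlog of 12, all three connects would be expected to succeed.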
A Multiprocessing Server
The forking_server in Listing 2 supports concurrency through multiprocessing, an approach that early Web servers, such as Apache 1, used to launch Web applications written as, for example, C programs or Perl scripts. Separate processes handle separate connections. This approach is hardly outdated, although modern servers, such as Apache 2, typically combine multiprocessing and multithreading.
Listing 2. forking_server.c
#include <stdio.h>
#include <stdlib.h>
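#include <string.h>
#include <strings.h>     /* bzero */
#include <unistd.h>      /* fork, close */
#include <sys/socket.h>  /* accept, recv, send */
#include <netinet/in.h>
#include <signal.h>
#include "utils.h"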
int main() {
/* Avoid zombies. */
signal(SIGCHLD, SIG_IGN);
char buffer[BUFF_SIZE + 1];
struct sockaddr_in client_addr;
socklen_t len = sizeof(struct sockaddr_in);
/* listening socket */
int sock = create_server_socket(false);
/* connections + requests */
while (true) {
int client = accept(sock,
(struct sockaddr*) &client_addr,
&len);
if (client < 0) error_msg("Problem with accept call", true);
announce_client(&client_addr.sin_addr);
/* fork child */
pid_t pid = fork();
if (pid < 0) error_msg("Problem with fork call", false);
/* 0 to child, child's PID to parent */
if (0 == pid) { /** child **/
close(sock); /* child's listening socket */
/* request */
bzero(buffer, sizeof(buffer));
/* read at most sizeof(buffer) - 1 bytes so the buffer stays NUL-terminated */
int bytes_read = recv(client, buffer, sizeof(buffer) - 1, 0);
if (bytes_read < 0) error_msg("Problem with recv call", false);
/* response */
char response[BUFF_SIZE * 2];
bzero(response, sizeof(response));
generate_echo_response(buffer, response);
int bytes_written = send(client, response, strlen(response), 0);
if (bytes_written < 0) error_msg("Problem with send call", false);
close(client);
exit(0); /* terminate */
}
else /** parent **/
close(client); /* parent's read/write socket. */
}
return 0;
}
The forking_server divides the labor among a parent process and as many
child processes as there are connected clients. A client is active until
the connection closes, which ends the session.
The parent process executes main from the first instruction. The parent listens for connections and per connection:
- Spawns a new process to handle the connection.
- Resumes listening for other connections.
The following is the critical code segment:
pid_t pid = fork(); /* spawn child */
if (0 == pid) { /* child */
close(sock); /* close inherited listening socket */
/* handle request and terminate */
...
}
else /* parent */
close(client); /* close client, resume listening */
The parent executes the call to fork. If the call succeeds, fork returns a non-negative integer: 0 to the forked child process and the child's process identifier to the parent. The child inherits the parent's open socket descriptors, which explains the if-else construct:
- if clause: the child closes its copy of the listening socket because accepting clients is the parent's job. The child handles the client's request and then terminates with a call to exit.
- else clause: the parent closes the client socket because a forked child handles the client. The parent resumes listening for connections.
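The two return values of fork can be observed in isolation from any socket code. The following is a minimal sketch; run_child is a name invented here. Unlike the forking_server, which ignores SIGCHLD, this sketch calls waitpid so that the parent can read the child's exit status.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the exit status that
   the parent collects, or -1 on failure. */
int run_child(int code) {
  assert(code >= 0 && code <= 255);

  pid_t pid = fork();
  if (pid < 0) return -1;        /* fork failed: no child exists */

  if (0 == pid)                  /* child: fork returned 0 */
    _exit(code);                 /* terminate; never returns to the caller */

  /* parent: fork returned the child's PID */
  int status;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The same function body runs in both processes after fork; only the value of pid tells each process which role it plays.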
Creating and destroying processes is expensive. Modules such as FastCGI remedy this inefficiency through pre-forking: at startup, FastCGI creates a pool of reusable client-handling processes.
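The pre-forking idea can be sketched in a few dozen lines. This is an illustration of the general technique, not FastCGI's implementation: the names prefork_demo, worker_loop and POOL_SIZE are invented here, the "request" is a single byte echoed back, and the function plays the role of the clients itself over the loopback interface.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

enum { POOL_SIZE = 3 };

/* Each pooled worker loops forever, accepting on the shared listener. */
static void worker_loop(int listener) {
  for (;;) {
    int client = accept(listener, NULL, NULL);
    if (client < 0) _exit(1);
    char byte;
    if (recv(client, &byte, 1, 0) == 1) send(client, &byte, 1, 0);
    close(client);
  }
}

/* Pre-fork POOL_SIZE workers, send `n_clients` one-byte requests and
   return how many were echoed back correctly, or -1 on setup failure. */
int prefork_demo(int n_clients) {
  assert(n_clients >= 0);

  struct sockaddr_in addr;
  socklen_t len = sizeof(addr);
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;                       /* kernel picks a free port */

  int listener = socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0) return -1;
  if (bind(listener, (struct sockaddr*) &addr, sizeof(addr)) < 0 ||
      listen(listener, 12) < 0 ||
      getsockname(listener, (struct sockaddr*) &addr, &len) < 0) {
    close(listener);
    return -1;
  }

  /* Create the whole pool at startup, before any client arrives. */
  pid_t pool[POOL_SIZE];
  int started = 0;
  for (int i = 0; i < POOL_SIZE; i++) {
    pid_t pid = fork();
    if (0 == pid) worker_loop(listener);   /* child never returns */
    if (pid > 0) pool[started++] = pid;
  }

  /* Act as the clients: an already-running worker serves each request,
     so no process is created per connection. */
  int echoed = 0;
  for (int i = 0; started > 0 && i < n_clients; i++) {
    char out = (char) ('a' + i % 26), in = 0;
    int c = socket(AF_INET, SOCK_STREAM, 0);
    if (c < 0) continue;
    if (connect(c, (struct sockaddr*) &addr, sizeof(addr)) == 0 &&
        send(c, &out, 1, 0) == 1 &&
        recv(c, &in, 1, 0) == 1 && in == out)
      echoed++;
    close(c);
  }

  /* Shut the pool down and reap the workers. */
  for (int i = 0; i < started; i++) {
    kill(pool[i], SIGTERM);
    waitpid(pool[i], NULL, 0);
  }
  close(listener);
  return echoed;
}
```

With six clients and three workers, at least one worker must serve more than one client, which is the reuse that pre-forking is meant to provide.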
An inefficiency remains, however. When one process preempts another, a context switch occurs, and the system must work to ensure that the switched-in and switched-out processes behave properly. The kernel maintains per-process context information so that a preempted process can restart. The context's three main structures are:
- The page table: maps virtual addresses to physical ones.
- The process table: stores vital per-process information, such as the process identifier and scheduling state.
- The file table: tracks the process's open files.
The CPU cycles that the system spends on context switches cannot be spent on applications such as Web servers. Although measuring the latency of a context switch is nontrivial, 5ms–10ms per switch is a ballpark and even optimistic range. Pre-forking mitigates the inefficiency of process creation and destruction but does not eliminate context switching.
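One common way to get a rough feel for switching cost is to bounce a byte between two processes over a pair of pipes and time the round trips. The following is a minimal sketch; pingpong_usec is a name invented here. The figure it returns is only an estimate: each round trip includes the cost of the pipe system calls, and on a multicore machine the two processes may run on different cores, so the result does not isolate the context switch itself and varies from system to system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Bounce one byte between parent and child `rounds` times; return the
   mean round-trip time in microseconds, or -1.0 on failure. */
double pingpong_usec(int rounds) {
  assert(rounds > 0);

  int to_child[2], to_parent[2];
  if (pipe(to_child) < 0) return -1.0;
  if (pipe(to_parent) < 0) {
    close(to_child[0]); close(to_child[1]);
    return -1.0;
  }

  char byte = 'x';
  pid_t pid = fork();
  if (pid < 0) return -1.0;

  if (0 == pid) {                        /* child: echo each byte back */
    close(to_child[1]); close(to_parent[0]);
    for (int i = 0; i < rounds; i++)
      if (read(to_child[0], &byte, 1) != 1 ||
          write(to_parent[1], &byte, 1) != 1)
        _exit(1);
    _exit(0);
  }

  /* parent: keep only the ends it uses */
  close(to_child[0]); close(to_parent[1]);

  struct timespec start, stop;
  int ok = 1;
  clock_gettime(CLOCK_MONOTONIC, &start);
  for (int i = 0; i < rounds && ok; i++)
    ok = write(to_child[1], &byte, 1) == 1 &&
         read(to_parent[0], &byte, 1) == 1;
  clock_gettime(CLOCK_MONOTONIC, &stop);

  close(to_child[1]); close(to_parent[0]);
  waitpid(pid, NULL, 0);
  if (!ok) return -1.0;

  double elapsed_usec = (double) (stop.tv_sec - start.tv_sec) * 1e6 +
                        (double) (stop.tv_nsec - start.tv_nsec) / 1e3;
  return elapsed_usec / rounds;
}
```

A call such as pingpong_usec(100000) averages over many round trips, which smooths out scheduling noise.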
What good is multiprocessing? The process structure frees the programmer from synchronizing concurrent access to shared memory locations. Imagine, for example, a Web application that lets a user play a simple word game. The application displays scrambled letters, such as kcddoe, and a player tries to unscramble the letters to form a word, in this case docked. This is a single-player game, and the application must track the game's state: the string to be unscrambled, the player's movement of letters one at a time and so on. Suppose that there is a global variable:
typedef struct {
/* variables to track game's state */
} WordGame;
WordGame game; /* single, global instance */
so that the application has, in the source code, exactly one WordGame instance visible across the application's functions (for example, move_letter, submit_guess). Concurrent players need separate WordGames, so that one player cannot interfere with another's game.
Now, consider two child processes C1 and C2, each handling a player. Under Linux, a forked child inherits its parent's address space; hence, C1 and C2 begin with the same address space as each other and their parent. If none of the three processes thereafter executes a write operation, no harm results. The need for separate address spaces first arises with write operations. Linux thus enforces a copy-on-write (COW) policy with respect to memory pages. If C1 or C2 performs a write operation on an inherited page, this child gets its own copy of the page, and the child's page table is updated. In effect, therefore, the parent and the forked children have separate address spaces, and each client effectively has its own copy of the writable WordGame.
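The effect of separate address spaces can be checked directly: a child's write to the global WordGame should not be visible in the parent. The following is a minimal sketch. The struct fields and the function name child_write_is_private are invented here, because the article leaves the game's state unspecified.

```c
#include <assert.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical game state; the article does not specify the fields. */
typedef struct {
  char letters[16];
  int moves;
} WordGame;

WordGame game = { "kcddoe", 0 };  /* single, global instance */

/* Fork a child that rewrites its copy of `game`. Return 1 if the child
   saw its own change while the parent's copy stayed untouched, 0 if
   not, and -1 on error. */
int child_write_is_private(void) {
  assert(sizeof(game.letters) > strlen("docked"));

  pid_t pid = fork();
  if (pid < 0) return -1;

  if (0 == pid) {                 /* child: the write lands in a private page */
    strcpy(game.letters, "docked");
    game.moves = 6;
    _exit(game.moves == 6 ? 0 : 1);
  }

  /* parent: wait for the child, then inspect its own copy */
  int status;
  if (waitpid(pid, &status, 0) != pid) return -1;
  int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
  int parent_unchanged =
      strcmp(game.letters, "kcddoe") == 0 && game.moves == 0;
  return child_ok && parent_unchanged;
}
```

No lock protects game in this sketch, and none is needed: each process writes only to its own copy.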
Martin Kalin is a professor at the College of Computing and Digital Media at DePaul University, Chicago, Illinois. He earned his PhD at Northwestern University. Martin has co-authored various books on C and C++, authored a book on Java and, most recent
Comments
Gevent
Twisted and Node.js have developed the "polling" approach, and real-life examples always outperform.
Programming is different, but the complexity is now dealt with by libraries such as gevent.
I can't imagine when a process-driven server would be preferred unless there was no chance of requiring scale or performance.
Function
The primary function of a web server is to deliver web pages on the request to clients using the Hypertext Transfer Protocol (HTTP). This means delivery of HTML documents and any additional content that may be included by a document, such as images, style sheets and scripts.
A user agent, commonly a web browser or web crawler, initiates communication by making a request for a specific resource using HTTP and the server responds with the content of that resource or an error message if unable to do so. The resource is typically a real file on the server's secondary memory, but this is not necessarily the case and depends on how the web server is implemented.
context switch takes 5ms
You claim that a context switch takes 5 to 10ms (on the optimistic side). Can you please enlighten us on the details of your computer?
Though it is one of those things that are difficult to quantify in all circumstances, it can generally take anywhere from a few hundred nanoseconds to a few thousand microseconds.
http://www.cs.rochester.edu/u/cli/research/switch.pdf