Linux Network Programming, Part 2
Figure 1 shows three potential designs for a daemon that provides a network service to prospective clients. In the first diagram, the daemon follows the most common technique: it forks a separate process to handle each request, while the parent continues to accept new connection requests. This concurrent technique has the advantage that new requests are accepted while earlier ones are still being serviced, so it may perform better than servicing requests one at a time. Unfortunately, every request costs a fork() and may cause additional context switches, which makes this approach unsuited to servers with very high demand.
The second diagram shows the iterative, synchronous design: a request is accepted and handled within a single context of execution before another request is handled. The drawback is that requests arriving while one is being handled are either blocked or rejected. A blocked request waits in the listen queue until the requests ahead of it have been processed. If processing takes long, the listen queue backlog can fill, and a significant number of requests could be rejected. This approach is therefore best suited to requests of very short duration. It is also better suited to UDP daemons than to TCP daemons.
The third diagram in Figure 1 is the most complicated: it shows a daemon that pre-allocates new contexts of execution (in this case, new processes) to handle requests. Note that the master calls fork() after listen() but before any accept() call; the slave processes call accept(). This leaves a pool of server processes blocked in accept() on the same socket at the same time. However, the kernel guarantees that only one of the slaves will succeed in its accept() call for a given connection request. That slave services the request and then returns to the accept state. The master process can either exit (with SIGCHLD being ignored) or continually call wait() to reap exiting slave processes.
It is quite common for slave processes to accept only a certain number of requests before exiting, which prevents memory leaks from accumulating. The process with the lowest number of accepted requests (or perhaps a special manager parent) then creates new processes as necessary. Many popular web servers (e.g., Netscape, Apache) implement pools of pre-forked server processes.
If the processing time of a request is very short (the usual case), concurrent processing is not always necessary. An iterative server may perform better by avoiding the overhead of context switching. One hybrid solution between the concurrent and iterative designs is to delay the allocation of new server processes. The server begins processing requests iteratively and creates a separate slave process to finish handling a request only if the processing time for that request is substantial. Thus, a master process can check the validity of requests, or handle short requests, before creating a new slave.
To use delayed process allocation, use the alarm() system call, as shown in Listing 5. The master establishes a timer, and when the timer expires, a signal handler is called. The handler performs a fork() system call. The parent closes the request connection and returns to an accepting state, whereas the child handles the request. The setjmp() library function records the process's stack environment. When longjmp() is later invoked, execution resumes at the point of the setjmp() call with that stack environment restored. The second parameter to longjmp() is the value that setjmp() returns when the stack is restored (a value of 0 is replaced by 1).
All of the forking in these examples could be replaced with calls to pthread_create(), which creates a new thread of execution rather than a full heavyweight process. As mentioned previously, the threads should be kernel-level threads to ensure that a thread blocking on I/O does not starve the others of CPU time. This involves using Xavier Leroy's excellent kernel-level LinuxThreads package (http://pauillac.inria.fr/~xleroy/linuxthreads/), which is based on the clone() system call.
Implementing with threads introduces more complications than the fork() model. Granted, threads give great savings in context-switching time and memory usage, but other issues come into play, such as the availability of file descriptors and the protection of critical sections.
Most operating systems limit the number of file descriptors a process may hold open. Although a process can call getrlimit() and setrlimit() to raise this limit up to a system-wide maximum, that maximum is usually set to 256 by NOFILE in the /usr/include/sys/param.h file.
Even tweaking NOFILE, along with the values NR_OPEN and NR_FILE in /usr/src/linux/include/linux/fs.h, and recompiling the kernel may not help here. While in Linux the fileno element of the FILE struct (actually called _fileno in Linux) is of type int, it is commonly unsigned char on other systems, which limits file descriptors to 255 for buffered I/O functions (fopen(), fprintf(), etc.). This difference affects the portability of the finished application.
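Raising the descriptor limit with getrlimit() and setrlimit() can be sketched as follows. The function name raise_fd_limit is hypothetical; RLIMIT_NOFILE is the standard resource name for the open-file limit, and the sketch assumes a Linux system where the soft limit may be set equal to the hard limit.

```c
#include <sys/resource.h>

/* Raise the soft limit on open file descriptors to the hard limit.
 * Returns 0 on success and stores the new soft limit in *new_soft;
 * returns -1 if either call fails. */
int raise_fd_limit(rlim_t *new_soft)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;          /* soft limit up to the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *new_soft = rl.rlim_cur;
    return 0;
}
```

An unprivileged process can raise its soft limit only as far as the hard limit; going beyond that requires privilege or a change to the system configuration.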
Because threads share a common memory space, care must be taken to ensure this space is always in a consistent state and does not get corrupted. This may involve serializing writes (and possibly reads) to shared data accessed by more than one thread; the code that performs such accesses is a critical section. Serialization can be achieved with locks, but care must be taken to avoid deadlock.
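A small sketch of protecting a critical section with a POSIX mutex. The shared counter and the names stats_lock, record_request, get_requests_served, worker and run_workers are hypothetical and stand in for whatever shared state a real server keeps.

```c
#include <pthread.h>

#define MAX_WORKERS 16

static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long requests_served = 0;   /* shared; guarded by stats_lock */

/* Critical section: only one thread at a time may update the counter. */
void record_request(void)
{
    pthread_mutex_lock(&stats_lock);
    requests_served++;
    pthread_mutex_unlock(&stats_lock);
}

/* Reads are serialized too, so a reader never sees a partial update. */
unsigned long get_requests_served(void)
{
    pthread_mutex_lock(&stats_lock);
    unsigned long n = requests_served;
    pthread_mutex_unlock(&stats_lock);
    return n;
}

/* Worker thread: record *arg requests. */
static void *worker(void *arg)
{
    unsigned long count = *(unsigned long *)arg;
    for (unsigned long i = 0; i < count; i++)
        record_request();
    return NULL;
}

/* Start up to MAX_WORKERS threads, wait for them all to finish, and
 * return the total number of requests recorded so far. */
unsigned long run_workers(int nthreads, unsigned long per_thread)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &per_thread) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return get_requests_served();
}
```

With a single lock there is no lock ordering to get wrong. When several locks are needed, acquiring them in the same fixed order in every thread is one common way to avoid deadlock.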