The Linux Process Model

A look at the fundamental building blocks of the Linux kernel.
Threaded Kernel

Linux kernel threading has constantly improved. Let's look at the different versions again:

  • 2.0.x had no kernel threading.

  • 2.2.x has kernel threading added.

  • 2.3.x is very SMP-threaded.

In 2.2.x, many places are still single-threaded, but 2.2.x kernels actually scale well only on two-way SMPs. In 2.2.x, the IRQ/timer handling (for example) is completely SMP-threaded, and the IRQ load is distributed across multiple CPUs.

In 2.3.x, most worthwhile code sections within the kernel are being rewritten for SMP threading. For example, all of the VM (virtual memory) is SMP-threaded. The most interesting paths now have a much finer granularity and scale very well.

Performance Limitations

For the sake of system stability, a kernel has to react well in stress situations. It must, for instance, reduce priorities and resources to processes that misbehave.

How does the scheduler handle a poorly written program looping tightly and forking at each turn of the loop (thereby forking off thousands of processes in a few seconds)? Obviously, the scheduler can't limit the creation of processes time-wise, e.g., a process every 0.5 seconds or similar.

After a fork, however, the “runtime priority” of the process is divided between the parent and the child. This means the parent/child will be penalized compared to the other tasks, and the other tasks will continue to run fine up to the first recalculation of the priorities. This keeps the system from stalling during a fork flooding. The code for this is the concerned code section in linux/kernel/fork.c:

/*
 "share" dynamic priority between parent
 * and child, thus the total amount of dynamic
 * priorities in the system doesn't change, more
 * scheduling fairness. This is only important
 * in the first time slice, in the long run the
 * scheduling behaviour is unchanged.
 */
current->counter >>= 1;
p->counter = current->counter;

Additionally, there is a per-user limit of threads that can be set from init before spawning the first user process. It can be set with  ulimit -u in bash. You can tell it that user moshe can run a maximum ten concurrent tasks (the count includes the shell and every process run by the user).

In Linux, the root user always retains some spare tasks for himself. So, if a user spawns tasks in a loop, the administrator can just log in and use the killall command to remove all tasks of the offending user. Due to the fact that the “runtime priority” of the task is divided between the parent and the child, the kernel reacts smoothly enough to handle this type of situation.

If you wanted to amend the kernel to allow only one fork per processor tick (usually one every 1/100th second; however, this parameter is tunable), called a jiffie, you would have to patch the kernel like this:

--- 2.3.26/kernel/fork.c        Thu Oct 28 22:30:51 1999
+++ /tmp/fork.c Tue Nov  9 01:34:36 1999
@@ -591,6 +591,14 @@
        int retval = -ENOMEM;
        struct task_struct *p;
        DECLARE_MUTEX_LOCKED(sem);
+       static long last_fork;
+
+       while (time_after(last_fork+1, jiffies))
+       {
+               __set_current_state(TASK_INTERRUPTIBLE);
+               schedule_timeout(1);
+       }
+       last_fork = jiffies;
        if (clone_flags & CLONE_PID) {
/* This is only allowed from the boot up thread */

This is the beauty of open source. If you don't like something, just change it!

Here ends the first part of our tour through the Linux kernel. In the next installment, we will take a more detailed look at how the scheduler works. I can promise you some surprising discoveries. Some of these discoveries caused me to revalue completely the probable impact of Linux on the corporate server market. Stay tuned.

Resources

email: moshe@moelabs.com

Moshe Bar (moshe@moelabs.com) is an Israeli system administrator and OS researcher, who started learning UNIX on a PDP-11 with AT&T UNIX Release 6 back in 1981. He holds an M.Sc. in computer science. Visit Moshe's web site at http://www.moelabs.com/.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState