Real-Time and Linux, Part 2: the Preemptible Kernel

Kevin continues his real-time series by examining efforts to bring real-time capabilities to applications by improving the Linux kernel.

In the January/February 2002 issue of Embedded Linux Journal, we examined the fundamental issues of real time with Linux. In this article we examine efforts to bring real-time capabilities to applications by making improvements to the Linux kernel. To date, the majority of this work has been to make the kernel more responsive, that is, to reduce the preemption latency, which can be quite long in Linux.

Because these efforts improve the kernel without changing or adding to the API, applications can run more responsively merely by swapping a standard kernel for the improved one. This is a big benefit. It means that ISVs need not create special versions for different real-time efforts. For example, DVD players may run more reliably on an improved kernel without needing to be aware that the kernel they are running on has been improved.

Background and History

Around Linux kernel release 2.2, the issue of kernel preemptibility began to get quite a lot of attention. Paul Barton-Davis and Benno Senoner, for example, wrote a letter (also signed by many others) to Linus Torvalds, asking that 2.4 include significantly reduced preemption delays.

Their request was based on their desire to have Linux function well with audio, music and MIDI. Senoner produced some benchmarking software that demonstrated that the 2.2 kernel (and later the 2.4 kernel) had worst-case preemption latencies on the order of 100ms. Latencies of this magnitude are unacceptable for audio applications. Conventional wisdom seems to say that latencies on the order of no more than a few milliseconds are required.

Two efforts emerged that produced patched kernels that provided quite reasonable preemption latencies. Ingo Molnar (of Red Hat) and Andrew Morton (then of The University of Wollongong) both produced patch sets that provided preemption within particularly long sections in the kernel. Both made their patches available on the Web.

In addition, Morton provides tools for measuring latencies, such as periods where the kernel ignores reschedule requests. The web page for his low-latency patches provides information on those tools as well.

Recently, at least two organizations have produced preemptible kernels that provide a more fundamental, and powerful, solution to the kernel preemptibility problem.

In the first article of this series in the January/February 2002 issue of ELJ, we listed several other desired features for real-time support in Linux, including increased number of priority levels, user-space interrupt handling and DMA, priority inheritance on synchronization mechanisms, microsecond time resolution, complete POSIX 1003.1b functionality and a constant time algorithm for scheduling. We will briefly comment on these as well.

A key point to remember with all of these improvements is that they involve patching the kernel. Anytime you patch a kernel you must assume that you no longer have binary compatibility for other kernel code, such as drivers. For example, the preemptible kernel approaches require modifying the code for spin locks. A binary driver won't employ this modification and thus may not prevent preemption properly. This emphasizes the need to have the source and recompile all kernel code. The Linux model for drivers is one of source-compatibility anyway. Distribution of binary-only drivers is discouraged for compatibility as well as for open-source philosophy reasons.


Various efforts that improve the kernel provide essentially transparent benefits. The efforts to improve the preemptibility of the kernel, be they through a preemptible kernel or through preemption points, result in a kernel that is more responsive to applications without any alterations in these applications.

Another aspect of transparency is whether the changes are transparent to the kernel, or in other words, do the approaches automatically track with changes in the kernel. The preemption point approaches of Molnar and Morton require that the scheduling latencies in new kernels be measured and preemption points placed in the proper places.
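
The maintenance cost of preemption points can be illustrated with a toy model. The sketch below is not kernel code: it treats a long kernel code path as a number of abstract work units and computes the longest stretch that runs without a preemption point. The function name, the unit counts and the offsets are all hypothetical values chosen for illustration.

```python
def worst_case_delay(section_length, preemption_points):
    """Return the longest run of work units with no preemption point.

    section_length: total work units in a long kernel code path.
    preemption_points: offsets at which the code would check for a
    pending reschedule request and yield if one is set.
    """
    boundaries = [0] + sorted(preemption_points) + [section_length]
    return max(b - a for a, b in zip(boundaries, boundaries[1:]))


# With no preemption points, a waiting task may be delayed by the
# whole section.
unpatched = worst_case_delay(100, [])

# Points placed at offsets 25, 50 and 75 bound the delay at 25 units.
patched = worst_case_delay(100, [25, 50, 75])

# If a later kernel version lengthens the section but the points stay
# where they were, the tail of the section is unprotected again.
stale = worst_case_delay(160, [25, 50, 75])

print(unpatched, patched, stale)  # 100 25 85
```

The third case is the one the paragraph above describes: the points were well placed for the old code, but the new code has to be measured again before anyone can tell where they belong.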

In contrast, the approaches to creating a preemptible kernel piggyback on the SMP locking and thus automatically transfer with new kernel versions. Also, by tying the preemptibility to the SMP-locking mechanism, as kernel developers improve the granularity of the SMP locking, the granularity of the preemption will improve automatically as well. We are likely to see steady improvement in SMP-locking granularity because improvement in this is required for improved SMP scaling.

It is because of this co-opting of the SMP locks that the preemptible kernel work depends upon a 2.4 or newer kernel. Prior kernels lacked the required SMP locks.
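
A minimal sketch of the idea, as a toy model rather than actual kernel code: assume that taking a spin lock increments a per-task counter, that releasing it decrements the counter, and that a pending reschedule request is honored only when the counter is zero. The class and its names (`ToyKernel`, `preempt_count`, `need_resched`) are illustrative assumptions and do not reproduce any particular patch.

```python
class ToyKernel:
    """Toy model of lock-based kernel preemption (not real kernel code)."""

    def __init__(self):
        self.preempt_count = 0     # number of spin locks currently held
        self.need_resched = False  # set when a higher-priority task wakes
        self.log = []              # records events in order

    def spin_lock(self):
        # Taking a lock also disables preemption.
        self.preempt_count += 1

    def spin_unlock(self):
        # Releasing the last lock re-enables preemption and honors any
        # reschedule request that arrived while the lock was held.
        self.preempt_count -= 1
        self._maybe_preempt()

    def wake_high_priority_task(self):
        # Models an interrupt that makes a high-priority task runnable.
        self.need_resched = True
        self._maybe_preempt()

    def work(self, name):
        self.log.append(name)

    def _maybe_preempt(self):
        if self.need_resched and self.preempt_count == 0:
            self.need_resched = False
            self.log.append("preempted")


k = ToyKernel()
k.work("unlocked code")
k.wake_high_priority_task()    # no lock held: preempted immediately
k.spin_lock()
k.work("critical section")
k.wake_high_priority_task()    # lock held: preemption is deferred
k.work("rest of critical section")
k.spin_unlock()                # deferred preemption happens here
print(k.log)
```

In this model the preemption delay is bounded by how long a lock is held, which is why finer-grained SMP locking translates directly into finer-grained preemption.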

Another important benefit of the preemptible kernel approach is that it makes code preemptible even when that code is unaware of preemption. For example, driver writers need do nothing special to have their driver preemptible. Code in the driver will be preempted as required unless the driver holds a lock. Thus, as in other parts of the kernel, well-written drivers that are SMP-safe automatically will benefit from a preemptible kernel. On the other hand, drivers that are not SMP-safe may not function correctly with the preemptible kernels.

One should be aware, though, that just because one's driver does not request a lock, kernel code calling it may. For example, we found in a simple test with MontaVista's preemptible kernel that the functions read() and write() of a dynamically loaded driver were preempted just fine, while the functions init_module(), open() and close() were not. This means that if a low-priority process does an open() or close(), it may delay its preemption by a newly awoken high-priority process.

In practice, developers still should measure the latencies they are seeing. With the preemptible kernel approaches we see that it is still possible that a section of kernel code can hold a lock for a period longer than acceptable for one's application.

MontaVista, for example, provides a preemptible kernel, adds a few preemption points in sections where locks are held too long and provides measurement tools so that developers can measure the preemptibility performance with their actual applications and environment.
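
As a rough first check from user space, one can measure how late a sleeping process is actually woken. The sketch below is not one of the vendor or Morton tools mentioned in this article. It is a hypothetical helper that times repeated short sleeps and reports by how much each one overshoots. The result includes timer resolution and interpreter overhead as well as kernel preemption latency, so it only gives an upper bound on what the application sees, and it should be run under a realistic load.

```python
import time


def measure_wakeup_overshoot(interval_s=0.001, samples=200):
    """Return (worst, mean) overshoot in seconds for repeated sleeps.

    Overshoot is the time slept beyond the requested interval.
    """
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval_s)
        elapsed = time.perf_counter() - start
        # Clamp at zero in case the clock reports a slightly short sleep.
        overshoots.append(max(0.0, elapsed - interval_s))
    return max(overshoots), sum(overshoots) / len(overshoots)


worst, mean = measure_wakeup_overshoot()
print(f"worst-case overshoot: {worst * 1e3:.3f} ms, mean: {mean * 1e3:.3f} ms")
```

The worst-case figure is the one that matters for real-time work; a good mean with an occasional long outlier is exactly the pattern that long lock hold times produce.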

The goal of SMP locks is to ensure safe re-entrance into the kernel. That is, if processes running in parallel require kernel resources, access to these resources is done safely. The smaller the granularity of the locking, the greater the chance that competing processes can continue to execute in parallel. Parallelization is improved as the blocking (because of contention) is reduced.

This concept applies to uniprocessors as well, when I/O is considered. If one considers I/O devices as separate processors, then parallelization, or throughput, improves as applications and I/O activities can continue in parallel. Improvements in preemptibility, which imply that high-priority I/O-bound processes wake up more quickly, can thus improve throughput. Thus, somewhat paradoxically, we see that even though we may experience more context switches and execute more code in critical kernel paths, we may still see greater system throughput.
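
A back-of-the-envelope model makes the throughput argument concrete. Assume, purely for illustration, a single I/O-bound process that issues one request at a time: the device works for a while, the process then has to be woken up, and it spends a little CPU time issuing the next request. The device sits idle during the wake-up delay and the CPU time. The function name and all the numbers below are hypothetical, except that the 100ms wake-up delay echoes the worst-case latency cited earlier in this article.

```python
def device_utilization(io_time_ms, cpu_time_ms, wake_latency_ms):
    """Fraction of time the I/O device is busy in a one-request-at-a-time loop.

    Each cycle consists of the device servicing a request, the delay
    before the waiting process runs again, and the CPU time that process
    needs to issue the next request.
    """
    cycle_ms = io_time_ms + wake_latency_ms + cpu_time_ms
    return io_time_ms / cycle_ms


# Worst case on a kernel with long preemption latency.
slow = device_utilization(io_time_ms=5.0, cpu_time_ms=0.5, wake_latency_ms=100.0)

# Same workload when the process is woken within about a millisecond.
fast = device_utilization(io_time_ms=5.0, cpu_time_ms=0.5, wake_latency_ms=1.0)

print(f"device busy {slow:.2%} of the time vs {fast:.2%}")
```

Under these assumptions the extra context switches cost far less than the idle device time they eliminate, which is the paradox described above. Real workloads with queued requests will show a smaller effect.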

The benefits of a preemptible kernel seem to be so clear that we can expect preemptibility eventually to be a standard feature of the Linux kernel. Preemptible kernels have been shown to reduce latencies to just a few milliseconds for some implementations and to as low as tens of microseconds in others.

In a quick survey of embedded Linux vendors, MontaVista and TimeSys provide preemptible kernels, REDSonic has preemption points, LynuxWorks and Red Hat use RTLinux, and Lineo uses RTAI. OnCore provides Linux preemptibility both through a Linux system call-compatible API (as does LynuxWorks with LynxOS) and through running a Linux kernel (which effectively becomes preemptible) on top of their preemptible microkernel.