Shielded CPUs: Real-Time Performance in Standard Linux
In a multiprocessor system, a shielded CPU is a CPU dedicated to the activities associated with high-priority real-time tasks. Marking a CPU as shielded allows CPU resources to be reserved for high-priority tasks. The execution environment of a shielded CPU provides the predictability required for supporting real-time applications. In other words, a shielded CPU makes it possible to guarantee rapid response to external interrupts and to provide a more deterministic environment for executing real-time tasks.
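As a rough illustration of reserving a CPU for one task, the sketch below uses the standard Linux affinity and scheduling calls exposed by Python's os module. It is not the shielding mechanism this article describes: it only restricts the calling process to one CPU and does not by itself keep other tasks or interrupts off that CPU. The function names pin_to_cpu and try_make_realtime, and the priority value 50, are assumptions made for the example.

```python
import os

def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU and return the new CPU mask."""
    os.sched_setaffinity(0, {cpu})      # pid 0 means "the calling process"
    return os.sched_getaffinity(0)

def try_make_realtime(priority=50):
    """Request the SCHED_FIFO policy; this normally needs root or CAP_SYS_NICE.

    The priority value is an arbitrary choice for illustration.
    Returns True on success, False if the kernel refuses permission.
    """
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False
```

A caller might pick a CPU from os.sched_getaffinity(0), pass it to pin_to_cpu, and then call try_make_realtime, falling back to normal scheduling if the request is refused.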
In the past, a shielded CPU could be created only on symmetric multiprocessing systems. With the advent of hyperthreading (where a single CPU chip has more than one logical CPU), even a uniprocessor can be configured to have a shielded CPU.
The shielded CPU approach to providing high-end real-time performance allows the developer of a real-time application to achieve results comparable to those achieved using a small real-time executive, such as RTAI or RTLinux, where Linux runs as one process under the executive. The advantages of using a pure Linux environment for application development as opposed to one of these executives are many. For example, Linux supports many device drivers, lowering the overall cost of implementing a complete application solution. It also supports a wide variety of high-level languages, which improves programming efficiency. This is important for commercial applications; programming efficiency may not be central to the design of the real-time system, but it is helpful during the development phase and can provide additional functionality in the end system. Furthermore, Linux offers complex protocol stacks, middleware such as CORBA, extensive graphics capabilities and advanced application development tools.
Besides all of the functionality available in standard Linux today, an ever-expanding list of features is being developed for the Linux operating system, due to the strong momentum of the Linux phenomenon. By using Linux as the basis for an application design, a user will have many more options in the future.
A real-time application is one that must respond to a real-world event and complete some processing task by a given deadline. A correct answer delivered after the deadline becomes an incorrect answer. The deadlines themselves are application-dependent and can vary from tens of microseconds up to several seconds. For hard real-time applications, no deadline can be missed. This means that worst-case measurements of system metrics are the only measurements that matter to a hard real-time application, because the worst cases are the ones that cause a missed deadline.
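To make the point about worst-case measurements concrete, here is a minimal sketch with made-up numbers. The sample values and the 100-microsecond deadline are hypothetical, chosen only so that the average looks comfortable while a single outlier misses the deadline.

```python
from statistics import mean

def meets_hard_deadline(latencies_us, deadline_us):
    """A hard deadline holds only if the worst observed sample meets it."""
    return max(latencies_us) <= deadline_us

# Hypothetical response times in microseconds, with one slow outlier.
samples_us = [12, 15, 11, 14, 250, 13]
deadline_us = 100

average_ok = mean(samples_us) <= deadline_us               # average is 52.5 us
worst_ok = meets_hard_deadline(samples_us, deadline_us)    # worst case is 250 us
```

Here average_ok is true but worst_ok is false, which is why an average figure says little about whether a hard real-time application will meet its deadlines.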
Because the occurrence of a real-world event is communicated to a computer system by way of an interrupt, a real-time operating system must provide a guaranteed worst-case interrupt response time. In responding to an interrupt and giving control to the real-time application, the computer system has performed the first step needed to meet the deadline. Once the real-time application is running, the system also must provide the application with deterministic execution times. If the time it takes to execute the code associated with a real-time application's response varies widely, deadlines can be missed even when the interrupt response itself is fast.
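One simple way to see how much execution time varies is to run the same piece of work many times and compare the best and worst runs. The sketch below does this with Python's monotonic high-resolution timer; the function name measure_jitter and the default run count are assumptions for illustration, and the interpreter adds variation of its own, so the numbers show the idea rather than the limits of the hardware.

```python
import time

def measure_jitter(work, runs=1000):
    """Time work() repeatedly and return (best_ns, worst_ns, jitter_ns)."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    best = worst = None
    for _ in range(runs):
        start = time.perf_counter_ns()
        work()
        elapsed = time.perf_counter_ns() - start
        best = elapsed if best is None else min(best, elapsed)
        worst = elapsed if worst is None else max(worst, elapsed)
    # Jitter here means the spread between the fastest and slowest run.
    return best, worst, worst - best
```

For example, measure_jitter(lambda: sum(range(1000))) returns the fastest run, the slowest run and the difference between them, all in nanoseconds. The slowest run is the figure a hard real-time design has to budget for.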
To guarantee good interrupt response, the operating system must be able to quickly preempt whatever task is executing when an interrupt occurs. Because the 2.4 Linux series does not allow one task to preempt another task that is executing inside the kernel, a kernel based on this series has poor worst-case interrupt response. A preemption patch is available to make a task executing within the kernel preemptible. Even in a Linux kernel that has the preemption patch installed, however, a hidden problem exists that still causes long interrupt response delays.
The job of any operating system is to coordinate the execution of the many tasks sharing the resources of the system. The data structures that describe these shared resources can be corrupted if they are accessed by multiple tasks at the same time. Therefore, all operating systems have critical sections of code that tasks may execute only one at a time. When a high-priority task suddenly becomes runnable because an interrupt occurred, that task cannot take control of the CPU if another task currently is executing inside one of these critical sections. This means that long critical sections have a big impact on the ability of the system to respond to an interrupt. The low-latency patches address some of the longer critical sections in the Linux kernel by making algorithmic changes that shorten them.
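The effect of critical sections on response time can be sketched with a toy model. It assumes, purely for illustration, that critical sections are non-overlapping (start, end) intervals on a common time axis and that the high-priority task can run the moment the current critical section ends. Interrupt handling and context-switch costs are ignored, and the function names are hypothetical.

```python
def dispatch_delay(interrupt_time, critical_sections):
    """Return how long a newly runnable high-priority task must wait.

    critical_sections is a list of non-overlapping (start, end) intervals.
    If the interrupt lands inside one, the task waits until that section ends.
    """
    for start, end in critical_sections:
        if start <= interrupt_time < end:
            return end - interrupt_time
    return 0

def worst_case_delay(critical_sections):
    """The worst case is an interrupt arriving just as the longest section begins."""
    return max((end - start for start, end in critical_sections), default=0)

# Hypothetical timeline: one short and one long critical section.
sections = [(0, 2), (10, 40)]
```

With these intervals, an interrupt at time 5 is serviced immediately, one at time 10 waits 30 units, and the worst case equals the length of the longest section. In this model, shortening the longest critical section is the only way to improve the bound.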
In general, the more complex a subsystem is, the longer its critical sections. Because Linux supports many such complex subsystems, including the filesystem, networking and graphics subsystems, its critical sections are very long compared to the critical sections in a small real-time OS. The preemption patch and the low-latency patches have improved the responsiveness of Linux greatly. Still, many critical sections can last tens of milliseconds, which is not acceptable for the deadlines required by many real-time applications.