SMP and Embedded Real Time
Parallel programming might not be mind-crushingly hard, but it is certainly harder than single-threaded programming. Real-time programming is also hard. So, why would anyone be crazy enough to take on both at the same time?
It is true that real-time parallel programming poses special challenges, including interactions with lock-induced delays, interrupt handlers and priority inversion. However, Ingo Molnar's -rt patchset provides both kernel and application developers with tools to deal with these challenges. These tools are described in the following sections.
Much ink has been spilled on locking and real-time latency, but we will stick to the following simple points:
1. Reducing lock contention improves SMP scalability and reduces real-time latency.
2. If lock contention is low, there is a finite number of tasks, critical-section execution time is bounded, and locks are granted first-come-first-served among the highest-priority tasks, then lock wait times for those tasks are bounded.
3. An SMP Linux kernel by its very nature requires very few modifications in order to support the aggressive preemption required by real time.
The first point should be obvious, because spinning on locks is bad for both scalability and latency. For the second point, consider a queue at a bank where each person spends a bounded time T with a solitary teller, there are a bounded number of other people N, and the queue is first-come-first-served. Because there can be at most N people ahead of you, and each can take at most time T, you will wait for at most time NT. Therefore, FIFO priority-based locking really can provide hard real-time latencies.
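The bank-queue argument can be checked with a small sketch. Everything here is illustrative rather than taken from the -rt patchset: the function names are hypothetical, and times are integer microseconds so the comparison is exact.

```python
import random
from typing import Sequence


def worst_case_wait(n_ahead: int, t_max_us: int) -> int:
    """Upper bound N*T: at most n_ahead holders, each holding at most t_max_us."""
    return n_ahead * t_max_us


def fifo_wait(hold_times_us: Sequence[int]) -> int:
    """Wait seen by the last arrival in a FIFO queue with one teller (one lock)."""
    return sum(hold_times_us)


def check_bound(n_ahead: int, t_max_us: int, trials: int = 1000, seed: int = 0) -> bool:
    """Try random queues and confirm that no wait exceeds the N*T bound."""
    rng = random.Random(seed)
    bound = worst_case_wait(n_ahead, t_max_us)
    for _ in range(trials):
        # Anywhere from 0 to n_ahead people may be ahead of us,
        # and each takes anywhere from 0 to t_max_us.
        ahead = rng.randint(0, n_ahead)
        holds = [rng.randint(0, t_max_us) for _ in range(ahead)]
        if fifo_wait(holds) > bound:
            return False
    return True
```

With five tasks ahead and a 200-microsecond critical section, for example, `worst_case_wait(5, 200)` gives a 1,000-microsecond bound. The bound holds only while both N and T really are bounded, which is why the second point lists those conditions explicitly.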
For the third point, see Figure 5. The left-hand side of the diagram shows three functions A(), B() and C() executing on a pair of CPUs. If functions A() and B() must exclude function C(), some sort of locking scheme must be used. However, that same locking provides the protection needed by the -rt patchset's preemption, as shown on the right-hand side of this diagram. If function B() is preempted, function C() blocks as soon as it tries to acquire the lock, which permits B() to run. After B() completes, C() may acquire the lock and resume running.
This approach requires that kernel spinlocks block, and this change is fundamental to the -rt patchset. In addition, per-CPU variables must be protected more rigorously. Interestingly enough, the -rt patchset also located a number of SMP bugs that had gone undetected.
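The behavior of B() and C() described above can be mimicked in user space. The following is only a sketch: it uses Python threads and a `threading.Lock` as a stand-in for a blocking kernel lock, and the event objects that simulate preemption are assumptions made for the illustration, not part of the -rt patchset.

```python
import threading


def run_scenario():
    """Replay the B()/C() scenario and return the order of events."""
    lock = threading.Lock()          # stand-in for a blocking (sleeping) lock
    b_in_cs = threading.Event()      # B has entered its critical section
    c_trying = threading.Event()     # C is about to try for the lock
    resume_b = threading.Event()     # the "scheduler" lets B run again
    log = []

    def func_b():
        with lock:
            log.append("B enters")
            b_in_cs.set()
            resume_b.wait()          # B is "preempted" while holding the lock
            log.append("B leaves")

    def func_c():
        b_in_cs.wait()               # run only after B holds the lock
        log.append("C tries lock")
        c_trying.set()
        with lock:                   # blocks here until B releases the lock
            log.append("C enters")

    tb = threading.Thread(target=func_b)
    tc = threading.Thread(target=func_c)
    tb.start()
    tc.start()
    c_trying.wait()                  # C has reached the lock
    resume_b.set()                   # C is blocked, so B gets to run
    tb.join()
    tc.join()
    return log
```

Calling `run_scenario()` returns `['B enters', 'C tries lock', 'B leaves', 'C enters']`. Because C() sleeps rather than spins, the CPU is free to resume B(), which is exactly why the lock must block.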
In the standard Linux kernel, however, interrupt handlers cannot block. Yet interrupt handlers must acquire locks, and those locks can block in -rt. What can be done?
Not only are blocking locks a problem for interrupt handlers, but they also can seriously degrade real-time latency, as shown in Figure 6.
This degradation can be avoided by running interrupt handlers in process context, as shown in Figure 7, which also allows them to acquire blocking locks.
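As a rough user-space analogy, the split between a minimal interrupt entry point and a handler running in its own schedulable context might look like the sketch below. The `ThreadedIrq` class and its method names are hypothetical and do not correspond to any kernel interface.

```python
import queue
import threading


class ThreadedIrq:
    """Hypothetical model of an interrupt whose handler runs in a thread."""

    def __init__(self, handler):
        self._pending = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def raise_irq(self, data):
        """'Hard IRQ' part: does no device work, only queues it for the thread."""
        self._pending.put(data)

    def _run(self):
        # Handler thread: an ordinary schedulable context, so it may block.
        while True:
            data = self._pending.get()
            if data is None:         # sentinel posted by stop()
                break
            self._handler(data)

    def stop(self):
        self._pending.put(None)
        self._thread.join()


def demo():
    device_lock = threading.Lock()   # a blocking lock the handler must take
    received = []

    def handler(data):
        with device_lock:            # safe: we are in thread context
            received.append(data)

    irq = ThreadedIrq(handler)
    for value in (1, 2, 3):
        irq.raise_irq(value)
    irq.stop()
    return received
```

Here `demo()` returns `[1, 2, 3]`. The point of the design is that all lock acquisition happens in the handler thread, which the scheduler can block, resume and prioritize like any other thread.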
Even better, these process-based interrupt handlers can actually be preempted by user-level real-time threads, as shown in Figure 8, where the blue rectangle within the interrupt handler represents a high-priority real-time user process preempting the interrupt handler.
Of course, “with great power comes great responsibility.” For example, a high-priority real-time user process could starve interrupts entirely, shutting down all I/O. One way to handle this situation is to provide a low-priority “canary” process. If the “canary” is blocked for longer than a predetermined time, one might kill the offending thread.
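A minimal sketch of the canary idea follows. The `Canary` class is hypothetical, and the code that would actually assign scheduling priorities and kill the offending thread is omitted. The assumption is that a low-priority thread calls `chirp()` periodically while a high-priority watchdog polls `starved()`.

```python
import time


class Canary:
    """Hypothetical starvation detector based on a heartbeat timestamp."""

    def __init__(self, limit_s, clock=time.monotonic):
        self._limit = limit_s        # longest tolerated gap between chirps
        self._clock = clock          # injectable clock, handy for testing
        self._last = clock()

    def chirp(self):
        """Called by the low-priority canary whenever it gets to run."""
        self._last = self._clock()

    def starved(self):
        """Called by the watchdog: has the canary been silent too long?"""
        return self._clock() - self._last > self._limit
```

If `starved()` returns true, the canary has not run for longer than the predetermined time, so some higher-priority thread is monopolizing the CPU and the watchdog can take whatever corrective action the application deems appropriate.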
Running interrupts in process context permits interrupt handlers to acquire blocking locks, which in turn allows critical sections to be preempted, which permits extremely fast real-time scheduling latencies. In addition, the -rt patchset permits real-time application developers to select the real-time priority at which interrupt handlers run. By running only the most critical portions of the real-time application at higher priority than the interrupt handlers, the developers can minimize the amount of code for which “great responsibility” must be shouldered.
However, preempting critical sections can lead to priority inversion, as described in the next section.