SMP and Embedded Real Time
Parallel programming might not be mind-crushingly hard, but it is certainly harder than single-threaded programming. Real-time programming is also hard. So, why would anyone be crazy enough to take on both at the same time?
It is true that real-time parallel programming poses special challenges, including interactions with lock-induced delays, interrupt handlers and priority inversion. However, Ingo Molnar's -rt patchset provides both kernel and application developers with tools to deal with these challenges. These tools are described in the following sections.
Much ink has been spilled on locking and real-time latency, but we will stick to the following simple points:
1. Reducing lock contention improves SMP scalability and reduces real-time latency.
2. If lock contention is low, there is a finite number of tasks, critical-section execution time is bounded, and locks are granted in a first-come-first-served manner to the highest-priority tasks, then lock wait times for those tasks will be bounded.
3. An SMP Linux kernel by its very nature requires very few modifications in order to support the aggressive preemption required by real time.
The first point should be obvious, because spinning on locks is bad for both scalability and latency. For the second point, consider a queue at a bank where each person spends at most a bounded time T with a solitary teller, there are at most N other people, and the queue is first-come-first-served. Because there can be at most N people ahead of you, and each can take at most time T, you will wait for at most time N × T. Therefore, FIFO priority-based locking really can provide hard real-time latencies.
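The bank-queue argument can be checked with a few lines of Python. This is only an illustrative sketch: the function names `fifo_wait_times` and `worst_case_wait` are made up for this example, and the model assumes a single teller (one lock) with strictly first-come-first-served service.

```python
import random

def fifo_wait_times(service_times):
    """Return how long each person waits before reaching the teller.

    service_times[i] is the time person i spends with the teller; people
    are served strictly in list order (first come, first served).
    """
    waits = []
    elapsed = 0.0
    for service in service_times:
        waits.append(elapsed)   # everyone ahead of this person has finished
        elapsed += service
    return waits

def worst_case_wait(n_ahead, t_max):
    """Upper bound on the wait: at most n_ahead people, each taking at most t_max."""
    return n_ahead * t_max

if __name__ == "__main__":
    N, T = 5, 2.0
    rng = random.Random(0)
    # N people ahead of us plus ourselves, each service time bounded by T.
    services = [rng.uniform(0.0, T) for _ in range(N + 1)]
    our_wait = fifo_wait_times(services)[-1]
    print(f"waited {our_wait:.2f}, bound {worst_case_wait(N, T):.2f}")
```

The last person in line waits for the sum of the service times ahead of them, which can never exceed N × T as long as both bounds hold. If either the number of waiters or the critical-section time is unbounded, so is the wait.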
For the third point, see Figure 5. The left-hand side of the diagram shows three functions A(), B() and C() executing on a pair of CPUs. If functions A() and B() must exclude function C(), some sort of locking scheme must be used. However, that same locking provides the protection needed by the -rt patchset's preemption, as shown on the right-hand side of this diagram. If function B() is preempted, function C() blocks as soon as it tries to acquire the lock, which permits B() to run. After B() completes, C() may acquire the lock and resume running.

Figure 5. SMP Locking and Preemption
This approach requires that kernel spinlocks block, and this change is fundamental to the -rt patchset. In addition, per-CPU variables must be protected more rigorously. Interestingly enough, the -rt patchset also located a number of SMP bugs that had gone undetected.
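The interaction between B() and C() in Figure 5 can be mimicked at user level. The following is a sketch in Python threads, not kernel code: the names `func_b`, `func_c` and `run_demo` are hypothetical, and an event stands in for the scheduler preempting and later resuming B() while it holds the lock.

```python
import threading

def run_demo():
    """Return the order of events when C() contends with a 'preempted' B()."""
    lock = threading.Lock()            # the lock that makes B() exclude C()
    events = []
    events_guard = threading.Lock()    # protects the events list itself
    b_holds_lock = threading.Event()
    c_is_trying = threading.Event()
    resume_b = threading.Event()       # stands in for the scheduler resuming B()

    def log(message):
        with events_guard:
            events.append(message)

    def func_b():
        with lock:
            log("B acquires lock")
            b_holds_lock.set()
            resume_b.wait()            # B() is "preempted" inside its critical section
            log("B releases lock")     # logged just before the lock is dropped

    def func_c():
        b_holds_lock.wait()
        log("C requests lock")
        c_is_trying.set()
        with lock:                     # blocks here until B() has released the lock
            log("C acquires lock")

    threads = [threading.Thread(target=func_b), threading.Thread(target=func_c)]
    for t in threads:
        t.start()
    c_is_trying.wait()                 # C() is now contending for the lock
    resume_b.set()                     # let B() run again
    for t in threads:
        t.join()
    return events

if __name__ == "__main__":
    for event in run_demo():
        print(event)
```

Because C() sleeps rather than spins while waiting, the CPU is free to run B() to completion, which is the behavior that blocking spinlocks give the -rt kernel.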
However, in the standard Linux kernel, interrupt handlers cannot block, yet they must acquire locks, and those locks can block in -rt. What can be done?
Not only are blocking locks a problem for interrupt handlers, but the handlers themselves also can seriously degrade real-time latency, as shown in Figure 6.

Figure 6. Interrupts Degrade Latency
This degradation can be avoided by running interrupt handlers in process context, as shown in Figure 7, which also allows them to acquire blocking locks.

Figure 7. Move Interrupt Handlers to Process Context
Even better, these process-based interrupt handlers can actually be preempted by user-level real-time threads, as shown in Figure 8, where the blue rectangle within the interrupt handler represents a high-priority real-time user process preempting the interrupt handler.

Figure 8. Preempting Interrupt Handlers
Of course, “with great power comes great responsibility.” For example, a high-priority real-time user process could starve interrupts entirely, shutting down all I/O. One way to handle this situation is to provide a low-priority “canary” process. If the “canary” is blocked for longer than a predetermined time, one might kill the offending thread.
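One possible shape for such a canary is sketched below in Python. The class name `Canary` and its methods are invented for this illustration. The idea is that a low-priority task calls `beat()` whenever it gets to run, while a monitor running at higher priority than the application polls `starved()`. What to do about starvation (for example, killing the offending thread) is left to the caller.

```python
import time

class Canary:
    """Heartbeat that a low-priority 'canary' task updates whenever it runs."""

    def __init__(self, max_silence, clock=time.monotonic):
        self.max_silence = max_silence   # seconds the canary may stay blocked
        self._clock = clock              # injectable so the logic can be tested
        self._last_beat = clock()

    def beat(self):
        """Called by the low-priority canary each time it gets the CPU."""
        self._last_beat = self._clock()

    def starved(self):
        """Called by the monitor: has the canary been silent for too long?"""
        return self._clock() - self._last_beat > self.max_silence

if __name__ == "__main__":
    canary = Canary(max_silence=0.05)
    canary.beat()
    time.sleep(0.1)                      # pretend a real-time thread hogged the CPU
    print("canary starved:", canary.starved())
```

The monitor must itself run at a priority above the threads it polices, otherwise it is starved along with the canary.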
Running interrupts in process context permits interrupt handlers to acquire blocking locks, which in turn allows critical sections to be preempted, which permits extremely fast real-time scheduling latencies. In addition, the -rt patchset permits real-time application developers to select the real-time priority at which interrupt handlers run. By running only the most critical portions of the real-time application at higher priority than the interrupt handlers, the developers can minimize the amount of code for which “great responsibility” must be shouldered.
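As a rough sketch of this priority split, the Python below picks priorities around the interrupt-handler threads. Several things here are assumptions rather than facts from this article: that the handler threads run under SCHED_FIFO at priority 50, that the usable SCHED_FIFO range is 1 to 99, and the helper names `plan_priorities` and `apply_fifo_priority`, which are made up for illustration. Check the actual handler priorities on your own system before relying on any of these numbers.

```python
import os

ASSUMED_IRQ_PRIORITY = 50   # assumption: priority of the interrupt-handler threads

def plan_priorities(irq_priority=ASSUMED_IRQ_PRIORITY, rt_min=1, rt_max=99):
    """Pick priorities for the critical and ordinary parts of an application.

    Only the critical part is placed above the interrupt handlers; the rest
    of the real-time application stays just below them.
    """
    if not rt_min < irq_priority < rt_max:
        raise ValueError("no room above and below the interrupt handlers")
    return {
        "critical": irq_priority + 1,    # may preempt interrupt handlers
        "irq_handlers": irq_priority,
        "ordinary": irq_priority - 1,    # preempted by interrupt handlers
    }

def apply_fifo_priority(priority, pid=0):
    """Run process `pid` (0 means the caller) under SCHED_FIFO at `priority`.

    Linux-only, and normally requires root or CAP_SYS_NICE.
    """
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))

# Example use in the critical part of an application (not executed here):
#     apply_fifo_priority(plan_priorities()["critical"])
```

Keeping the "critical" set small limits how much code can starve interrupts, which is exactly the code for which the "great responsibility" applies.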
However, preempting critical sections can lead to priority inversion, as described in the next section.
Comments
a question
I have a question about the "interrupt" described in Figures 6-8.
Could you tell me whether this kind of interrupt is handled on a single CPU? That is, from the point where the CPU catches an INTn and runs the top-half instructions through to dealing with the blue rectangle (maybe a bottom-half softirq), is all of this executed by one CPU?
waiting for your explanation!
thank you!
Threaded interrupts
There is a small portion of code that happens in the "top half", or hard irq context. On a non-PREEMPT_RT system the actual interrupt handler code would also execute in hard irq context. However, in PREEMPT_RT, the handler instead executes at process level in a kernel thread executing at real-time priority.
If this handler uses a bottom half, or softirq, then the softirq will be scheduled as another kernel thread, also executing at real-time priority.
The softirq interface is such that the softirq handler executes on the same CPU where the raise_softirq() request ran. Normally the system would be configured so that the hard irq and irq handler ran on the same CPU as well. (I believe that it can be configured otherwise, but I don't know of a good reason to do so.)
Great article, really interesting stuff
In addition, there are real-time audio systems, SIP servers and object brokers...
Can you give an example of rt audio/sip/object broker software/projects?
Also, has the -rt patchset had any impact on networking in Linux? E.g., latency, iptables traversal time, etc.
Would a standard program, e.g. X11, have a performance benefit on -rt compared to a non-rt system?
Examples of RT audio, SIP, object brokers...
There are a number of open-source audio projects. Two that come to mind immediately are JACK and PulseAudio, both of which were enthusiastic about testing out the -rt patchset. The only RT SIP servers that I am aware of are proprietary, ditto with object brokers.
There has been some effect of -rt on networking, but many real-time applications use lower-level protocols (such as UDP) or special transports (such as InfiniBand) in order to retain greater control over latency. That said, there are special real-time protocols, such as the DDS suite.
Usually, real-time operating systems are designed for responsiveness, and usually give up throughput performance in favor of responsiveness. For one look at this issue, see my recent OLS paper on real time vs. real fast.
szkolenia
Nice article
thanks :)
Glad you liked it!
;-)