SMP and Embedded Real Time

As embedded real-time applications start to run on SMP systems, kernel issues emerge.
Myth 3: Real Time Must Be Either Hard or Soft

There is hard real time, which offers unconditional guarantees, and there is soft real time, which does not. What else do you need to know?

As it turns out, quite a bit. There are at least four different definitions of hard real time. Needless to say, it is important to understand which one your users have in mind.

In one definition of hard real time, the system always must meet its deadlines. However, if you show me a hard real-time system, I will show you the hammer that will cause it to miss its deadlines, as shown in Figure 2.

Figure 2. Hard Real Time: But I Have a Hammer

Of course, this is unfair. After all, we cannot blame software for hardware failures that it did not cause. Therefore, in another definition of hard real time, the system always must meet its deadlines, but only in absence of hardware failure. This divide-and-conquer approach can simplify things, but, as shown in Figure 3, it is not sufficient at the system level. Nonetheless, this definition can be useful given restrictions on the environment, including:

  1. Interrupt rates.

  2. Cache misses.

  3. Memory-system overhead due to DMA.

  4. Memory-error rate in ECC-protected systems.

  5. Packet-loss rate in systems requiring networking.

Figure 3. Hard real time: sometimes system failure is not an option!

If these restrictions are violated, the system is permitted to miss its deadlines. For example, if a hyperactive interrupt system delivered an interrupt after each instruction, the appropriate action might be to replace the broken hardware rather than code around it. After all, if this degenerate situation must be accounted for, the latencies will likely become uselessly long. Alternatively, “diamond hard” real-time operating systems and applications might run with interrupts disabled, giving up compatibility with off-the-shelf software in order to gain additional robustness in face of hardware failure.

In yet another definition of hard real time, the system is allowed to miss its deadline, but only if it announces its failure within the deadline specified. This sort of definition can be useful in data-fusion applications. For example, a system might have a high-precision location sensor with unpredictable processing overhead as well as a rough-and-ready location sensor with deterministic processing overhead. A reasonable hard real-time strategy would be to give the high-precision sensor a fixed amount of time to get its job done, and if it fails to do so, abort its calculation, relying instead on the rough-and-ready sensor. However, one (useless) way to meet the letter of this definition would be to announce failure unconditionally, as illustrated by Figure 4. Clearly, a useful system almost always would complete its work in time (and this observation applies to soft real-time systems as well).

Figure 4. Hard real time: at least I let you know!

Finally, some define hard real time with a test suite: a system passing the test is labeled hard real time. Purists might object, demanding instead a mathematical proof. However, given that proofs can be subject to error, especially for today's complex systems, a test suite can be an excellent additional proof point. I certainly do not wish to put my life at the mercy of untested software!

This is not to say that hard real time is undefined or useless. Instead, “hard real time” is the start of a conversation rather than a complete requirement. You should ask the following questions:

  1. Which operations must provide hard real-time response? (For example, I have yet to run across a requirement for real-time filesystem unmounting.)

  2. What is the deadline? A ten-millisecond deadline is one thing; a one-microsecond deadline is quite another.

  3. What is to happen in case of hardware failure?

  4. What is the required probability of meeting that deadline? (For hard real time, this will be 100%.)

  5. What degradation of non-real-time performance, throughput and scalability can be tolerated?

One piece of good news is that real-time deadlines that once required extreme measures are now easily met with off-the-shelf hardware and open-source software, courtesy of Moore's Law.

But, what if your real-time application is to run on an embedded multicore/multithreaded system? How can you deal with both real-time deadlines and parallel programming?

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

a question

hpare's picture

I have a question about that "interrupt" discribed in figure 6-8.
Could you tell me if this kind of interrupt happens on one CPU, from cpu catch a INTn do tophalf instructions to deal with the blue rectangle(maybe a softirq() of bottomhalf),do all of these was executed by one CPU?
waiting for your explanation!
thank you!

Threaded interrupts

paulmck's picture

There is a small portion of code that happens in the "top half", or hard irq context. On a non-PREEMPT_RT system he actual interrupt handler code would also execute in hard irq context. However, in PREEMPT_RT, the handler instead executes at process level in a kernel thread executing at real-time priority.

If this handler uses a bottom half, or softirq, then the softirq will be scheduled as another kernel thread, also executing at real-time priority.

The softirq interface is such that the softirq handler executes on the same CPU where the raise_softirq() request ran, Normally the system would be configured so that the hard irq and irq handler ran on the same CPU as well. (I believe that it can be configured otherwise, but I don't know of a good reason to do so.)

Great article, really interesting stuff

Adeel's picture

In addition, there are real-time audio systems, SIP servers and object brokers...

Can you give an example of rt audio/sip/object broker software/projects?

Also, has the -rt patch set had any impact on networking in linux? e.g. latency, iptables traversal time, etc

Would a standard program, e.g. X11, have a performance benefit on -rt compared to a non-rt system?

Examples of RT audio, SIP, object brokers...

paulmck's picture

There are a number of open-source audio projects. Two that come to mind immediately are Jack and Pulse audio, both of which were enthusiastic about testing out the -rt patchset. The only RT SIP servers that I am aware of are proprietary, ditto with object brokers.

There has been some effect of -rt on networking, but many real-time applications use lower-level protocols (such as UDP) or special transports (such as Infiniband) in order to retain greater control over latency. That said, there are special real-time protocols, such as the DDS suite.

Usually, real-time operating systems are designed for responsiveness, and usually give up throughput performance in favor of responsiveness. For one look at this issue, see my recent OLS paper on real time vs. real fast.

szkolenia

szkolenia biznesowe's picture

Nice article

thanks :)

Glad you liked it!

Paul E. McKenney's picture

;-)

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix