Fun Little Tidbits in a Howling Storm (Re: Intel Security Holes)

Some kernel developers recently have been trying to work around the massive, horrifying, long-term security holes that have recently been discovered in Intel hardware. In the course of doing so, there were some interesting comments about coding practices.

Christoph Hellwig and Jesper Dangaard Brouer were working on mitigating some of the giant speed sacrifices needed to avoid Intel's gaping security holes. And, Christoph said that one such patch would increase the networking throughput from 7.5 million packets per second to 9.5 million—a 25% speedup.

To do this, the patch would check the kernel's "fast path" for any instances of dma_direct_ops and replace them with a simple direct call.

Linus Torvalds liked the code, but he noticed that Jesper and Christoph's code sometimes would perform certain tests before testing the fast path. But if the kernel actually were taking the fast path, those tests would not be needed. Linus said, "you made the fast case unnecessarily slow."

He suggested that switching the order of the tests would fix it right up. He added:

In fact, as a further micro-optimization, it might be a good idea to just specify that the dma_is_direct() ops is a special pointer (perhaps even just say that "NULL means it's direct"), because that then makes the fast-case test much simpler (avoids a whole nasty constant load, and testing for NULL in particular is often much better).

But that further micro-optimization absolutely *requires* that the ops pointer test comes first. So making that ordering change is not only "better code generation for the fast case to avoid extra cache accesses", it also allows future optimizations.

Regarding Linus' micro-optimization, Christoph explained:

I wanted to do the NULL case, and it would be much nicer. But the arm folks went to great lengths to make sure they don't have a default set of dma ops and require it to be explicitly set on every device to catch cases where people don't set things up properly, and I didn't want to piss them off....But maybe I should just go for it and see who screams, as the benefit is pretty obvious.

Linus also suggested that for Christoph's and Jesper's tests, the dma_is_direct() function should be sure to use the likely() call. And this was interesting because likely() is used to alert the compiler that a block of code is more "likely" to be run than another in order to optimize it. And, Christoph wasn't sure this was true. He said, "Yes, for the common case, it is likely. But if you run a setup where you say always have an iommu, it is not, in fact, it is never called in that case, but we only know that at runtime."

So Christoph was concerned about misleading the compiler and generating worse code. But Linus explained:

Note that "likely()" doesn't have any really huge overhead—it just makes the compiler move the unlikely case out-of-line.

Compared to the overhead of the indirect branch, it's simply not a huge deal, it's more a mispredict and cache layout issue.

So marking something "likely()" when it isn't doesn't really penalize things too much. It's not like an exception or anything like that, it's really just a marker for better code layout.

And that was it. Helpful hints in a time of desperate sorrow. These Intel hardware security holes are almost beyond belief. And we keep hearing about new batches of them being discovered all the time, or new exploits that require different workarounds from the ones already in place.

I'm sure Intel is working like mad to address all of this in future generations of its hardware. But the thing about security holes is that they are, by definition, hard to discover. Hardware manufacturers can poke and prod their products all they please and still miss the thing that a lone actor out in the world discovers one day by mistake. This time, it was Intel; next time, it'll be something else. Kudos to Intel for working with the OS people in spite of the public embarrassment to find good workarounds for these problems.

Note: if you're mentioned above and want to post a response above the comment section, send a message with your response text to

Zack Brown is a tech journalist at Linux Journal and Linux Magazine, and is a former author of the "Kernel Traffic" weekly newsletter and the "Learn Plover" stenographic typing tutorials. He first installed Slackware Linux in 1993 on his 386 with 8 megs of RAM and had his mind permanently blown by the Open Source community. He is the inventor of the Crumble pure strategy board game, which you can make yourself with a few pieces of cardboard. He also enjoys writing fiction, attempting animation, reforming Labanotation, designing and sewing his own clothes, learning French and spending time with friends'n'family.

Load Disqus comments