Working around Intel Hardware Flaws
Efforts to work around serious hardware flaws in Intel chips are ongoing. Nadav Amit posted a patch to improve compatibility mode with respect to Intel's Meltdown flaw. Compatibility mode is when the system emulates an older CPU in order to provide a runtime environment that supports an older piece of software that relies on the features of that CPU. The thing to be avoided is to emulate massive security holes created by hardware flaws in that older chip as well.
In this case, Linux is already protected from Meltdown by use of PTI (page table isolation), a patch that went into Linux 4.15 and that was subsequently backported all over the place. However, like the BKL (big kernel lock) in the old days, PTI is a heavy-weight solution, with a big impact on system speed. Any chance to disable it without reintroducing security holes is a chance worth exploring.
Nadav's patch was an attempt to do this. The goal was "to disable PTI selectively as long as x86-32 processes are running and to enable global pages throughout this time."
One problem that Nadav acknowledged was that since so many developers were actively working on anti-Meltdown and anti-Spectre patches, there was plenty of opportunity for one patch to step all over what another was trying to do. As a result, he said, "the patches are marked as an RFC since they (specifically the last one) do not coexist with Dave Hansen's enabling of global pages, and might have conflicts with Joerg's work on 32-bit."
Andrew Cooper remarked, chillingly:
Being 32bit is itself sufficient protection against Meltdown (as long as there is nothing interesting of the kernel's mapped below the 4G boundary). However, a 32bit compatibility process may try to attack with Spectre/SP2 to redirect speculation back into userspace, at which point (if successful) the pipeline will be speculating in 64bit mode, and Meltdown is back on the table. SMEP will block this attack vector, irrespective of other SP2 defenses the kernel may employ, but a fully SP2-defended kernel doesn't require SMEP to be safe in this case.
And Dave, nearby, remarked, "regardless of Meltdown/Spectre, SMEP is valuable. It's valuable to everything, compatibility-mode or not."
SMEP (Supervisor Mode Execution Protection) is a hardware mode, whereby the OS can set a register on compatible CPUs to prevent userspace code from running. Only code that already has root permissions can run when SMEP is activated.
Andy Lutomirski said that he didn't like Nadav's patch because he said it drew a distinction between "compatibility mode" tasks and "non-compatibility mode" tasks. Andy said no such distinction should be made, especially since it's not really clear how to make that distinction, and because the ramifications of getting it wrong might be to expose significant security holes.
Andy felt that a better solution would be to enable and disable 32-bit mode and 64-bit mode explicitly as needed, rather than guessing at what might or might not be compatibility mode.
The drawback to this approach, Andy said, was that old software would need to be upgraded to take advantage of it, whereas with Nadav's approach, the judgment would be made automatically and would not require old code to be updated.
Linus Torvalds was not optimistic about any of these ideas. He said, "I just feel this all is a nightmare. I can see how you would want to think that compatibility mode doesn't need PTI, but at the same time it feels like a really risky move to do this." He added, "I'm not seeing how you keep user mode from going from compatibility mode to L mode with just a far jump."
In other words, the whole patch, and any alternative, may just simply be a bad idea.
Nadav replied that with his patch, he tried to cover every conceivable case where someone might try to break out of compatibility mode and to re-enable PTI protections if that were to happen. Though he did acknowledge, "There is one corner case I did not cover (LAR) and Andy felt this scheme is too complicated. Unfortunately, I don't have a better scheme in mind."
Sure, I can see it working, but it's some really shady stuff, and now the scheduler needs to save/restore/check one more subtle bit.
And if you get it wrong, things will happily work, except you've now defeated PTI. But you'll never notice, because you won't be testing for it, and the only people who will are the black hats.
This is exactly the "security depends on it being in sync" thing that makes me go "eww" about the whole model. Get one thing wrong, and you'll blow all the PTI code out of the water.
So now you tried to optimize one small case that most people won't use, but the downside is that you may make all our PTI work (and all the overhead for all the _normal_ cases) pointless.
And Andy also remarked, "There's also the fact that, if this stuff goes in, we'll be encouraging people to deploy 32-bit binaries. Then they'll buy Meltdown-fixed CPUs (or AMD CPUs!) and they may well continue running 32-bit binaries. Sigh. I'm not totally a fan of this."
The whole thread ended inconclusively, with Nadav unsure whether folks wanted a new version of his patch.
The bottom line seems to be that Linux has currently protected itself from Intel's hardware flaws, but at a cost of perhaps 5% to 30% efficiency (the real numbers depend on how you use your system). And although it will be complex and painful, there is a very strong incentive to improve efficiency by adding subtler and more complicated workarounds that avoid the heavy-handed approach of the PTI patch. Ultimately, Linux will certainly develop a smooth, near-optimal approach to Meltdown and Spectre, and probably do away with PTI entirely, just as it did away with the BKL in the past. Until then, we're in for some very ugly and controversial patches.
Note: If you're mentioned above and want to post a response above the comment section, send a message with your response text to firstname.lastname@example.org.