Support for Persistent Memory
Persistent memory, meaning RAM that retains its state across boots, is still something of a specialty item in Linux. Dave Hansen recently remarked that it was a sorry state of affairs that user applications couldn't simply use persistent memory by default. They had to be specially coded to recognize and take advantage of it. Dave wanted the system to treat persistent memory as just regular old memory.
His solution was to write a new driver that would act as a conduit between the kernel and any available persistent memory devices, managing them like any other RAM chip on the system.
Jeff Moyer was skeptical. He pointed out that in 2018, Intel had announced a memory mode for its Optane non-volatile memory. Memory mode would allow the system to access persistent memory as regular memory, which was apparently exactly what Dave was talking about.
But Keith Busch pointed out that Optane's memory mode was architecture-specific, tied to Intel's Optane hardware, while Dave's code was generic and would work with any device containing persistent memory.
Jeff accepted the correction, but he still pointed out that persistent memory was necessarily slower than regular RAM. If the goal of Dave's patch was to make persistent memory available to user code without modifying that code, then how would the kernel decide whether to give fast RAM or slow persistent memory to the user software? That would seem to be a crucial question, he said.
Keith replied that faster RAM would generally be given preference over the slower persistent memory. The goal was to have the slower memory available if needed.
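The policy Keith describes, fast memory first with slower memory as a fallback, can be illustrated with a toy model. This is only a sketch of the idea and not how the kernel's allocator is implemented; the `Tier` class, the `allocate` function, and the sizes used are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Tier:
    """A pool of memory; hypothetical stand-in for DRAM or persistent memory."""
    name: str
    free_mb: int


def allocate(tiers: List[Tier], size_mb: int) -> List[Tuple[str, int]]:
    """Satisfy a request from tiers in preference order (fastest first).

    Returns a list of (tier name, MB taken). Slower tiers are touched only
    once the faster ones ahead of them are exhausted.
    """
    if size_mb > sum(t.free_mb for t in tiers):
        raise MemoryError("request exceeds total free memory")

    placed = []
    remaining = size_mb
    for tier in tiers:
        if remaining == 0:
            break
        take = min(tier.free_mb, remaining)
        if take:
            tier.free_mb -= take
            placed.append((tier.name, take))
            remaining -= take
    return placed


# Assumed sizes, listed fastest tier first.
tiers = [Tier("dram", 1024), Tier("pmem", 4096)]
first = allocate(tiers, 512)    # fits entirely in DRAM
second = allocate(tiers, 1024)  # uses up DRAM, then spills into PMEM
```

In this model the ordering of the `tiers` list is the entire policy: a request only reaches the slower pool when the faster one has nothing left to give.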
Dave also remarked that Intel's memory mode was wonderful. He had no criticism of it, and he said there were plenty of advantages to using memory mode instead of his patches. But he also felt that the two approaches were essentially complementary and could be used side by side on systems that supported memory mode.
He also added:
Here are a few reasons you might want this instead of memory mode:
1. Memory mode is all-or-nothing. Either 100% of your persistent memory is used for memory mode, or nothing is. With this set, you can (theoretically) have very granular (128MB) assignment of PMEM to either volatile or persistent uses. We have a few practical matters to fix to get us down to that 128MB value, but we can get there.
2. The capacity of memory mode is the size of your persistent memory. DRAM capacity is "lost" because it is used for cache. With this, you get PMEM+DRAM capacity for memory.
3. DRAM acts as a cache with memory mode, and caches can lead to unpredictable latencies. Since memory mode is all-or-nothing, your entire memory space is exposed to these unpredictable latencies. This solution lets you guarantee DRAM latencies if you need them.
4. The new "tier" of memory is exposed to software. That means that you can build tiered applications or infrastructure. A cloud provider could sell cheaper VMs that use more PMEM and more expensive ones that use DRAM. That's impossible with memory mode.
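The capacity arithmetic in points 1 and 2 can be worked through in a short sketch. The 128MB block size and the capacity rules come from Dave's description; the function names and the example sizes (16GB of DRAM, 64GB of persistent memory) are assumptions made purely for illustration.

```python
from typing import Tuple

BLOCK_MB = 128  # assignment granularity mentioned in point 1


def memory_mode_capacity_mb(dram_mb: int, pmem_mb: int) -> int:
    """Memory mode: DRAM serves as a cache, so usable capacity is PMEM only."""
    return pmem_mb


def split_pmem(pmem_mb: int, wanted_volatile_mb: int) -> Tuple[int, int]:
    """Assign PMEM to volatile use in whole 128MB blocks.

    The request is rounded down to a block boundary. Returns
    (volatile_mb, persistent_mb).
    """
    if wanted_volatile_mb < 0:
        raise ValueError("requested size must be non-negative")
    blocks = min(wanted_volatile_mb, pmem_mb) // BLOCK_MB
    volatile_mb = blocks * BLOCK_MB
    return volatile_mb, pmem_mb - volatile_mb


def patched_capacity_mb(dram_mb: int, volatile_pmem_mb: int) -> int:
    """With the patches: DRAM plus whatever PMEM was assigned as volatile."""
    return dram_mb + volatile_pmem_mb


# Hypothetical machine: 16GB of DRAM and 64GB of persistent memory.
dram_mb, pmem_mb = 16 * 1024, 64 * 1024

volatile_mb, persistent_mb = split_pmem(pmem_mb, 1000)  # rounds down to 896
print(memory_mode_capacity_mb(dram_mb, pmem_mb))        # 65536
print(patched_capacity_mb(dram_mb, volatile_mb))        # 17280
print(patched_capacity_mb(dram_mb, pmem_mb))            # 81920
```

The last line shows the case from point 2: with all of the persistent memory assigned to volatile use, the system sees DRAM and PMEM added together rather than PMEM alone.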
The discussion petered out inconclusively, but something like this patch will inevitably go into the kernel. System resources are becoming very diverse these days. Hooking up a bunch of wonky hardware and expecting reasonable behavior is starting to look more and more like a serious idea. It all seems to be leading toward a more open-sourcey version of the Internet of Things: a world where your phone and your laptop and your car and the chip in your head are all parts of a single general-purpose Linux system that hotplugs and unplugs elements based on availability in the moment, rather than on the specific proprietary concepts of the companies selling the products.