diff -u: What's New in Kernel Development
David Herrmann wanted to disable the virtual terminal subsystem in order to save space on a kernel that didn't need a VT. But, he still wanted to see kernel oops output for debugging purposes. The problem was that only the VT subsystem would display oops output—and he'd just disabled it.
No problem. David posted a patch to implement DRM-log, a separate console device that used the direct rendering manager and that could receive kernel oops output.
Over the course of a discussion about the patch, Alan Cox mentioned that there didn't seem to be anything particularly DRM-specific in David's code. It easily could exist at a yet more generic layer of the kernel. David agreed, but he said the DRM folks were more amenable to taking his patch, adding: "I've spent enough time trying to get the attention of core maintainers for simple fixes, I really don't want to waste my time pinging on feature-patches every 5 days to get any attention. If someone outside of DRM wants to use it, I'd be happy to discuss any code-sharing. Until then, I'd like to keep it here as people are willing to take it through their tree."
That's a fairly surprising statement—a bit of an indictment of existing kernel patch submission processes. There was no further discussion on that particular point, but I would imagine it got some folks thinking.
The rest of the current thread focused on some technical details about oops output, especially font size. David's code displayed oops output pixel by pixel, essentially defining its own font. But for extremely high-resolution monitors, such as Apple's Retina display, as Bruno Prémont pointed out, this could result in the oops output being too small for the user to see.
David's answer to this was to implement integer scaling. His font could be any integer multiple larger than the default. This seemed fine to Bruno.
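To make the idea of integer scaling concrete, here is a minimal sketch in Python. It is not David's code: the glyph, the `scale_glyph` name and the text-based pixel format are all made up for illustration. Each pixel of a bitmap glyph is simply repeated a whole number of times in both directions.

```python
# Hypothetical 5x5 glyph for the letter "T"; '#' marks a lit pixel.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]


def scale_glyph(rows, factor):
    """Enlarge a bitmap glyph by a positive integer factor."""
    if not isinstance(factor, int) or factor < 1:
        raise ValueError("factor must be a positive integer")
    scaled = []
    for row in rows:
        # Repeat every pixel `factor` times horizontally...
        wide = "".join(pixel * factor for pixel in row)
        # ...then repeat the widened row `factor` times vertically.
        scaled.extend([wide] * factor)
    return scaled


if __name__ == "__main__":
    for line in scale_glyph(GLYPH_T, 2):
        print(line)
```

Because the factor is a whole number, every source pixel maps to an exact block of output pixels, so no interpolation or blurring is needed. The trade-off is that sizes between the multiples are not available.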
Eugene Shatokhin posted some code to make use of Google's ThreadSanitizer. ThreadSanitizer detects a particular type of race condition that occurs when one thread tries to write to a variable while another thread either tries to read from or write to the same variable.
Eugene called his own code Kernel Strider. It collected statistics on memory accesses, function calls and other things, and sent them along to be analyzed by ThreadSanitizer. Eugene also posted a link to a page describing several race conditions that Kernel Strider had uncovered in the 3.10.x kernel series.
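The kind of race described above can be illustrated with a small Python sketch that scans a trace of memory accesses. Everything here is an assumption made for illustration: the `Access` record, the `find_races` function and the lock names are hypothetical, and the check is a deliberately simplified lockset-style rule, not the algorithm ThreadSanitizer or Kernel Strider actually uses. In particular, it ignores any ordering between threads, which a real detector has to account for.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical event record: which thread touched which address, whether the
# access was a write, and which locks the thread held at that moment.
Access = namedtuple("Access", "thread addr is_write locks")


def find_races(trace):
    """Return pairs of accesses that could race under a simple lockset rule."""
    races = []
    for a, b in combinations(trace, 2):
        same_variable = a.addr == b.addr
        different_threads = a.thread != b.thread
        at_least_one_write = a.is_write or b.is_write
        no_common_lock = not (set(a.locks) & set(b.locks))
        if (same_variable and different_threads
                and at_least_one_write and no_common_lock):
            races.append((a, b))
    return races


trace = [
    Access("T1", 0x1000, True, ("lock_a",)),
    Access("T2", 0x1000, False, ("lock_a",)),  # guarded by the same lock
    Access("T1", 0x2000, True, ()),
    Access("T2", 0x2000, False, ()),           # unguarded write vs. read
    Access("T1", 0x3000, False, ()),
    Access("T2", 0x3000, False, ()),           # two reads never conflict
]

if __name__ == "__main__":
    for first, second in find_races(trace):
        print(hex(first.addr), first.thread, second.thread)
```

Only the pair touching address 0x2000 is reported: the accesses to 0x1000 share a lock, and the accesses to 0x3000 are both reads.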
Waiman Long posted some code implementing qspinlock, a new type of spinlock that seemed to improve speed on very large multiprocessor systems. The idea behind the speed improvement was that a CPU would disable preemption when spinning for a lock, so it would save the time that might otherwise have been used migrating the looping thread to other CPUs.
The big problem with that kind of improvement is that it's very context-dependent. What's faster for one user may be slower for another, depending on each user's typical workload. Traditionally, there has been no clean way to resolve that issue, because there is no "standard" load under which to test the kernel. The developers just have to wing it.
But, they wing it pretty good, and ultimately things like new spinlock implementations do get sufficient testing to determine whether they'd be a real improvement. The problem with Waiman's situation, as he said on the list, is that the qspinlock implementation is actually slower than the existing alternatives on systems with only a few CPUs—in other words, for anyone using Linux at home.
However, as George Spelvin pointed out, the most common case is when a spinlock doesn't spin even once, but simply requests and receives the resource in question. And in that case, qspinlocks seem to be just as fast as the alternatives.
To qspinlock or not to qspinlock? Rik van Riel knew his answer and sent out his "Signed-off-by" for Waiman's patch. Its merits undoubtedly will continue to be tested and debated. But there are many, many locking implementations in the kernel. I'm sure this one will be used somewhere, even if it's not used everywhere.
Yuyang Du recently suggested separating the Linux scheduler into two independent subsystems: one that performed load balancing between CPUs and the other that actually scheduled processes on each single CPU.
The goal was lofty. With the scheduler performing both tasks, it becomes terribly complex. By splitting it into these two halves, it might become possible to write alternative systems for one or the other half, without messing up the other.
But in fact, no. There was almost universal rejection of the idea. Peter Zijlstra said, "That's not ever going to happen." Morten Rasmussen said the two halves couldn't be separated the way Yuyang wanted—they were inextricably bound together.
You never know though. Once upon a time, someone said Linux never would support any architecture other than i386. Now it runs on anything that contains silicon, and there's undoubtedly an effort underway to port it to the human brain. Maybe the scheduler can be split into two independent halves as well.