diff -u: What's New in Kernel Development

When you run a program as setuid, it runs with all the permissions of that user. And if the program spawns new processes, they inherit the same permissions. Not so with filesystem capabilities. When you run a program with a set of capabilities, the processes it spawns do not have those capabilities by default; they must be given explicitly.

This seemed unintuitive to Christoph Lameter, who posted a patch to change capability inheritance to match the behavior of setuid inheritance. This turned out to inspire some controversy.

For one thing, filesystem capabilities never were defined fully in the POSIX standard and appear only in a draft version of POSIX that later was withdrawn, so there can't really be any discussion of whether one form of capabilities is "more compliant" than another.

There are other problems, such as the need to make sure that any changes to capabilities don't break existing code and the need to make sure that any ultimate solution remains secure.

One problem with Christoph's idea was that it tied capability inheritance to the file itself, but as Serge Hallyn pointed out, capabilities were tied to both the file and the user executing the file. Ultimately, Christoph decided to adapt his code to that constraint, introducing a new capability that would list the inheritable capabilities available for the user to apply to a given file.

Yalin Wang recently made an abortive effort to have /proc/stat list all CPUs on a given system, not just the on-line ones. This would be a very useful feature, because many modern systems bring CPUs on- and off-line at a rapid pace. Often the number of CPUs actually in use is less important than the number available to be used.

He posted a patch to change /proc/stat accordingly, and David Rientjes pointed out that the /sys/devices/cpu file would be a better location for this. Andrew Morton also pointed out that /proc/cpuinfo would be a good location for this kind of data as well. So, there definitely was some support for Yalin's idea.

Unfortunately, it turned out that some existing code in the Android kernel relied on the current behavior of those files—specifically desiring the number of on-line CPUs as opposed to the total number of CPUs on the system. With an existing user dependent on the existing behavior, it became a much harder sell to get the change into the kernel. Yalin would have to show a real need as opposed to just a sensible convenience, so his patch went nowhere.

John Stultz has been maintaining some timekeeping test patches on GitHub for several years now, and he finally wanted to get them into the kernel, so he could stop porting them forward continually. The test would do a variety of things, involving changing the system time in some way designed to induce a problem. He asked what he should do to make the patches acceptable to the kernel folks.

There were a bunch of generally supportive comments from folks like Richard Chochran and Shuah Khan, but Shuah requested some fairly invasive changes that would tie John's code to other testing code in the kernel. John said he'd be happy to do that if it was required, but that one of his goals had been to keep the test files isolated, so any one of them could run independently of anything else on the system. In light of that, Shuah withdrew her suggestion.

Overall, it's not a controversial set of patches, and they'll undoubtedly get into the kernel soon.

One problem with making backups that guarantee filesystem consistency is that files on the system may change while they're being backed up. There are various ways to prevent this, but if another process already has an open file descriptor for a file, backup software just has to wait or risk copying an inconsistent version of the file.

Namjae Jeon posted some patches to address this problem by implementing file freezing. This would allow backup software to block writes to a given file temporarily, even if that file already had been opened by another process.

In addition to backup software, other tools like defragmenting software would benefit from Namjae's patches by preventing any changes to a file that was being reorganized on disk.

As Jan Kara pointed out, however, Namjae's code had some potential race conditions as well as other technical problems. Dave Chinner described the code as "terribly racy".

It's not clear what will happen with these patches. They seem to offer features that folks want, but the race conditions need to be resolved, and the code needs to be clean and clear enough that future fixes and enhancements will not be too likely to introduce new problems.