diff -u: What's New in Kernel Development

One ongoing question kernel developers face is the best way to delete data so no one else can recover it. Typically there are simple tools to undelete files that are deleted accidentally, although some filesystems make this easier than others.

Alexander Holler wanted to make it much harder for anyone to recover deleted data. He didn't necessarily want to outwit the limitless resources of our governmental overlords, but he wanted to make data recovery harder for the average hostile attacker. The problem as he saw it was that filesystems often would not actually bother to delete data, so much as they would just decouple the data from the file and make that part of the disk available for use by other files. But the data would still be there, at least for a while, for anyone to recouple into a file again.

Alexander posted some patches to implement a new system call that would first overwrite all the data associated with a given file, and only then make that disk space available for use by other files. Since the filesystem knew which blocks on the disk were associated with which files, he reasoned, zeroing out all relevant data would be a trivial operation.
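The overwrite-then-free idea can be approximated from user space, which may make the later objections easier to follow. The sketch below is not Alexander's system call, and the names `zero_fill` and `zero_and_unlink` are hypothetical. It writes zeros through the filesystem, so it only asks for the file's data to be overwritten; whether the original blocks are really rewritten in place is up to the filesystem and the drive.

```python
import os


def zero_fill(path, chunk_size=1 << 20):
    """Overwrite every byte of an existing file with zeros; return its size."""
    size = os.path.getsize(path)
    zeros = bytes(chunk_size)
    # "r+b" updates the file in place without truncating it, so the same
    # logical byte range is rewritten. Physical placement is not guaranteed.
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        # Ask the kernel to push the zeros out to the storage device.
        os.fsync(f.fileno())
    return size


def zero_and_unlink(path, chunk_size=1 << 20):
    """Zero a file's contents first, and only then remove its name."""
    size = zero_fill(path, chunk_size)
    os.unlink(path)
    return size
```

A call such as `zero_and_unlink("notes.txt")` zeroes the file and then removes it. Doing the same job inside the kernel, as the patches proposed, would let the filesystem act on the exact blocks it knows belong to the file.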

There were various objections. Alan Cox pointed out that hard drives have become so smart these days that it's hard to know exactly what they're doing in response to a given command. As he put it, "If you zero a sector [the disk is] perfectly entitled to set a bit in a master index of zeroed sectors, and not bother actually zeroing the data at all." Alan said that the disk simply had to accept user inputs and return the correct outputs, and everything happening behind the curtain was entirely up to the hardware manufacturer.
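Alan's point can be illustrated with a toy model. The class below is purely hypothetical and does not describe any real drive's firmware. It assumes a device that handles an all-zero write by recording the sector in an index, so reads return zeros while the old bytes remain on the media.

```python
SECTOR_SIZE = 512


class LazyZeroDisk:
    """Toy drive that records zeroed sectors in an index instead of erasing them."""

    def __init__(self, sector_count):
        self._media = [bytes(SECTOR_SIZE) for _ in range(sector_count)]
        self._zeroed = set()  # the "master index of zeroed sectors"

    def write(self, sector, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector long")
        if data.count(0) == SECTOR_SIZE:
            # Shortcut: note that the sector is zero, leave the media untouched.
            self._zeroed.add(sector)
        else:
            self._media[sector] = bytes(data)
            self._zeroed.discard(sector)

    def read(self, sector):
        # What the host sees through the normal interface.
        if sector in self._zeroed:
            return bytes(SECTOR_SIZE)
        return self._media[sector]

    def raw_media(self, sector):
        # What someone bypassing the normal interface might find.
        return self._media[sector]
```

After writing a secret to a sector and then zeroing it, `read` returns zeros but `raw_media` still returns the secret. The host sees correct outputs for its inputs, yet nothing has actually been erased.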

Russ Dill pointed out that a lot of user programs also made it more difficult to know exactly where a file's data was on disk. The vim program, for example, created temporary backup files, as did many other programs.

There was not much support for Alexander's patch. But I imagine the ability to delete files permanently will come up again at some point. For kernel features, though, the goal tends to be a thorough job, which in this case would mean outwitting even the government overlords' efforts to recover the data.

There's an ongoing debate about cgroups, between the group of people who want to implement cool features and the group of people who want to ensure security. The security people always win, but the debate is rarely simple, and sometimes a cool feature just needs to be rotated a little in order to match the security requirements.

For example, Aleksa Sarai wanted to add a cool and useful feature limiting the number of open processes allowed in a given cgroup. This would prevent certain types of denial-of-service attacks. The problem, as pointed out by Tejun Heo, was that an open process limit doesn't correspond to any actual limit on a real system. And there's a strong reluctance to put limits on anything that's not a true resource, like RAM, disk space, number of CPUs and so on.

On the other hand, as Austin Hemmelgarn said, process IDs (PIDs) were a genuinely finite resource on a real system, and Tejun agreed it might make sense to allow them to be limited within a cgroup. And because that could be used to limit the number of open processes, everyone could end up happy. But the feature had to be presented as limiting an actual system resource, rather than limiting a relatively arbitrary characteristic of a system.
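As a rough sketch of what a per-cgroup PID limit can look like from user space, the example below assumes the limit is exposed as a file named `pids.max` in the cgroup's directory, with the string `max` meaning "no limit". That is the interface the PID controller later took in mainline, not something stated in this discussion. The helper names and the cgroup path are hypothetical.

```python
import os


def set_pid_limit(cgroup_dir, limit):
    """Write a PID limit to <cgroup_dir>/pids.max; None means unlimited."""
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer or None")
    value = "max" if limit is None else str(limit)
    with open(os.path.join(cgroup_dir, "pids.max"), "w") as f:
        f.write(value + "\n")
    return value


def get_pid_limit(cgroup_dir):
    """Read the limit back; returns None when the cgroup is unlimited."""
    with open(os.path.join(cgroup_dir, "pids.max")) as f:
        text = f.read().strip()
    return None if text == "max" else int(text)
```

On a system with such a controller mounted, `set_pid_limit("/sys/fs/cgroup/mygroup", 64)` (an illustrative path) would cap the PIDs available to tasks in that group, which in turn bounds how many processes a fork bomb inside it could create.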

The tracing system has been showing signs of outgrowing its infrastructure lately, and Steven Rostedt recently posted some patches to fix that. Up until now, the tracing directory has lived in DebugFS. But as Steven said, tracing needed to let users create and remove directories, and DebugFS didn't support that. So, tracing had been using various hacks to get around it. Steven's solution was to create a new filesystem called TraceFS, specifically for the tracing system.

There were no major objections, but there were some technical obstacles to get past. In particular, Steven discovered that the perf system was hard-coded to assume that the tracing system used DebugFS, so that had to be fixed before TraceFS could go into the kernel.

Other issues came up; for example, Greg Kroah-Hartman suggested basing TraceFS on KernFS, and Steven considered that for a while. But it turned out that KernFS had a lot of cgroup-related complexity that TraceFS didn't need, and Al Viro remarked, "It's not a good model for anything, other than an anti-hard-drugs poster ('don't shoot that shit, or you might end up hallucinating this')." Ultimately, Steven decided against KernFS.