New Intel Caching Feature Considered for Mainline
These days, Intel's name is mud in various circles because of the Spectre/Meltdown CPU flaws and the similar hardware issues that continue to emerge. But a recent discussion between some Intel folks and the kernel folks was not related to those things. There was still some thrust-and-parry between kernel person and company person, but it seemed to have more to do with getting past marketing speak than with wrestling over what Intel is doing to fix its longstanding hardware flaws.
Reinette Chatre of Intel posted a patch for Cache Pseudo-Locking, a new feature that builds on Cache Allocation Technology (CAT), which "enables a user to specify the amount of cache space into which an application can fill". Among other things, Reinette offered the disclaimer, "The cache pseudo-locking approach relies on generation-specific behavior of processors. It may provide benefits on certain processor generations, but is not guaranteed to be supported in the future."
Thomas Gleixner thought Intel's work looked very interesting and in general very useful, but he asked, "are you saying that the CAT mechanism might change radically in the future [that is, in future CPU chip designs] so that access to cached data in an allocated area which does not belong to the current executing context wont work anymore?"
Reinette replied, "Cache Pseudo-Locking is a model-specific feature so there may be some variation in if, or to what extent, current and future devices can support Cache Pseudo-Locking. CAT remains architectural."
Thomas replied, "that does NOT answer my question at all."
At this point, Gavin Hindman of Intel joined the discussion, saying:
"Support in a current generation of a product line doesn't imply support in a future generation. Certainly we'll make every effort to carry support forward, and would adjust to any changes in CAT support, but we can't account for unforeseen future architectural changes that might block pseudo-locking use-cases on top of CAT."
And Thomas replied, "that's the real problem. We add something that gives us some form of isolation, but we don't know whether next generation CPUs will work. From a maintainability and usefulness POV that's not a really great prospect."
Elsewhere in a parallel part of the discussion, Thomas asked, "Are there real world use cases that actually can benefit from this [CAT feature] and what are those applications supposed to do once the feature breaks with future generations of processors?"
Reinette replied, "This feature is model-specific with a few platforms supporting it at this time. Only platforms known to support Cache Pseudo-Locking will expose its resctrl interface."
To which Thomas said, "you deliberately avoided to answer my question again."
This time Gavin replied, saying:
"Reinette's not trying to avoid the questions, we just don't necessarily have definitive answers at this time. Currently pseudo-locking requires manual setup on the part of the integrator, so there will not be any invisible breakage when trying to port software expecting pseudo-locking to new devices, and we'll certainly do everything we can to minimize user-space/configuration impact on migration if things change going forward, but these are unknowns. We are in a bit of chicken/egg where people aren't broadly using it because it's not architectural, and it's not architectural because people aren't broadly using it. We could publicly carry the patches out of mainline, but our intent for pushing the patches to mainline are to a) increase exposure/usage b) reduce divergence across people already using hacked versions, and c) ease the overhead in keep patches in sync with the larger CAT infrastructure as it evolves - we are clear on the potential support burden being incurred by submitting a non-architectural feature, and there's certainly no intent to dump a science-experiment into mainline."
Thomas replied, "Ok. So what you are saying is that 'official' support should broaden the user base, which in turn might push it into the architectural realm. I'll go through the patch set with this in mind."
Elsewhere, Thomas and Reinette went through a more technical exchange, and Reinette provided useful data points for understanding the value of the CAT feature itself. To all of this, Thomas said, "Very nice. Thank you so much for doing this. That kind of data is really valuable. My take away from this: All of the mechanisms are only delivering best effort and the real benefit is the reduction of average latency. The worst case outliers are in the same ballpark at seems." He also promised to review the next version of Intel's patch, which Reinette expected to send out within the week.
So as Intel tries to move past Spectre/Meltdown, it continues to collaborate with kernel developers on this sort of feature. At the same time, it's hard to forget that its hardware problems are not over, and that new CPU flaws continue to be discovered even now. Linus Torvalds has interpreted some of Intel's statements to mean that Intel does not intend to fix some of its hardware flaws in future generations of CPUs, which would force kernel developers, and developers of other operating systems, to work around those flaws for the foreseeable future. So there's a lot of tension even in the context of collaborating on relatively simple new features like CAT.