Some Thoughts on the Occasion of the NSA Linux Release

There are two things I am sure of after all these years: there is a growing societal need for high assurance software, and market forces are never going to provide it. Superficially, I'm going to offer a few comments on the technology underlying the NSA release. My real intent is to induce the Open Source community into building on this release--so when society wakes up to the fact that this stuff

When I say "assurance" I mean the process of acquiring confidence that your box isn't going to do something really ugly if you fire it up or put it on your net. Operational, or black-box assurance, is based on the observation that a certain class of boxes hasn't killed anybody yet, so you're probably safe. Reliance on this form of assurance has led to some pretty nasty surprises. Formal, or glass-box assurance, attempts to provide confidence from some combination of design characteristics, analysis and testing. This approach is still prone to nasty surprises, but they tend to be fewer--or at least easier to explain after the fact. Most high assurance work has been done in the area of kinetic devices and infernal machines that are controlled by stupid robots. As information processing technology becomes more important to society, these concerns spread to areas previously thought inherently harmless, like operating systems. Security is the most obvious example, along with availability of service in chaotic or hostile environments.

The NSA release incorporates an idea called Type Enforcement (TE) that was cooked up by Dick Kain and myself over 15 years ago, as part of a project to investigate high assurance systems. It's intended as a design characteristic to support analysis and testing, in aid of assurance. Our retrospective paper [1] covers those aspects, so I'll concentrate here on the short and long term development work that I think needs to be done.

Type Enforcement Explained

The shortest explanation I ever gave of TE was in response to a question by Butler Lampson: "It's the Lampson Access Matrix organized into equivalence classes for efficiency." The Lampson Access Matrix is a way of modeling the protection state of a system, to deduce implications of a particular policy. It is a two-dimensional matrix, with active entities (subjects, processes, threads) on one axis and passive entities (objects, files, segments) on the other. The entries in a [row, column] intersection define the operations that active entities can perform on passive ones. The matrix is defined in [2], which is one of the classics of computer security and is still worth study.

So much for theory, now for the implementation. Data objects are assigned an attribute called Type and processes an attribute called Domain. Conceptually (but not generally in the implementation) there is an internal matrix, one of whose axes is Type and the other Domain. For each Type,Domain pair, this matrix defines allowed access: read, write, execute for starters.

Two things were left fuzzy at the start and need work: rules for filling out the TE matrix, and the control and mechanization of communication between Domains. The former has been addressed by Dan Sterne and his colleagues in what they call Domain Type Enforcement (DTE), which develops a relationship between the file hierarchy and the matrix. You should understand this work before attempting extensions to the NSA release, just to avoid reinvention. The latter area, inter-Domain interaction, is the one most ripe for innovation.

What To Do With It (Short Term)

First, the one thing not to do with it: attempt to build a general-purpose, resource managing OS that automatically exports some property (safety, security, whatever) to any and all applications that happen to run on it. That's what the whole Orange Book effort was about. Been there, tried that, never got a T-shirt.

Thinking about TE on and off all these years has led me to an more refined definition: TE is a mechanism for integrating an application and its underlying resource manager in a way that reduces the chances of unintended ugliness. It's really a tool for building high assurance, special-purpose boxes. The way to gain immediate benefit from the NSA release is to use it to build armor-plated web servers, mail servers, data base servers, etc. Take the release, strip out every function not needed by the app and bolt the whole thing down using the TE matrix. ("What? No more "Bug in Sendmail allows unauthorized root access?" What's the world coming to?")

To do this you need to recognize ugly when you see it, and you need to know the structure of the application. The first is a policy question, and I can't help you much there. As far as the second is concerned, TE is a data flow control mechanism, and that's what needs to be defined. One structure TE was aimed at from the beginning is what Dick and I dubbed "assured pipelines". This refers to the case where some function (like crypto) absolutely positively has to happen between a predecessor operation (message prep) and a successor (message release). Finding such functions and laying out the pipelines is a necessary first step to defining the Types, Domains and initial form of the TE matrix.