Now, for MinorFs itself. MinorFs currently consists of two user-space filesystems. These filesystems are relatively simple Perl scripts implemented using the FUSE Perl module. Each filesystem has its own distinct task. FUSE (Filesystem in USErspace) is a kernel module that allows nonprivileged users to create their own filesystems.
MinorCapFs is at the core of MinorFs. Some time ago, the Linux directory and file-access API was extended with a set of new calls—openat(), mkdirat() and so on—that take an additional first argument, a file descriptor, which specifies from where relative paths should be resolved (these calls are to be standardized in a future version of POSIX). Given the fact that file handles in Linux can be communicated between processes and used as capabilities, it seemed like a good idea to look at the new directory handle calls and create or extend an LSM module so that directory handles could be passed as directory capabilities. The main goal was to use a directory handle as a capability to a directory that wouldn't disclose anything about parent directories.
After discussing my ideas with the AppArmor people, it was concluded that I should try to do as much as possible in user space, so I started designing MinorCapFs. The goals of MinorCapFs are to allow (unattenuated) decomposition, delegation and composition of subgraphs. MinorFs defines a sparse capability for each directory tree subgraph.
In order for you or your program to decompose the directory graph, each file and directory is given an extended attribute named cap. This extended attribute holds the full MinorCapFs path containing the sparse capability for the directory subgraph. Using any form of interprocess communication at your disposal, this path can be shared with any process or even with other users on the same system. The receiving user or process can create a symbolic link in another directory subgraph—for example, in order to make the delegation permanent.
Figure 1 shows how you could use the attr command to fetch the cap attribute, and how this attribute can be used as a short strong path or sparse capability to a directory or file. Normally, you should not use the command line for this but instead do the same thing from your program code. The getxattr function can be used to do the same thing that the attr command does in the example above.
Composition is almost as important as decomposition. Where the usage of extended attributes for decomposition may be strange and new, composition uses a construct that we probably are all much more comfortable with, the construct of using symbolic links. Next to decomposition, MinorCapFs provides the ability to create symbolic links in the same way that the filesystems we are used to do. Thus, MinorCapFs combines two basic functionalities for doing simple unattenuated decomposition of directory tree graphs and for doing composition of directory graphs from subgraphs.
You could say that MinorCapFs provides the simplest bare-level form of unattenuated capability-based access control. But, what holds the top-level capability? And, how are subgraphs delegated to individual processes? That's where a second filesystem comes in.
As MinorCapFs provides for tree graph decomposition and composition constructs, something has to pass sparse capabilities to processes in order for any process to become able to use MinorCapFs.
To see how we need to solve this, let's take a step back and look at the parallelisms we are trying to exploit. We are trying to make processes into a coarser-grained form of object that, just like objects in any OO language, have private data members. There are two ways to look at the process as such. First, there is the traditional view of nonpersistent processes where all state held by the process disappears when the system reboots or ends for any other reason. You could look at this form of delegation as a better alternative to the troublesome usage of temp directories. Temporary files, by default, would become private to the process until the process delegates them explicitly to other processes.
It is important to note that the temp provision of MinorViewFs is not a reference-counting garbage-collection system. Delegated subgraphs instantly will become invalid at the time the owning nonpersistent process dies.
MinorViewFs delegates subgraphs to individual processes by means of two symbolic links under /mnt/minorfs/priv (Figure 2). Each process reading these symbolic links will have a completely different set of subgraph sparse capabilities delegated to it. The second symbolic link /mnt/minorfs/priv/tmp points to the temporary subgraph described above.