Making Root Unprivileged

Mitigate the damage of setuid root exploits on your system by removing root's privilege.

There was a time when, to use a computer, you merely turned it on and were greeted by a command prompt. Nowadays, most operating systems offer a security model with multiple users. Typically, the credentials you present at login determine the amount of privilege that programs acting upon your behalf will have. Everyday tasks can be accomplished using unprivileged userids, minimizing the risks due to user error, accidental execution of malware downloaded from the Internet and so on. Any program needing to exercise privilege must be executed using a privileged userid. In UNIX, that is userid 0, the root user. Unfortunately, this means any software needing even just a bit of privilege can lead to a complete system compromise should it misbehave or be attacked successfully.

POSIX capabilities address this problem in two ways. First, they break the notion of all-or-nothing privilege into a set of semantically distinct privileges. This can limit the amount of privilege a task may have, so that, for example, it only is able to create devices or trace another users' tasks.

Second, the notion of privilege is separated from userids. Instead, privilege is wielded by the files (programs) a process executes. After all, users sitting at the keyboard just invoke programs to run on their behalf. It is the programs that actually do something. And, it is the programs that an administrator may entrust with privilege, based on knowledge of what the program does, who wrote it and who installed it.

POSIX capabilities have been implemented in Linux for years, but until recently, Linux supported only process capabilities. Because files are supposed to wield privilege, the lack of file capabilities meant that Linux required workarounds to allow administrators and system services to run with privilege. The POSIX capability rules were perverted to emulate a privileged root user.

Although file capabilities are now supported, the privileged root user remains the norm. In this article, I demonstrate a lazy prototype of a system with an unprivileged root.

POSIX Capabilities Overview

Each process has three capability sets:

  • The effective set (pE) contains the capabilities it can use right now.

  • The permitted set (pP) contains those that it can add back into its effective set.

  • The inheritable set (pI) is used to determine its new sets when it executes a new file.

Files also have effective (fE), permitted (pP) and inheritable (fI) sets used to calculate the new capability sets of a process executing it.

At any time, a process can use cap_set_proc() to remove capabilities from any of the three sets.

Capabilities can be added to the effective set only if they are currently in its permitted set. They never can be added to the permitted set. And, they can be added to the inheritable set only if they are in the permitted set or if CAP_SETPCAP is in pE. When a process executes a new file, its new capability sets are calculated as follows:

  • The inheritable set remains unchanged.

  • The new permitted set is filled with both the file permitted set (masked with a bounding set, but for this article, I assume that always is full) and any capabilities present in both the file and process inheritable sets.

  • Capabilities in the file permitted set will be available to the new process—an example use for this is the ping program. Ping needs only the capability CAP_NET_RAW in order to craft raw network packets. It is typically setuid-root, so all users can run it. By placing CAP_NET_RAW in ping's permitted set, all users will receive CAP_NET_RAW while running ping.

  • Capabilities in the file inheritable set are available to a process only if they also are in the process inheritable set—an example of this would be to allow some users to renice other user's tasks. Simply arrange (as I explain a bit) for CAP_SYS_NICE to be placed in their pI on login, as well as in fI for /usr/bin/renice. Now, ordinary users can run renice without privilege, and the “special” users can run renice, but no other programs, with CAP_SYS_NICE.

  • The effective set is by default empty, or, if the legacy bit is set (see The File Effective Set, aka the Legacy Bit sidebar), it is set to the new permitted set.