Kernel Korner - Kernel Mode Linux for AMD64

When user code runs inside the kernel, system calls become function calls, 50 times faster. How does that affect the performance of a real application, MySQL?

Kernel Mode Linux (KML) is a technology that enables the execution of user processes in kernel mode. I described the basic concept and the implementation techniques of KML on IA-32 architecture in my previous article, “Kernel Mode Linux”, which appeared in the May 2003 issue of Linux Journal (see the on-line Resources). Since then, I have extended KML to support AMD64, or x86-64, architecture, which is a viable 64-bit extension of the IA-32 architecture. In this article, I briefly describe the background of KML and then show the implementation techniques of KML for the AMD64 architecture. In addition, the results of a performance experiment using MySQL are presented.

The Problem of Protection by Hardware

Traditional OS kernels protect themselves by using the hardware facilities of CPUs. For example, the Linux kernel protects itself using a privilege level mechanism and a memory protection mechanism built in to CPUs. As a result, to use the services of the kernel, such as the filesystem or network, user programs must perform costly and complex hardware operations.

In Linux for AMD64, for example, user programs must use special CPU instructions (SYSCALL/SYSRET) to use kernel services. SYSCALL can be regarded as a special jump instruction whose target address is restricted by the kernel. To utilize system services or, in other words, to invoke system calls, a user program executes the SYSCALL instruction. The CPU then raises its privilege level from user mode to kernel mode and jumps to the target address of SYSCALL, which is specified by the kernel in advance. Then, the code located at the target address switches the context of the CPU from the user context to the kernel context by using the SWAPGS instruction. Finally, it executes the requested system service. To return to the user program, the SYSRET instruction reverses these steps.

Some problems exist, however, in this protection-by-hardware approach. One problem is system calls become slow. For example, on my Opteron system, SYSCALL/SYSRET is about 50 times slower than a mere function call/return.

One obvious solution to speed up system calls is to execute user processes in kernel mode. Then, system calls can be only the usual function calls, because user processes can access the kernel directly. Of course, it is dangerous to let user processes run in kernel mode, because they can access arbitrary portions of the kernel.

One simplistic solution to ensure safety is to use virtual machine (VM) techniques such as VMware and Xen. If user programs and a kernel are executed in virtual kernel mode, user programs can access the kernel directly. However, this protection-by-VM approach does not quite work, because the overhead of virtualization is considerable. In addition, although VM can prevent user programs from destroying the host system outside of the VM, it cannot prevent them from destroying the kernel inside the VM. It is unlikely that these difficulties could be solved even if CPUs, such as Intel's Vanderpool and AMD's Pacifica, provide better support for virtualization.

A recommended way to execute user processes in kernel mode safely is to use safe languages, also known as strongly typed languages. The recent advances in static program analysis, or type theory, can be used to protect the kernel from user processes. For example, many technologies already enable this protection-by-software approach, such as Java bytecode, .NET CLI, Objective Caml, Typed Assembly Language (TAL) and Proof-Carrying Code (PCC). I currently am implementing a TAL variant that is powerful enough to write an operating system kernel.

Based on this idea, I implemented Kernel Mode Linux (KML) for IA-32, a modified Linux kernel that can execute user processes in kernel mode, called kernel-mode user processes. My previous article described KML for IA-32. Since then, I have implemented KML for AMD64, because AMD64 has come into widespread use as a possible successor to IA-32. Interestingly, in spite of the similarities between IA-32 and AMD64, the implementation techniques of KML for these two architectures differ considerably. Therefore, I describe the basic concept, usage and implementation techniques of KML for AMD64 in the rest of this article.

How to Use KML for AMD64

KML is provided as a patch to the source of the original Linux kernel. To use KML, all you have to do is patch the original source of the Linux kernel with the KML patch and enable the Kernel Mode Linux option at the configuration phase, as you might do with other kernel patches. The KML patch is available from the KML site (see Resources).

In current KML, programs under the directory /trusted are executed as kernel-mode user processes. Therefore, if you want to execute bash in kernel mode, all you have to do is execute the following commands:

% cp /bin/bash /trusted/bin
% /trusted/bin/bash

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix