Introduction to the Xen Virtual Machine
This article is intended mainly for developers who are new to Xen and who want to know more about it. The first two sections, however, are general and do not deal with code.
The Xen VMM (virtual machine monitor) is an open-source project developed at the Computer Laboratory of the University of Cambridge, UK. It lets you create many virtual machines, each of which runs its own instance of an operating system.
These guest operating systems can be a patched Linux kernel, version 2.4 or 2.6, or a patched NetBSD or FreeBSD kernel. User applications run on the guest OSes as they are, without any change in code. Sun is also working on a Solaris-on-Xen port.
I have been following the Xen project closely for more than a year. My interest in Xen began after I read about it in the proceedings of the 2004 OLS (Ottawa Linux Symposium), and it grew after I heard an interesting lecture on the subject at a local UNIX group meeting.
Full virtualization has been done with hardware emulators; one popular open-source project is the Bochs IA-32 emulator, and another well-known one is QEMU. The disadvantage of hardware emulators is their poor performance.
The idea behind the Xen project, para-virtualization, is not new. The performance and efficiency it achieves, however, can be seen as a breakthrough: the overhead of running Xen is very small indeed, about 3%.
As was said in the beginning, Xen currently requires patching the kernel. Future processors will support virtualization in hardware, however, so that an unpatched kernel can run on Xen; for example, both Intel VT and AMD Pacifica processors will include such support.
In August 2005, XenSource, a commercial company that develops virtualization solutions based on Xen, announced at the Intel Developer Forum (IDF) that it had used Intel VT-enabled platforms with Xen to virtualize both Linux and Microsoft Windows XP SP2.
Xen with Intel VT or Xen with AMD Pacifica should be competitive with, if not superior to, other virtualization methods, and its performance should approach that of native operation.
In the same arena, VMware is a commercial company that develops the ESX server, a virtualization solution not based on Xen. VMware announced in early August 2005 that it would provide its partners with access to VMware ESX Server source code and interfaces under a new program called VMware Community Source.
A clear advantage of VMware is that it does not require patching the guest OS, which also allows the guest OS to be Windows. The VMware solution is probably slower than Xen, though, because it relies on shadow page tables, whereas Xen uses both direct and shadow page tables.
Xen is already bundled in some distributions, including Fedora Core 4, Debian and SuSE Professional 9.3, and it will be included in RHEL 5. The Fedora Project has RPMs for installing Xen, and other Linux distributions have prepared installation packages for Xen as well.
In addition, there is a port of Xen to IA-64, and an interesting Master's thesis, "HPC Virtualization with Xen on Itanium", already has been written on the topic.
Support for other processors is in progress. The Xen team is working on an x86_64 port, while IBM is working on Power5 support.
The Xen Web site has some versions available for download, both the 2.0.* version and the xen-unstable version, also termed xen-3.0-devel. You also can use the Mercurial source code management system to download the latest version.
I installed xen-3.0-devel because, at the time, the 2.0.* version did not have the AGP support I needed. This may have changed since my installation. I found the installation process to be quite simple: run make world and make install, update the bootloader configuration file, and you're ready to boot into Xen. You should follow the instructions in the user manual for best results.
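For reference, the bootloader update for a GRUB system amounts to adding an entry along these lines to menu.lst. The kernel version strings, paths and memory value below are placeholders and will differ on your system; check the user manual for the options your build produces:

```
title Xen / XenLinux
    kernel /boot/xen.gz dom0_mem=262144
    module /boot/vmlinuz-2.6-xen0 root=/dev/sda1 ro console=tty0
    module /boot/initrd-2.6-xen0.img
```

Note that Xen itself is loaded as the kernel, while the patched domain 0 Linux kernel (and its initrd, if any) are passed as modules.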
The protection model of the Intel x86 CPU is built from four privilege rings: ring 0 is for the OS and ring 3 is for user applications. Rings 1 and 2 are not used except in rare cases, such as OS/2; see the IA-32 Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture, section 4.5 (privilege levels).
In Xen, a "hypervisor" runs in ring 0, while guest OSes run in ring 1 and applications run in ring 3. The x86_64 port is a little different in this respect: both the guest kernel and applications run in ring 3 (see "Xen 3.0 and the Art of Virtualization", section 4.1, in the OLS 2005 proceedings).
Xen itself is called a hypervisor because it operates at a higher privilege level than the supervisor code of the guest operating systems that it hosts.
At boot time, Xen is loaded into memory in ring 0. It then starts a patched kernel in ring 1; this is called domain 0. From this domain you can create other domains, destroy them, migrate domains, set parameters for a domain and more. The domains you create also run their kernels in ring 1, and user applications run in ring 3. See Figure 1, which illustrates the x86 protection rings in Xen.
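Creating a guest domain from domain 0 is driven by a small configuration file together with the xm management tool. The sketch below is illustrative only; the kernel path, disk layout and domain name are placeholders, not a configuration that will work unmodified on any particular machine:

```
# /etc/xen/myguest -- example domain configuration (values are placeholders)
kernel = "/boot/vmlinuz-2.6-xenU"   # unprivileged guest kernel image
memory = 128                        # RAM for the domain, in MB
name   = "myguest"                  # domain name, as shown by "xm list"
disk   = ["phy:hda3,hda1,w"]        # export physical hda3 as the guest's hda1
root   = "/dev/hda1 ro"

# Typical management commands, run from domain 0:
#   xm create -c /etc/xen/myguest   # create the domain, attach its console
#   xm list                         # show running domains
#   xm migrate myguest <host>       # migrate the domain to another Xen host
#   xm destroy myguest              # tear the domain down
```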
Currently, domain 0 can be a patched 2.4 or 2.6 Linux kernel. According to the Xen developer mailing list, however, it seems that in the future, domain 0 will support only a 2.6 kernel patch. Much of the work of building domain 0 is done in the construct_dom0() function, in xen/arch/x86/domain_build.c.
The physical device drivers run only in the privileged domain, domain 0. Xen relies on Linux or another patched OS kernel for virtually all of its device support; the advantage of this is that it frees the Xen development team from having to write its own device drivers.
Using Xen on a processor that has a tagged TLB improves performance. A tagged TLB allows an address-space identifier (ASID) to be attached to each TLB entry. With this feature, there is no need to flush the TLB when the processor switches between the hypervisor and a guest OS, which reduces the cost of memory operations.
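The benefit of the ASID tag can be shown with a toy model. The sketch below is purely illustrative, not Xen or hardware code: each cached translation carries an ASID, so two address spaces can map the same virtual page to different frames and remain cached side by side, where an untagged TLB would have to flush on every switch:

```python
class TaggedTLB:
    """Toy model of a tagged TLB; all names here are illustrative."""

    def __init__(self):
        # Entries are keyed by (asid, virtual page), so translations
        # from different address spaces never collide.
        self.entries = {}

    def insert(self, asid, vpage, pframe):
        self.entries[(asid, vpage)] = pframe

    def lookup(self, asid, vpage):
        # A hit requires both the page number and the ASID to match,
        # so a stale entry from another address space cannot be used.
        return self.entries.get((asid, vpage))


tlb = TaggedTLB()
tlb.insert(asid=1, vpage=0x10, pframe=0x80)   # hypervisor mapping
tlb.insert(asid=2, vpage=0x10, pframe=0x9F)   # guest mapping, same vpage
assert tlb.lookup(1, 0x10) == 0x80            # both stay cached:
assert tlb.lookup(2, 0x10) == 0x9F            # no flush between switches
```

In an untagged TLB, the two insertions above would collide on virtual page 0x10, which is exactly why a full flush is needed on each address-space switch.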
Some manufacturers already offer this tagged TLB feature. For example, the "AMD64 Virtualization Codenamed 'Pacifica' Technology Secure Virtual Machine Architecture Reference Manual", published in May 2005, states that this architecture uses a tagged TLB.
Next up is an overview of the Xend and XCS layers. These layers are the management layers that enable users to manage and control both the domains and Xen. Following it is a discussion of the communication mechanism between domains and of virtual devices. The Xen Project source code is quite complex, and I hope this may be a starting point for delving into it.