Virtualization in Xen 3.0

Dive into the new Xen release and find out what it offers for paravirtualization, split drivers and Intel's new virtualization technology.

Editor's Note: This article has been updated since its original posting.

Virtualization has existed for over 40 years. Back in the 1960s, IBM developed virtualization support on a mainframe. Since then, many virtualization projects have become available for UNIX/Linux and other operating systems, including VMware, FreeBSD Jail, coLinux, Microsoft's Virtual PC and Solaris's Containers and Zones.

The problem with these virtualization solutions is low performance. The Xen Project, however, offers impressive performance results--close to native--and this is one of its key advantages. Another impressive feature is live migration, which I discussed in a previous article. After much anticipation, Version 3.0 of Xen recently was released, and it is the focus of this article.

The main goal of Xen is achieving better utilization of computer resources and server consolidation by way of paravirtualization and virtual devices. Here, we discuss how Xen 3.0 implements these ideas. We also investigate the new VT-x/VT-i processors from Intel, which have built-in support for virtualization, and their integration into Xen.

Paravirtualization

The idea behind Xen is to run guest operating systems not in ring 0, but in a higher and less privileged ring. Running guest OSes in a ring higher than 0 is called "ring deprivileging". The default Xen installation on x86 runs guest OSes in ring 1, termed Current Privilege Level 1 (or CPL 1) of the processor. It runs a virtual machine monitor (VMM), the "hypervisor", in CPL 0. The applications run in ring 3 without any modification.

About 250 instructions are contained in the IA-32 instruction set, of which 17 are problematic in terms of running them in ring 1. These instructions can be problematic in two senses. First, running the instruction in ring 1 can cause a general protection exception (GPE), which also may be called a general protection fault (GPF). For example, running HLT immediately causes a GPF. Other instructions, such as CLI and STI, cause a GPF only if a certain condition is met: the CPL is greater than the IOPL of the current program or procedure and, as a result, has less privilege.

The second problem occurs with instructions that do not cause a GPF but still fail. Many Xen articles use the term "fail silently" to describe these cases. For example, when POPF is executed outside ring 0, an attempt to change the interrupt flag (IF) of EFLAGS is silently ignored: the restored EFLAGS keeps the current IF value, and no fault is raised.

How does Xen handle these problematic instructions? In some cases, such as the HLT instruction, the instruction in ring 1--where the guest OSes run--is replaced by a hypercall. For example, consider sparse/arch/xen/i386/kernel/process.c in the cpu_idle() method. Instead of calling the HLT instruction, as eventually is done in the Linux kernel, we call the xen_idle() method. It performs a hypercall instead, namely, HYPERVISOR_sched_op(SCHEDOP_block, 0).

A hypercall is Xen's analog to a Linux system call. A system call is an interrupt (0x80) called in order to move from user space (CPL3) to kernel space (CPL0). A hypercall also is an interrupt (0x82). It passes control from ring 1, where the guest domains run, to ring 0, where Xen runs. The implementation of a system call and a hypercall is quite similar. Both pass the number of the syscall/hypercall in the eax register. Passing other parameters is done in the same way. In addition, both the system call table and the hypercall table are defined in the same file, entry.S.

You can batch some hypercalls into one multicall by building an array of hypercalls, using the multicall_entry_t struct, and then issuing a single hypercall, HYPERVISOR_multicall. This way, the number of entries to and exits from the hypervisor is reduced. Of course, reducing such interprivilege transitions when possible results in better performance. The netback virtual driver, for example, uses this multicall mechanism.

Here's another example: the CLTS instruction clears the task-switch (TS) flag in CR0. As with HLT, this instruction causes a GPF when issued in ring 1. But the CLTS instruction itself is not replaced by a hypercall. Instead, it is delegated to ring 0 in the following way. When it is issued in ring 1, we get a GPF, which is handled by do_general_protection(), located in xen/arch/x86/traps.c. Note that do_general_protection() is the hypervisor handler, which runs in ring 0. Under certain circumstances, this handler scans the opcode of the instruction the CPU faulted on. In the case of CLTS, where the opcode is 0x06, it calls do_fpu_taskswitch(0). Eventually, do_fpu_taskswitch(0) executes the CLTS instruction, but this time it is executed from ring 0. Note: _VCPUF_fpu_dirtied must be set to enable this.

Those who are curious about further details can look at the emulate_privileged_op() method in that same file, xen/arch/x86/traps.c. The instructions that may "fail silently" usually are replaced by others.



Comments

Novell virtualization information page

Nick Page

Understand me now!

Novell offers various networking and virtualization solutions, including SUSE Linux Enterprise, which has the added benefit of being able to support numerous operating systems, such as Linux, NetWare and Windows, in unison (by sharing the same physical servers) due to Novell's collaboration with Microsoft. Users are therefore provided with the best virtualization platform for Windows server consolidation. Novell's virtualization software also includes an integrated suite of tools for virtualization management and automation.

Here is a link to the Novell virtualization information page. I strongly believe that readers will benefit from the networking and virtualization information and support offered by our website.

I think this Nick Page guy

David McGloin

I think this Nick Page guy is right; I was just thinking the same myself. I checked out Novell's site, and it's filled with quality info. I love open source!

As per the comment on

Anonymous

As per the comment on FreeBSD Jail, Solaris Zones have very low overhead, usually <1%.


Anonymous

There is a typo in the first paragraph under "paravirtualization":

"The applications run in ring 4 without any modification."

I believe that should be "ring 3."

FreeBSD Jails have _no_

Anonymous

FreeBSD Jails have no performance impact! It's simply another technique with other uses.

Have you ever tried OpenVZ

Anonymous

Have you ever tried the OpenVZ project? It is much easier to use and allows you to run more virtual servers than Xen.

Easier, maybe, but if performance matters

JohanBV

Perhaps it's easier for home usage or simple installs on your own infrastructure. If you simply need a hosted, installed OS on a good connection, you should look for a VPS. My finding was that the OpenVZ servers I've rented were much slower than those from Xen providers. I recommend their Xen offerings.


I was hoping to see more on alternative operating systems

Ken Yee

The VT and Pacifica support was supposed to be the enabler for loading WinXP, etc., and running it inside Xen.

The hypervisor really needs to be integrated into the Linux kernel; it's too much of a pain to keep patching kernels as they're released...

I agree Xen can be hard to

mangoo

I agree Xen can be hard to set up manually.

On the other hand, the kernel and other needed binaries are shipped with most major distros.

Thanks for useful article

Dobrica Pavlinusic

I was wondering about Xen support on AMD, and this article was very useful. Keep up the good work.