Virtualization the Linux/OSS Way
I'm just a command-line kind of guy. I prefer not to have a GUI running on my servers taking up resources and potentially exposing security issues, and fitting with that mindset of favoring simplicity and economy, I'm also fairly frugal. I don't like to pay more than necessary. I hear peers in the IT field discuss the complexity of their environments and how much they pay for their virtualization solutions, shaking their heads in mock sympathy for each other, while bragging rights go to whoever has the biggest, most complex, most expensive environment. Meanwhile, I listen politely until I eventually chime in, “I pay nothing for my virtualization solution and manage it from the command line. Oh, and did I mention, our startup just started turning a profit.” Then come the blank stares.
Being a firm believer in the benefits of free, open-source software, I prefer software that is either pure FOSS or at least has an open-source version available, even if some premium features are available only in paid versions. I believe that over time, this produces superior software, and I have a safety net so that my fate is not entirely in the hands of one commercial entity. For all of these reasons, I use VirtualBox as the primary virtualization software in my organization. That choice was made before Sun, which had bought Innotek, the creator of VirtualBox, was in turn bought by Oracle. Oracle had also just bought another virtualization provider and had put years of considerable effort into developing a Xen-based solution, all of which threw the future of VirtualBox into limbo.
Many of the details on the direction Oracle/Sun will take with virtualization are still up in the air, but things look promising. Xen, while being elegant and providing performance with which other methods of virtualization can't compete, has proven to be a bit of a disappointment in terms of being able to support packaged solutions and keeping up with the features of other solutions. Since the purchase of Sun, there have been a number of 3.0.x maintenance releases of VirtualBox, the beta release (3.1) was formalized into 3.2.0, and there have been two maintenance releases since, getting us to 3.2.4 at the time of this writing. I have it from one of the developers working on the project that at 40,000 downloads a day, VirtualBox is still one of the most popular pieces of software in the Oracle/Sun portfolio. See Resources for a link to Oracle's VirtualBox Blog, where Oracle has officially announced VirtualBox as an Oracle VM product. Indeed, it wouldn't make sense to throw this valuable commodity in the trash, as VirtualBox is much younger yet compares favorably in many respects to the 800-pound gorilla of virtualization, VMware. VirtualBox is performant, reliable and simple to manage. To get all the features in the latest release of VirtualBox, you'd have to shell out some major dough for VMware's top-of-the-line product.
Our two primary VirtualBox hosts have two four-core CPUs per host, 48 gigs of RAM apiece, and ten SATA2 10K RPM drives each, in RAID 0+1, running the x86_64 kernel. Based on initial testing, I expected that we would be able to run about 10–15 virtual machines on each host, but to my surprise, we currently are at 20 guests apiece and counting. I give each virtual machine one CPU, which makes it stick to one CPU core at a time, and as little memory as possible, increasing those if performance demands it. Using this methodology, we've achieved an environment where, if you didn't know the machines were virtual, you couldn't tell they weren't on dedicated hardware. The 15-minute load average rarely tops 1.0 on one host and rarely tops 2.0 on the other, so we still have some headroom. I'm now thinking we may be able to run 30 or more machines per host, given enough RAM.
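If you want to watch that headroom from a script rather than by eye, here is a minimal sketch. It assumes a Linux host, where /proc/loadavg holds the 1-, 5- and 15-minute averages in its first three fields. The function name and the 2.0 ceiling are my own illustrative choices, not anything VirtualBox provides:

```shell
#!/bin/bash
# Sketch: report whether a host's 15-minute load average is under a ceiling.
# The 2.0 default is an illustrative assumption; pick your own.

# load15_ok LOADAVG_LINE [CEILING]
# LOADAVG_LINE is text in /proc/loadavg format, e.g. "0.42 0.61 0.87 1/345 6789".
load15_ok() {
    local line=$1 ceiling=${2:-2.0}
    # The third field is the 15-minute average; awk handles the float compare.
    echo "$line" | awk -v max="$ceiling" '{ exit !($3 < max) }'
}

# Example use on a live Linux host:
#   load15_ok "$(cat /proc/loadavg)" 2.0 && echo "room for another guest"
```

Load average alone won't tell you whether memory or disk is the real constraint, so treat a check like this as one signal among several.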
I run VirtualBox on Ubuntu Server, which we standardized on for legacy reasons. We currently are on the 3.0.x branch of VirtualBox and are testing upgrades to 3.2. Unfortunately, there still isn't a “non-GUI” package for VirtualBox on Ubuntu, so installing it without also installing Xorg and Qt packages involves the use of the --force-depends option to dpkg (or the equivalent on your system). The OSE (open-source edition) version is available via apt-get in the standard repositories, but I recommend downloading the latest version directly. I'm showing you the install I did of the latest stable version at the time I last upgraded the production systems, but to get all the latest features, like teleportation of virtual machines to other hosts, you'll need to go with the latest 3.2.x version:
loc=http://download.virtualbox.org/virtualbox/3.0.14/
wget $loc/virtualbox-3.0_3.0.14-58977_Ubuntu_karmic_amd64.deb
# or appropriate package for your distro/architecture
dpkg -i --force-depends \
    virtualbox-3.0_3.0.14-58977_Ubuntu_karmic_amd64.deb
dpkg will complain about missing dependencies. You can ignore most of them, but you will need to satisfy the non-GUI dependencies to have full functionality. After this, you will find that when you need to install or upgrade packages via the apt utility, it will complain about broken dependencies and refuse to do anything until you resolve the problem. I get around this by taking down all the virtual machines on a host, bringing up the essential ones on another host, uninstalling the virtualbox package, performing my upgrades or installs, and then re-installing. It's an extra step and takes a few minutes, but on your production virtualization hosts, this probably isn't something you will be doing terribly often, as such a host should have a minimum of required packages to start with:
dpkg -r virtualbox-3.0
apt-get update
apt-get upgrade
dpkg -i --force-depends \
    virtualbox-3.0_3.0.14-58977_Ubuntu_karmic_amd64.deb
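Because this cycle removes VirtualBox out from under any guests still running, it may be worth guarding it with a check. The sketch below assumes that VBoxManage list runningvms prints one line per running guest beginning with a double-quoted name, which is what I see on my hosts; verify the format on your version. The function name is mine:

```shell
#!/bin/bash
# Sketch: guard the remove/upgrade/re-install cycle so it only proceeds
# when no guests are running on this host.

# Succeeds only if "VBoxManage list runningvms" reports no VM lines.
# Returns 2 if VBoxManage cannot be found at all.
no_running_vms() {
    local running
    command -v VBoxManage >/dev/null || return 2
    # Assumed format: running guests are listed one per line as "name" {uuid};
    # any banner lines do not start with a double quote and are ignored.
    running=$(VBoxManage list runningvms | grep -c '^"')
    [ "$running" -eq 0 ]
}

# Example use (as the user that owns the guests):
#   no_running_vms || { echo "guests still running; aborting" >&2; exit 1; }
#   ...then continue with the dpkg and apt-get steps.
```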
Once things are installed, everything is exposed via the command line. In fact, the GUI is only a subset of what is available via the CLI. Currently, configuring port forwarding and the use of the built-in iSCSI initiator are possible only via the CLI (not via the GUI). Try typing this in and pressing Enter for some undocumented goodness that has saved me many hours and headaches:
The usage information available by typing partial commands is exhaustive, and the comprehensive nature of what is available allows for many custom and time-saving scripts. I've scripted all the repetitive things I do, from creating new VMs to bouncing ones that become troublesome after a month or so of uptime. Not only can you shell script, but there also is a Python interface to VirtualBox, and the example script, vboxshell.py, ships with the standard distribution.
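As one example of a CLI-only task, here is a rough sketch of NAT port forwarding in the 3.0.x style, where a rule is defined by three extradata keys. It only prints the commands, so you can review them before running anything. The assumptions: the guest's first adapter is in NAT mode (the guests later in this article use bridged networking, which needs no forwarding), and the adapter is the Intel 82540EM type, whose device name is e1000. The function name, rule name and ports are illustrative:

```shell
#!/bin/bash
# Sketch: print the three VBoxManage calls that define one NAT port-forward
# rule in the 3.0.x style (extradata keys). Nothing is executed here.
# Assumptions: the guest's first NIC is in NAT mode and is the Intel
# 82540EM type (device name "e1000"); use "pcnet" for PCNet adapters.

# forward_port VM RULE_NAME HOST_PORT GUEST_PORT [DEVICE]
forward_port() {
    local vm=$1 rule=$2 hostport=$3 guestport=$4 dev=${5:-e1000}
    local key="VBoxInternal/Devices/$dev/0/LUN#0/Config/$rule"
    echo "VBoxManage setextradata \"$vm\" \"$key/Protocol\" TCP"
    echo "VBoxManage setextradata \"$vm\" \"$key/GuestPort\" $guestport"
    echo "VBoxManage setextradata \"$vm\" \"$key/HostPort\" $hostport"
}

# Example: forward host port 2222 to the guest's SSH port.
forward_port vbox-vm-1 guestssh 2222 22
```

Pipe the output to sh once you are happy with it. The guest needs a full power cycle before new extradata settings take effect.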
A few notes on efficiency and performance: you'll do well to set up a “template” virtual machine installation of each of your operating systems that has your environment's configurations for authentication, logging, networking and any other commonalities necessary. You'll also want to throw in your performance enhancements, for instance, adding divider=10 to the GRUB kernel configuration, resulting in a line like this:
kernel /vmlinuz-2.6.18-164.el5 ro \
    root=/dev/VolGroup01/LogVol00 rhgb quiet divider=10
This will require some experimentation in your environment, but most systems are set to a 1,000Hz timer tick. Even on a host with idle guest systems, the number of context switches that occur simply to service those timer interrupts can result in high load on the host. This boot-time parameter divides the tick frequency by ten (to 100Hz), reducing the number of context switches by a factor of ten as well, and reducing host load greatly. This might not be suitable for all workload types, but running it on the guests for which it does not produce unacceptable performance will speed up the host overall and, with it, most of your guests.
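If you maintain several templates, a small filter can add the parameter for you. This is only a sketch: it assumes a GRUB legacy config in which each kernel entry sits on a single line, and the function name and example path are hypothetical. It reads the config on stdin and writes to stdout, so nothing is changed until you decide to copy the result into place:

```shell
#!/bin/bash
# Sketch: append divider=10 to every GRUB "kernel" line that lacks it.
# Reads a GRUB legacy config on stdin and writes the result to stdout,
# so nothing is modified in place. Assumes each kernel entry sits on a
# single line (no backslash continuation).

add_divider() {
    sed '/^[[:space:]]*kernel[[:space:]]/{
        /divider=/!s/[[:space:]]*$/ divider=10/
    }'
}

# Example use inside a mounted template image (path is hypothetical):
#   add_divider < /mnt/template/boot/grub/grub.conf > /tmp/grub.conf.new
```

Lines that already carry a divider= setting are left alone, so running the filter twice is harmless.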
Consider the example script in Listing 1. Like most of the scripts I write, it's quick and dirty, but if you follow this example, you'll have a basic infrastructure in place that you can use to provision and manipulate an almost unlimited number of virtual servers quickly. It takes no variables or input on the command line, supplying the information internally, but it easily could be modified to allow passing parameters via command-line switches. Let's say you need to bring up many virtual machines based on your base disk image. You will give the virtual machines sequential IPs and hostnames. The example script has a few prerequisites you'll have to satisfy. First, the disk image you are starting with must be a fixed size, not dynamically allocated. If you have a dynamic .vdi you want to use, first convert it using clonehd:
VBoxManage clonehd dynfile.vdi statfile.vdi --variant Fixed
Next, you'll need the mount_vdi script (see Resources), which is quite handy in itself, as it mounts the .vdi file as if it were an .iso or raw disk image. And, you'll need to be able to execute it via sudo to create and mount loopback devices. I have edited the mount_vdi.sh script to comment out the lines telling you to type end to unmount and exit the script, and the last few lines of the script that actually do so, moving those functions into the top-level script. You'll need to do likewise for the script in Listing 1 to work. Once you've tested the mount_vdi.sh script, change the path to it in the script in Listing 1 to the appropriate one on your host system.
Assumptions made for the purposes of this particular script include the following: you are running a fairly recent Ubuntu as host and guest with VirtualBox 3.0.x (what I run in production); the .vdis are in the current working directory where the script is located; there are no loop devices (/dev/loopx) on the system already; the root partition is the first one on the virtual disk; the hostname on the base vdi image is basicsys.example.com; it has one Ethernet interface, and the IP address is 220.127.116.11 with a default gateway of 18.104.22.168. The script will be less painful, especially during testing, if you set up your virtualbox user to have password-less sudo, at least temporarily. I've tested the script now on several systems and with slight variations in .vdi age and format, so I'm reasonably confident it will work for a wide variety of environments. Be warned; I have found that VirtualBox 3.2.x, which I am now testing, requires a few changes. Should you run into trouble and solve it, drop me an e-mail at the address in my bio, and let me know, so I can improve the script for future generations.
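Several of these assumptions are easy to check before you start. Here is a minimal preflight sketch; the file names are the ones Listing 1 uses, while the function name is mine. On some systems losetup -a reports nothing unless run as root, so you may need to run the check under sudo:

```shell
#!/bin/bash
# Sketch: check a few of the listed assumptions before running Listing 1.
# The file names base.vdi and mount_vdi/mount_vdi.sh are the ones the
# listing uses; adjust them if yours differ.

# preflight DIR -> prints each problem found, returns non-zero if any.
preflight() {
    local dir=${1:-.} problems=0
    if [ ! -f "$dir/base.vdi" ]; then
        echo "missing base image: $dir/base.vdi"; problems=1
    fi
    if [ ! -x "$dir/mount_vdi/mount_vdi.sh" ]; then
        echo "mount_vdi.sh not found or not executable"; problems=1
    fi
    # "losetup -a" lists loop devices already in use; expect none.
    if [ -n "$(losetup -a 2>/dev/null)" ]; then
        echo "loop devices already in use"; problems=1
    fi
    return $problems
}

# Example use (create_vms.sh is a hypothetical name for Listing 1):
#   preflight . && ./create_vms.sh
```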
Listing 1. Script to Create Multiple VMs
#!/bin/bash
# A quick and dirty script to create multiple virtual machines,
# give them unique hostnames and IP addresses, and culminate in
# bringing them on-line.

# name of the directory where we'll mount our vdi's
dir=temp
rootdir=`pwd`/$dir
# the basename for the vms
basename=vbox-vm-
# the file that contains the basic disk image
basevdi=base.vdi
# how many images are we making
number=2
# what subnet will these guests be going on
IPnetwork='10.7.7.'
gateway='10.7.7.1'
# the start of the address range we will use
baseIP=10
# amount of memory these guests will get in Mbytes
memory=512
# base VRDP port
baseRDP=16001
# hostname, IP address and gateway configured in the base image
# (dots are escaped because sed treats these as regular expressions)
oldname='basicsys'
oldIP='220\.127\.116\.11'
oldgw='18\.104\.22\.168'

# make sure the mount point exists (harmless if it already does)
mkdir -p $rootdir

counter=1
while [ $counter -le $number ]
do
  echo $basename$counter $basename$counter.vdi \
       $IPnetwork$baseIP $memory

  # clone the base image to a new fixed-size disk for this guest
  VBoxManage clonehd `pwd`/$basevdi \
     `pwd`/$basename$counter.vdi --variant Fixed

  # mount the first partition of the clone and personalize it
  sudo mount_vdi/mount_vdi.sh $basename$counter.vdi $rootdir 1
  sudo sed -i "s/$oldname/$basename$counter/g" $rootdir/etc/hosts
  sudo sed -i "s/$oldname/$basename$counter/g" $rootdir/etc/hostname
  sudo sed -i "s/$oldgw/$gateway/g" $rootdir/etc/network/interfaces
  sudo sed -i "s/$oldIP/$IPnetwork$baseIP/g" \
       $rootdir/etc/network/interfaces
  # empty the udev rule so the new MAC address comes up as eth0
  sudo rm $rootdir/etc/udev/rules.d/70-persistent-net.rules
  sudo touch $rootdir/etc/udev/rules.d/70-persistent-net.rules

  # unmount and release the loop devices mount_vdi.sh set up
  sudo umount $rootdir
  sudo losetup -d /dev/loop1
  sudo losetup -d /dev/loop0

  # create, configure and start the guest
  VBoxManage createvm --name $basename$counter --register
  VBoxManage modifyvm $basename$counter --pae on --hwvirtex on
  VBoxManage modifyvm $basename$counter --memory $memory --acpi on
  VBoxManage modifyvm $basename$counter \
     --hda `pwd`/$basename$counter.vdi
  VBoxManage modifyvm $basename$counter \
     --nic1 bridged --nictype1 82540EM --bridgeadapter1 eth0
  VBoxHeadless --startvm $basename$counter -p $baseRDP &
  sleep 5

  baseRDP=$((baseRDP + 1))
  baseIP=$((baseIP + 1))
  counter=$((counter + 1))
done
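Listing 1 hard-codes its settings. If you would rather pass them on the command line, one way to go about it is with the shell's built-in getopts. The sketch below is not part of the listing; the option letters (-n, -m, -b) are arbitrary choices of mine, and only three of the settings are covered:

```shell
#!/bin/bash
# Sketch: let command-line switches override the defaults that Listing 1
# hard-codes. The option letters (-n, -m, -b) are arbitrary choices.

number=2          # how many guests to create
memory=512        # MB of RAM per guest
basename=vbox-vm- # prefix for guest names

parse_args() {
    local opt OPTIND=1
    while getopts "n:m:b:" opt; do
        case $opt in
            n) number=$OPTARG ;;
            m) memory=$OPTARG ;;
            b) basename=$OPTARG ;;
            *) echo "usage: $0 [-n count] [-m memory_mb] [-b basename]" >&2
               return 1 ;;
        esac
    done
}

# In the real script this would be: parse_args "$@" || exit 1
parse_args -n 4 -m 1024
echo "would create $number guests named ${basename}N with ${memory}MB each"
```

Dropped in place of the fixed assignments near the top of the listing, this keeps the existing defaults when no switches are given.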
If you get this script working, you are well on your way to having the infrastructure in place to support a manageable, flexible, cost-effective, robust virtualization environment. Personally, I'm looking forward to getting 3.2.x in place and being able to teleport running machines between hosts to manage workloads in real time, from the command line, of course. Stay tuned: my next article will deal with the back-end shared storage (based on open protocols and free, open-source software, while being redundant and performant) to which I intend to connect my virtualization hosts, so that I will be able to run:
VBoxManage controlvm vbox-vm-3 \
    teleport --host vbox-host-2 --port 17001
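That command is only the sending half. As I understand the documentation, the receiving host first needs a matching VM configured to wait for the incoming machine. The sketch below prints both halves without executing anything; I have not run this in production yet, so treat it as a starting point. It assumes VirtualBox 3.1 or later on both hosts, a target VM already defined with the same settings, and disk images on storage both hosts can reach. The function name is mine:

```shell
#!/bin/bash
# Sketch: print the two halves of a teleport. Nothing is executed.
# Assumes VirtualBox 3.1 or later on both hosts, a target VM that is
# already defined with matching settings, and disk images on storage
# both hosts can reach.

# teleport_plan SOURCE_VM TARGET_VM TARGET_HOST PORT
teleport_plan() {
    local src=$1 dst=$2 host=$3 port=$4
    echo "# on $host: make the target wait for an incoming machine"
    echo "VBoxManage modifyvm $dst --teleporter on --teleporterport $port"
    echo "VBoxHeadless --startvm $dst &"
    echo "# on the source host: send the running machine across"
    echo "VBoxManage controlvm $src teleport --host $host --port $port"
}

teleport_plan vbox-vm-3 vbox-vm-3 vbox-host-2 17001
```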