Configuring a Linux/VMware GSX Server
Last time we talked, I described how you could utilize a single high-powered computer running Linux and VMware GSX server to host many virtual servers running Windows NT, 2000, 98, FreeBSD and so on. In this article, we will talk about how to configure a Red Hat Linux server for the VMware GSX environment, add additional network interface cards to reduce virtual server bottlenecks and add an external drive array to provide plenty of disk space for our SQL databases and VMs.
Out of the box, the Linux kernel comes configured to support a great many devices, filesystems and networking protocols. But only a small portion of the supported devices are needed for a typical GSX server, and some that aren't included in the default kernel may need to be added. For some of you, the stock kernel configuration may work fine for your GSX implementation. Depending on your needs or any special hardware requirements, however, you may have to resort to building a custom kernel. For the purposes of this article, we will be using Red Hat 7.2 with the 2.4.7-10smp kernel. If you are using a different distribution for your GSX server (SuSE, Caldera or TurboLinux), adapt the kernel modifications described here to your version.
In order to build our custom kernel, we must run make config from the command line. (If you prefer a GUI version, run make xconfig within a terminal window under X.) But first we should make a backup of our default working kernel and add a pointer to it in LILO or GRUB (whichever bootloader you are using). To back up our working kernel, we change to the /boot directory and copy the kernel image (vmlinuz-2.4.7-10) and System.map file (System.map-2.4.7-10) either to a backup directory or to renamed files in the current /boot directory (e.g., vmlinuz-2.4.7-10-old). Next, change to the /etc directory, open your lilo.conf file in vi or your favorite editor and make a new entry pointing to the backup kernel we just copied. To save time, we can copy the stanza for the original kernel and edit the copy so that it has its own label and points to the backup kernel.
We want to create a means of booting our server back to the stock kernel should our custom kernel behave badly or, worse, fail to boot at all. After saving the modified lilo.conf file, be sure to update your bootloader so it recognizes the changes. For example, type lilo at the command prompt and press Enter to update the LILO bootloader; running it once is enough, and it should print an Added line for each entry, including the new one. Now is a good time to reboot the server and try out the backup kernel. Once the server successfully boots to the backup kernel, it's time to move on and build our custom kernel.
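The backup steps above can be sketched as a pair of small shell functions. This is only an illustration, not a required procedure: the function names, the -old suffix, the linux-old label and the root device in the usage comment are assumptions, so substitute the values from your own lilo.conf. If your existing stanza has an initrd line, copy that into the new stanza as well.

```shell
#!/bin/sh
# Sketch only: back up the stock kernel and print a LILO stanza for it.
# The "-old" suffix, the label and the root device are assumptions.

# Copy the kernel image and System.map to "-old" names in the same directory.
backup_kernel() {
    boot_dir=$1   # normally /boot
    kver=$2       # kernel version string, e.g. 2.4.7-10
    cp -p "$boot_dir/vmlinuz-$kver"    "$boot_dir/vmlinuz-$kver-old" &&
    cp -p "$boot_dir/System.map-$kver" "$boot_dir/System.map-$kver-old"
}

# Print a lilo.conf stanza that boots the backup image.
backup_stanza() {
    kver=$1       # kernel version string
    root_dev=$2   # root filesystem device; copy it from your existing stanza
    cat <<EOF
image=/boot/vmlinuz-$kver-old
    label=linux-old
    read-only
    root=$root_dev
EOF
}

# Typical use, as root (the root device here is a placeholder):
#   backup_kernel /boot 2.4.7-10
#   backup_stanza 2.4.7-10 /dev/sda1 >> /etc/lilo.conf
#   /sbin/lilo -v
```

Keeping the backup under its own label means the stock kernel stays selectable at the LILO prompt no matter what happens to the default entry.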
Now to customize our kernel for a specific GSX environment. Personally I prefer to configure kernels from the command line rather than within an X session, but use the method that best suits you. To start the process, log in as root and change to the /usr/src/linux-2.4 directory. Type make config to step through the items the kernel can support. Carefully page through the list and disable support only for items you are absolutely sure won't affect the server's ability to function (e.g., sound, infrared, Toshiba laptop, joysticks, ham radio). Please note: this step is not necessary to configure the GSX server; I only add it because it makes for a smaller, quicker-loading kernel.
Okay, back to our kernel config file. If you make a mistake while paging through the config list, press Ctrl-C to quit without making any changes, then start over. As you look through the file, notice that some options are listed next to each supported device or protocol. Here's what they mean: Y(es) will add support into the kernel itself; N(o) means no support will be provided for this item; M(odular) means the item will be supported as a loadable module. The ?(Help) option is also available. Other options are specific to the functions of the item, such as the maximum memory a server can support.
As mentioned earlier, if you want to optimize the custom kernel, we can trim down the size of the footprint by disabling support for unnecessary devices. But don't get too hung up on disabling everything you don't feel is necessary. We don't want to neuter the kernel and reduce the server to a whimpering mass, only trim it down a bit for better performance. Again this is optional.
Once you are ready to make your changes, run make config again and carefully page through the list, adding or removing support as needed. For this article we want to add an external SCSI drive array to provide additional disk space for virtual servers and SQL databases. To do this, we must add support for the new hardware that will talk to the external disk array (e.g., RAID controllers, SCSI controllers). Also, we need to determine whether support for each new device should be built into the kernel or built as a module. Keep in mind that if you build a driver into the kernel, it stays in memory permanently, whereas a module can be loaded when needed and removed when it is not. For critical devices and features, such as RAID controllers and filesystems the server needs in order to boot, building support into the kernel is the safer choice, because modules cannot be loaded from a disk the kernel cannot yet read. For less frequently accessed devices, modular support may be a better choice.
To house some very large SQL databases that our virtual servers depend on, we will now add a Compaq SCSI drive cabinet that will provide us with an additional 100GB of disk space. Because we need very fast I/O to and from the external drive array, we will need to install a second RAID controller in our Linux server. Using the same server example (a Compaq DL580, 4GB RAM, RAID 5, etc.) from my previous article, we will now add a PCI RAID controller (a SmartRAID 428) and two PCI NICs to relieve the network bottleneck. To utilize the second RAID controller, we want to tell the kernel to include Multiple devices driver support in the kernel config file. We will do this by typing a y next to the option. Note: when planning to add hardware to a Linux server, it's a good idea to check the Linux HCL (Hardware Compatibility List) beforehand to be sure your new hardware will be supported. You also may try browsing through the kernel config file to see if your hardware is already supported by your kernel version.
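One quick way to check what a configuration run actually selected is to look the option up in the .config file that make config writes in the kernel source directory. The helper below is a minimal sketch; the function name config_state is hypothetical, and it assumes Multiple devices driver support corresponds to the CONFIG_MD symbol, as it does in the 2.4 kernels I have seen. It only understands y/m/n options, so anything with a numeric or string value is reported as n.

```shell
#!/bin/sh
# Sketch: report whether a kernel option is built in (y), modular (m)
# or not selected (n), by reading a kernel .config file.
config_state() {
    config_file=$1   # path to the .config file
    option=$2        # symbol name, e.g. CONFIG_MD
    # Print the value only when the line is exactly OPTION=y or OPTION=m.
    value=$(sed -n "s/^$option=\([ym]\)\$/\1/p" "$config_file")
    # Disabled options appear as "# OPTION is not set", so nothing matches.
    echo "${value:-n}"
}

# Example, run from the kernel source directory:
#   config_state .config CONFIG_MD
```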
All that is left to do now is build our new kernel by issuing the following command (as root):
make dep clean bzImage modules modules_install
By entering all the make commands on one line, you can walk away or do other things while the kernel compiles, because subsequent commands begin as soon as the previous one is completed. Of course, if you have any errors during the compile routines, you may want to run the commands one at a time in the same order, so that you can determine where the error(s) exist.
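If you would rather not retype the targets by hand when hunting for an error, a small loop can run them one at a time and stop at the first failure. This is a sketch only: run_targets is a hypothetical helper, the MAKE override exists purely so the loop can be tried without a kernel tree, and the target list in the usage comment is an assumption you should adjust to the targets you actually build.

```shell
#!/bin/sh
# Sketch: run build targets one at a time and stop at the first failure.
# MAKE can be overridden; it defaults to "make".
run_targets() {
    for target in "$@"; do
        echo "==> make $target"
        ${MAKE:-make} "$target" || {
            echo "make $target failed" >&2
            return 1
        }
    done
}

# Example, as root from the kernel source directory:
#   run_targets dep clean bzImage modules modules_install
```

Because the function returns as soon as one target fails, the last "==>" line on the screen names the step to investigate.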
Once your kernel is compiled, copy the new bzImage and System.map files to the /boot directory. For the new kernel file, change to the arch/i386/boot/ directory under the kernel source tree and issue this command: cp bzImage /boot/vmlinuz-2.4.7-10. Now do the same with the System.map file: change back to the top of the kernel source tree and run cp System.map /boot/System.map-2.4.7-10. Now all that is left to do is rerun /sbin/lilo from the command prompt and reboot.
We want to select the new kernel (if it is not the default) and ensure that the system starts up normally. If you watch the kernel output on your screen, you should see the new RAID controller identified by the kernel. If you missed it, log in as root and type grep RAID /var/log/dmesg to show the kernel messages regarding the new RAID device.
Provided we can boot to the new kernel and all is well with the newly added hardware, it's time to set up Linux to work with our new external array and network cards. Set up your RAID controller with the configuration that best suits your needs (e.g., RAID level, caching) and initialize the disks. For our GSX array, we will choose RAID 5, then initialize the drives and reboot so that we may begin the process of creating the new partition and filesystem.
For partitioning we use the fdisk utility, which is run from the shell prompt as the root user. To begin with fdisk, type the following at the command prompt:
fdisk /dev/<new device>
where new device is the name of the external array as detected by the earlier grep for RAID in dmesg. Next type m for the command menu. A list of available commands will appear on your screen. From the menu we choose n to create a new partition, then p for a primary partition. Since we are using the whole array as a single partition, we choose 1 for the partition number when prompted and accept the default first and last cylinders. New partitions default to type 83 (Linux); if you need to change the type, use t (the l command lists the available types). Finally, we choose w to write the partition table to disk and exit.
Once the new partition is written to disk, we are ready to create and format the new filesystem. For our purposes, we chose the ext2 filesystem to house SQL databases on the external drive array. Create an ext2 filesystem from the shell prompt by entering the following command:
mkfs -t ext2 /dev/new_vol
Here, new_vol stands for the partition we just created on the external array. After the filesystem creation is completed, we want to add a pointer to it in the /etc/fstab file. Be sure to create a mountpoint and list it in the fstab entry. Save the file and close the editor.
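For reference, an fstab entry has six fields: device, mount point, filesystem type, mount options, dump flag and fsck pass number. The sketch below builds such a line for an ext2 filesystem. The function name fstab_line is hypothetical, and the device and mount point in the usage comment are placeholders rather than the names your system will use.

```shell
#!/bin/sh
# Sketch: build an /etc/fstab line for the new ext2 filesystem.
fstab_line() {
    device=$1        # partition holding the new filesystem
    mount_point=$2   # directory where it will be mounted
    # Fields: device, mount point, type, options, dump flag, fsck pass
    printf '%s\t%s\text2\tdefaults\t1 2\n' "$device" "$mount_point"
}

# Typical use, as root (device and mount point are hypothetical):
#   mkdir -p /vmdata
#   fstab_line /dev/sdb1 /vmdata >> /etc/fstab
#   mount /vmdata
```

Mounting by mount point alone, as in the last line, is a handy check that the new fstab entry is correct before the next reboot depends on it.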
Now that we have all this new drive space, we want to point our virtual instances to the new array.
By utilizing multiple NICs, we can eliminate the potential bottleneck of having several Windows instances talking through only one or two network interfaces. What we have done is assign each virtual SQL server its own NIC because of its potentially high and frequent demand for data. We have allowed the web servers to share NICs, as their data demands usually are not as steep.
We will configure Linux to utilize the additional NICs by running netconfig or linuxconf. Enable each new NIC and determine whether its IP address is to be static or assigned by DHCP (servers typically use static IP addresses). Next, assign the module that the NIC is to use (e.g., eepro100, 3c509) and any remaining IP information, such as DNS and default gateway. Exit the utility and restart the network services with the command service network restart (or /etc/rc.d/init.d/network restart). You should see the new NICs come up on the screen, and you can verify they are indeed up by issuing ifconfig at the command prompt. If all of your NICs have come to life, test the connectivity of each in turn: disable all but one, ping another host, then repeat for the next NIC. Once you can ping from each NIC alone, re-enable all the NICs and you should be good to go.
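On Red Hat, the per-interface settings end up in ifcfg files under /etc/sysconfig/network-scripts, so it can help to know what one looks like for a statically addressed NIC. The following is a rough sketch rather than the exact output of netconfig or linuxconf: write_ifcfg is a hypothetical helper, and the addresses in the usage comment are placeholders. The driver assignment itself normally lives in /etc/modules.conf as an alias line and is not handled here.

```shell
#!/bin/sh
# Sketch: write a Red Hat style ifcfg file for a statically addressed NIC.
write_ifcfg() {
    dir=$1    # normally /etc/sysconfig/network-scripts
    dev=$2    # interface name, e.g. eth1
    ip=$3     # static IP address
    mask=$4   # netmask
    cat > "$dir/ifcfg-$dev" <<EOF
DEVICE=$dev
BOOTPROTO=static
IPADDR=$ip
NETMASK=$mask
ONBOOT=yes
EOF
}

# Typical use, as root (addresses are placeholders):
#   write_ifcfg /etc/sysconfig/network-scripts eth1 192.168.10.21 255.255.255.0
#   service network restart
#   ping -c 3 -I eth1 192.168.10.1   # send the pings out through eth1
```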
One more thing before we move on. We will want to make sure that the network interface cards are configured to match your network's configuration (speed, duplex, etc.). As root, run /sbin/mii-tool to view the NIC settings. You can refer to the mii-tool --help information for the switches if you need to change a NIC's settings. One caveat, however: the settings revert to the defaults when the server is rebooted. To make these changes stick, script the commands for each NIC and then, for example, call the script from the rc.local file.
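A script of that kind might look something like the sketch below, which forces the same media type on each interface you name using mii-tool's -F (force) switch. The function name force_media and the MII_TOOL override are assumptions made for illustration, and the media type and interface names in the example are placeholders; check mii-tool --help for the media names your version accepts.

```shell
#!/bin/sh
# Sketch: force speed and duplex on each listed NIC.
# MII_TOOL can be overridden; it defaults to /sbin/mii-tool.
force_media() {
    media=$1   # media type, e.g. 100baseTx-FD
    shift      # the remaining arguments are interface names
    for nic in "$@"; do
        ${MII_TOOL:-/sbin/mii-tool} -F "$media" "$nic" || return 1
    done
}

# Example line to place in a script called from rc.local:
#   force_media 100baseTx-FD eth0 eth1 eth2
```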
Finally, we will want to make new NIC assignments for the virtual servers. For example, you will probably want each SQL server to be assigned its own NIC, and you may be able to share a single NIC among less heavily utilized servers. Also, you can assign multiple NICs to a single virtual server for increased network throughput.
To assign the additional NICs to our virtual servers, we will want to launch the remote console from inside an X Window session or from a Windows system. After connecting to the (powered off) virtual Windows instance, we want to open the configuration editor and select the Ethernet option. Next, select the network cards you wish to make available to the virtual instance (e.g., eth0 and eth1). Save your changes and restart the instance. All that is left to do now is update the IP information in the virtual instances to reflect the additional network connections.
You should now have a working Linux GSX server with additional disk space and multiple NICs, which will help balance the network load across your virtual instances and network subnets. Be sure to document the changes you have made to the Linux server and virtual instances.
One additional point that I would like to make concerns backups. Besides a regular backup regime for the host server and virtual instances, I use the tar (tape archive) command to archive the instances individually. Once I have completed a working instance, I shut it down and tar the instance into an archive. This archive is then captured in the daily backups, but it also is copied to another storage server. In the case of an unrecoverable instance failure, I can quickly copy the tar file back to the Linux server and recover the virtual instance in the state it was in at the time of the archiving. With this method, you also can create "vanilla" builds of virtual servers to keep on hand should you need to add another IIS, SQL, Oracle, etc., instance to the network.
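As a rough sketch of that archive-and-restore routine, the two functions below wrap the tar commands involved. The function names and every path in the usage comments are hypothetical; the only real assumption is that each instance lives in its own directory and is powered off when archived.

```shell
#!/bin/sh
# Sketch: archive a powered-off virtual instance and restore it later.

# Pack the instance directory into a gzip-compressed tar file.
archive_instance() {
    vm_dir=$1    # directory holding the instance's files
    archive=$2   # tar file to create
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$vm_dir")" "$(basename "$vm_dir")"
}

# Unpack an archive under the given parent directory.
restore_instance() {
    archive=$1   # tar file created by archive_instance
    dest=$2      # parent directory to unpack into
    tar -xzf "$archive" -C "$dest"
}

# Typical use (all paths are hypothetical):
#   archive_instance /vmdata/sql1 /backup/sql1-vanilla.tar.gz
#   restore_instance /backup/sql1-vanilla.tar.gz /vmdata
```

Storing relative paths means the same archive can be unpacked on a different server or under a different parent directory, which is what makes the "vanilla" builds reusable.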
I hope you enjoy working with the VMware GSX product and find it as powerful and flexible as I have. With a little time on your part, you can create a server consolidation environment that will save your IT department or business thousands of dollars in hardware and support costs, in addition to simplifying system administration by reducing the number of physical servers.
VMware GSX Server: www.vmware.com
Linux HOWTOs and Linux Kernel:
Red Hat Linux: www.redhat.com
Jeffrey McDonald works as a systems engineer for a California-based Fortune 500 company. He has been working with Linux for the past five years and enjoys promoting Linux.