A Cluster System to Achieve Scalability and High-Availability with Low TCO

The authors describe a commercialized version of the Linux Virtual Server.
Product

The minimum configuration of our product consists of five machines: one central management station (EMS), two load balancers (EDS) and two web servers (EPS). Figure 2 shows one of the four network topologies supported by our product; the EMS station is not shown in the figure. When a web farm is created, it often must be integrated into a data center's existing network structure, including its IP address assignments. Consequently, the configuration of the web farm must be flexible enough to accommodate the pre-existing environment. Other possible configurations include a single router with either a private or a public address on one of its interfaces.

Figure 2. Our Network Architecture: VIP is a Virtual IP Address Shared by All the Machines in the Cluster

Note that the IP addresses and MAC addresses shown in the figure are for illustration only. When a client machine (203.116.20.20) sends an HTTP request to the virtual server at its VIP (203.118.100.80), the load balancer intercepts the request. The virtual IP address, shared by all the clustered machines, is configured on the loopback interface of each web server and on the eth0 interface of the load balancer. The loopback interface of each web server is configured not to reply to ARP requests for the virtual address, so the load balancer is the only machine that receives the request packet. Once the packet is received, the load balancer decides which web server should service the request based on a preset algorithm (e.g., round-robin or least connection). Let us assume Real Server 1 was chosen. The load balancer replaces the packet's destination MAC address (that of the load balancer) with the MAC address of Real Server 1 and pushes the packet onto the 192.168.2 subnet. When Real Server 1 receives the packet, it accepts it for processing because the destination IP address is configured on its loopback interface. The reply packet then takes the VIP as its source address and is sent back to the client directly, without going through the load balancer.
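
To make the dispatching step concrete, here is a minimal Python sketch of the direct-routing decision just described. It only models in user space what LVS does in the kernel; the MAC addresses, server names and the round-robin choice are illustrative assumptions, not the product's actual data:

from itertools import cycle

VIP = "203.118.100.80"

# Real servers on the 192.168.2 subnet; the MAC addresses are made up.
real_servers = cycle([
    {"name": "Real Server 1", "mac": "00:d0:00:00:00:01"},
    {"name": "Real Server 2", "mac": "00:d0:00:00:00:02"},
])

def dispatch(packet):
    """Forward a request for the VIP to a real server, round-robin.

    Only the destination MAC address is rewritten; the destination IP
    stays the VIP, which every real server holds on its loopback
    interface, so the chosen server accepts the packet as its own.
    """
    assert packet["dst_ip"] == VIP
    server = next(real_servers)
    packet["dst_mac"] = server["mac"]   # was the load balancer's MAC
    return packet                       # pushed onto the 192.168.2 subnet

request = {"src_ip": "203.116.20.20", "dst_ip": VIP,
           "dst_mac": "00:d0:00:00:00:aa"}  # load balancer's own MAC
print(dispatch(request))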

Moreover, because of the direct-routing option of LVS, the load balancer is not involved in transmitting the reply packet from the web server that processed the request. After Real Server 1 receives and processes the packet, the reply is sent directly back to the client via Router 2. The load balancer acts as a Layer 4 switch, and switching is done in kernel space rather than user space, minimizing the switching overhead. With these two design decisions, performance should scale almost linearly as more web servers are added.
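
For the least-connection policy mentioned earlier, the idea can be sketched in a few lines of Python. The connection counts below are invented for illustration; LVS maintains the real counters in kernel space:

# Pick the real server currently handling the fewest connections.
active = {"Real Server 1": 12, "Real Server 2": 7}

def least_connection():
    server = min(active, key=active.get)
    active[server] += 1          # the new connection is assigned to it
    return server

print(least_connection())        # -> 'Real Server 2'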

LVS is available as open source. We have added a group of features, called the "operational aid", to LVS to enhance ease of management and increase availability. We call this enhanced version of LVS a Single Virtual Machine (SVM). SVM makes it possible to manage multiple machines as if they were one single machine. Therefore, SVM = LVS + Coda + operational aid.

Let us look at each component of the cluster more closely. The EMS station is used to configure and monitor the cluster. Regardless of the number of machines in the cluster, this single management station provides centralized management for all of them. The EMS contains an Apache web server, a configuration engine, a monitoring program, LDAP and FTP. Each EDS unit runs LVS and a web server. The EDS units come as a pair to prevent a single point of failure: the stand-by load balancer takes over if the master load balancer fails. Extra functionality was added to provide close communication among the EMS, EDS and EPS units. In addition, the Coda filesystem was integrated to share the web content across all the web servers.
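
We do not detail here how the stand-by EDS detects a master failure, but the takeover logic can be sketched roughly as follows. The heartbeat probe, the timeout value and the simulated failure are all assumptions for illustration, not the product's actual mechanism:

import itertools
import time

HEARTBEAT_TIMEOUT = 3.0   # seconds without a heartbeat before takeover

# Simulated master: answers the first five probes, then goes silent.
_probes = itertools.count()

def master_alive():
    # Hypothetical placeholder for the product's real health check.
    return next(_probes) < 5

def take_over():
    # A real takeover would claim the VIP (e.g., with a gratuitous ARP)
    # and start balancing; here we only note the promotion.
    print("stand-by promoting itself to master")

def standby_loop():
    last_seen = time.monotonic()
    while time.monotonic() - last_seen <= HEARTBEAT_TIMEOUT:
        if master_alive():
            last_seen = time.monotonic()
        time.sleep(0.5)
    take_over()

standby_loop()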

In the following sections, each feature of the operational aid is discussed.

Configuration/Reconfiguration/Backup

Two initial installation methods are available. The first is to install and configure the system when all the machines are connected and on-line. When a support engineer configures the system on the EMS station, a set of configuration files is generated. These files are saved to a floppy disk, which the EDS unit reads during its first boot. If a new set of configuration files is introduced after that initial boot, the files are copied securely to the EDS unit via SSH. However the configuration files reach the EDS unit, the EDS then generates a set of configuration files for the EPS units. Those files in turn are copied to each EPS securely via SSH, and the EPS units configure themselves automatically.
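
The secure copy of generated files to each EPS unit could look roughly like the following Python sketch. The host addresses, file names and target path are hypothetical; only the use of SSH (here via scp) comes from the description above:

import subprocess

eps_units = ["192.168.2.101", "192.168.2.102"]        # hypothetical EPS addresses
config_files = ["eps-network.conf", "eps-httpd.conf"]  # hypothetical file names

for host in eps_units:
    for name in config_files:
        # scp copies each generated file over SSH to the EPS unit,
        # which then configures itself from the received files.
        subprocess.run(["scp", name, f"root@{host}:/etc/svm/{name}"],
                       check=True)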

The second method is called pre-configuration and is performed with only the EMS station, without connecting it to the rest of the system. When the configuration is specified on the EMS station, the generated configuration files are stored on a floppy disk. This makes pre-installation possible before anyone visits the final installation site: the support engineer can run the EMS facility at the office to create the configuration floppy, then ship a set of equipment to the installation site along with it. At the site, the EDS unit configures itself automatically when booted with the floppy disk. After the EDS unit comes on-line, it generates the necessary configuration files, which are then transmitted to the web servers.
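
Writing the pre-configuration floppy can be sketched as follows; the mount point, file name and file contents are hypothetical:

from pathlib import Path

FLOPPY = Path("/mnt/floppy")   # hypothetical floppy mount point

def write_preconfig(files):
    """files maps a configuration file name to its generated contents."""
    for name, contents in files.items():
        # The EDS unit reads these files from the floppy at first boot
        # and configures itself automatically.
        (FLOPPY / name).write_text(contents)

write_preconfig({"eds.conf": "vip=203.118.100.80\n"})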
