A Cluster System to Achieve Scalability and High-Availability with Low TCO

The authors describe a commercialized version of the Linux Virtual Server.


This article discusses a commercial implementation of the Linux Virtual Server (LVS), developed by Wensong Zhang (see Resources). The management cost of a web farm is a dominant portion of its total cost of ownership (TCO). In our product, numerous management features were added to the LVS to lower the TCO and to improve fault tolerance. These additions include automated configuration, automated updating, failure recovery and integration with the Coda filesystem for content replication.

Furthermore, to verify scalability, load testing was conducted by an independent lab that collected performance benchmarks. The results confirmed our expectation that system performance increases linearly as more web servers are added to the cluster. In spite of the modest hardware platform used in our version, the system was able to accommodate 400 million hits per day.

Web technology is utilized everywhere, and the demand for faster and more reliable web operations has become critical to most businesses. More specifically, the following features are essential for any web farm.

  • Scalability--It should accommodate an increase in traffic without a major overhaul.

  • 24 x 7 availability--Even a short downtime can cause millions of dollars in damage to e-commerce companies. A web farm should stay operational in spite of hardware and software failures.

  • Manageability--Management of the web farm must be effective, regardless of how it is designed and configured.

  • Cost-effectiveness--The majority of the TCO is due to the management cost. Therefore, the most effective way to lower the TCO is to reduce the management cost.

A high-performance web farm can be developed in several ways. One could deploy a single large high-powered server. This is straightforward and does not require significant management effort because there is only one machine to manage. This approach, however, has two major problems: cost and scalability. A large high-powered server could cost $1 million. The utilization of such an expensive machine would rarely be at the proper level; it would likely be either underutilized or overutilized. If it is underutilized, the cost justification would be difficult. On the other hand, if it were overutilized, it would be necessary to replace it with yet another server that is even more powerful and expensive. Consequently, this approach is not cost-effective with respect to scalability.

Another way to implement a high-performance web farm is to use cluster technology. By designing the system with scalability in mind, a scalable yet cost-effective solution can be built from inexpensive standard hardware boxes running Linux, an approach known as the Linux Virtual Server (LVS). To date, people with Linux and networking expertise have used LVS, and some commercial versions of LVS have been reported (see Resources).

Linux Virtual Server

The LVS architecture consists of one or more load balancers and multiple web servers that are loosely integrated. LVS comes in three options: NAT, tunneling and direct routing. The direct routing option is especially noteworthy because the load balancer does not become a bottleneck. In this architecture, both load balancers and web servers share the same virtual IP address and must be on the same subnet. The virtual IP address is a real IP address given to a single virtual server that may actually consist of one or more machines, as in Figure 1 (courtesy of Zhang).

Figure 1. Linux Virtual Server: Direct Routing Option Network Topology

The load balancer receives a connection request from a web client and forwards it to an appropriate web server for processing according to a preset policy. The reply packet then goes back to the client without passing through the load balancer. This architecture is well suited to web traffic because a request packet is typically short, while the reply packet tends to be very long. Also, according to Zhang, when direct routing is used the overhead of forwarding packets on the load balancer is minimal, and performance scales linearly as more servers are added.
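As a sketch, a direct-routing service like this might be configured with ipvsadm, the standard LVS administration tool. The addresses, scheduler choice and interface name below are illustrative assumptions, not our product's actual settings:

```shell
# On the load balancer: define the virtual service on the shared
# virtual IP, then register each real server in gatewaying mode
# (-g selects direct routing).  10.0.0.100 is a hypothetical VIP.
ipvsadm -A -t 10.0.0.100:80 -s wlc            # weighted least-connections policy
ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.1:80 -g
ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.2:80 -g

# On each web server: bind the VIP to a loopback alias so the server
# accepts packets addressed to it and replies directly to the client,
# bypassing the load balancer on the return path.
ifconfig lo:0 10.0.0.100 netmask 255.255.255.255 up
```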

Coda Distributed Filesystem

The Coda distributed filesystem is designed as a client/server model and is available from Carnegie Mellon University. It has numerous features that are suitable for a distributed filesystem. In our implementation, two of the web servers function as both Coda servers and clients, while the rest of the web servers function as Coda clients only. Two servers are provided to avoid a single point of failure. This choice was made to achieve:

  • Caching and high performance

  • High availability

  • Ease of management

Unlike NFS, Coda allows its clients to cache files so that subsequent access to the referenced files does not require fetching them from the Coda server. This allows quick access to the files without requiring a large storage area on each client. As more files are referenced and stored in the cache, access time to files improves. Neilson showed that web page requests follow the Zipf distribution; i.e., web page requests tend to be localized, and only a handful of pages are usually referenced. Therefore, the pages that are often referenced are likely to be in cache, and access to those pages does not require any extra time.
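The effect of this locality on caching can be illustrated with a small simulation. This is only a sketch, not one of our measurements: the page count, cache size and Zipf weights are assumed purely for illustration.

```python
import random
from collections import OrderedDict

random.seed(42)

PAGES = 1000        # distinct pages on the site (illustrative)
CACHE_SIZE = 50     # the client cache holds only 5% of the pages
REQUESTS = 10_000

# Zipf-like popularity: the page with rank k is requested with weight 1/k.
weights = [1.0 / rank for rank in range(1, PAGES + 1)]

cache = OrderedDict()   # a simple LRU cache of page ids
hits = 0
for page in random.choices(range(PAGES), weights=weights, k=REQUESTS):
    if page in cache:
        hits += 1
        cache.move_to_end(page)            # mark as most recently used
    else:
        if len(cache) >= CACHE_SIZE:
            cache.popitem(last=False)      # evict the least recently used page
        cache[page] = True

hit_rate = hits / REQUESTS
print(f"LRU hit rate with a {CACHE_SIZE}-page cache: {hit_rate:.2f}")
```

Even though the cache holds only 5% of the pages, the Zipf-skewed request stream lands mostly on the small set of popular pages, so the majority of requests are served from cache.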

Coda also brings high availability into the cluster. Coda has read/write replication servers that, as a group, provide files to clients and propagate updates within the group. This ensures high availability of data. When one server fails, the other servers can transparently take over for the clients.

The Coda filesystem is mounted uniformly among all the clients as /coda. This contrasts with NFS, which allows any number of remote directories to be mounted. Managing multiple remote directories is cumbersome because all the names and access points must be correctly specified for mounting. Coda's single, uniform mount name simplifies access to the filesystem for our web servers.
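The contrast can be sketched as follows; the export names, server hostnames and volume paths here are hypothetical, not our farm's actual configuration:

```shell
# With NFS, every remote directory is a separate mount that must be
# spelled out on each client, for example in /etc/fstab:
#   fileserver1:/export/htdocs   /var/www/htdocs   nfs  defaults  0 0
#   fileserver2:/export/images   /var/www/images   nfs  defaults  0 0

# With Coda, starting the client-side cache manager (venus) attaches
# the entire shared tree at the same well-known place on every machine:
#   /coda
# so each web server sees identical paths, e.g. /coda/htdocs, with no
# per-client mount table to maintain.
```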