A Cluster System to Achieve Scalability and High-Availability with Low TCO

The authors describe a commercialized version of the Linux Virtual Server.
Read-Only for the Web Server

On each EPS unit the web contents are read-only, even for the root user. A further level of security protects against accidental modification or deletion of the web contents: a user must be authenticated before gaining write access to the web contents stored in the Coda filesystem.

Performance and High Availability Testing

This section discusses the performance and high availability of our system.

We would like to verify 1) the scalability and 2) the high availability of our system. Systems similar to ours have been developed based on the LVS, but to date very few performance results have been reported, with the exception of "Linux on Carrier Grade Web Servers".

According to Zhang, the load balancer's overhead is minimal because its forwarding function executes in kernel space rather than user space. Furthermore, with the LVS direct-routing option, the load balancer cannot become a bottleneck because it is not involved in returning results to clients. Based on this design, performance should scale almost linearly as more web servers are added.
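The benefit of direct routing can be illustrated with a toy traffic model (our own sketch, not the product's code; the byte counts are assumed values). Under NAT-style forwarding, both the request and the much larger reply pass through the balancer; under direct routing, only the small inbound request does, so the balancer's load grows with request count rather than response size.

```python
# Toy model of load-balancer traffic: NAT forwarding vs. LVS direct
# routing (DR). In DR mode, real servers reply directly to clients,
# so only inbound requests traverse the balancer.

REQUEST_BYTES = 500      # typical HTTP request size (assumed)
REPLY_BYTES = 15_000     # typical page response size (assumed)

def balancer_traffic(n_requests: int, mode: str) -> int:
    """Total bytes passing through the balancer for n_requests."""
    if mode == "nat":
        # NAT: both request and reply traverse the balancer.
        return n_requests * (REQUEST_BYTES + REPLY_BYTES)
    if mode == "dr":
        # Direct routing: only the request traverses the balancer.
        return n_requests * REQUEST_BYTES
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    n = 1_000_000
    print("NAT bytes:", balancer_traffic(n, "nat"))
    print("DR bytes: ", balancer_traffic(n, "dr"))
```

With these assumed sizes, the balancer carries roughly 30 times less traffic in DR mode, which is why adding web servers does not saturate it.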

One factor that may affect performance is the Coda filesystem. The web servers come in two flavors: combined Coda server and client, or Coda client only. When a request is forwarded to the first type, all necessary web contents are available locally on that server. If a request is forwarded to the second type and the requested page is in the cache, the request can be honored immediately; otherwise, the page must be fetched from the Coda server, adding some delay. Initially, before the cache is populated with referenced pages, the Coda client needs to send many page requests to the server. As processing continues, however, the cache-hit ratio increases, and the delay due to page fetches becomes negligible.
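The cache warm-up effect described above can be sketched with a short simulation (our illustration; the delay values and page counts are assumptions, not measurements from the product). A client-only server pays a fetch penalty on first reference to a page, after which the page is served locally; once the working set is cached, the average delay approaches the local-serve time.

```python
import random

# Sketch of Coda client-side caching: first reference to a page incurs
# a fetch from the Coda server; subsequent references hit the local
# cache. Delays are illustrative, not measured.

LOCAL_DELAY = 1.0    # ms, serve from local cache (assumed)
FETCH_DELAY = 20.0   # ms, fetch from Coda server (assumed)

def simulate(pages: int, requests: int, seed: int = 0):
    """Return (hit_ratio, average_delay_ms) after `requests` requests."""
    rng = random.Random(seed)
    cache = set()
    hits = 0
    total_delay = 0.0
    for _ in range(requests):
        page = rng.randrange(pages)
        if page in cache:
            hits += 1
            total_delay += LOCAL_DELAY
        else:
            cache.add(page)           # fetch once, then keep it cached
            total_delay += FETCH_DELAY
    return hits / requests, total_delay / requests

hit_ratio, avg_delay = simulate(pages=100, requests=10_000)
```

With a 100-page working set and 10,000 requests, at most 100 requests miss, so the hit ratio is at least 99% and the average delay sits close to the local-serve time.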

Testing Methods

eTesting Labs was hired to conduct the performance testing. The results are presented in terms of requests per second and throughput. The network topology used is shown in Figure 4. At eTesting Labs, a collection of PCs running WebBench generated web page requests. When a PC made a page request, the client on that PC did not issue another request until the results of the current request were returned. Rather than displaying the results, the client recorded each round-trip time. Multiple PCs were necessary because each PC can generate only a limited number of requests per second. The product was stress-tested by increasing the number of WebBench clients from 1 to 120. In addition, the testing used two types of page requests: static pages and e-commerce pages. A static page request has no dynamic content, while an e-commerce page request invokes CGI scripts with SSL enabled. Today it is very rare to see web sites with only static pages; however, the static-page benchmark is useful as a baseline.
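The closed-loop client behavior described above can be sketched as follows (a minimal illustration of the pattern, not WebBench itself; handle_request is a hypothetical stand-in for the real HTTP round trip). Each client blocks until the current response returns, records the round-trip time, and only then issues the next request, so its offered load is bounded by the response latency.

```python
import time

# Closed-loop load-generator sketch: one outstanding request per client,
# round-trip time recorded instead of rendering the page.

def handle_request(url: str) -> str:
    """Stand-in for an HTTP round trip (network + server time)."""
    time.sleep(0.001)
    return "<html>ok</html>"

def closed_loop_client(url: str, n_requests: int) -> list[float]:
    rtts = []
    for _ in range(n_requests):
        start = time.perf_counter()
        handle_request(url)                        # block until reply
        rtts.append(time.perf_counter() - start)   # record round trip
    return rtts

rtts = closed_loop_client("http://vip.example.test/index.html", 10)
```

Because each client is rate-limited this way, driving the cluster toward saturation requires many clients in parallel, which is why the test scaled from 1 to 120 WebBench clients.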

Figure 4 shows the initial test configuration, which consisted of one EMS unit, two EDS units and two EPS units. EPS units were added incrementally as the test progressed. In the initial two-EPS configuration, each EPS served as both a Coda server and a Coda client. In the other test configurations (with three, four, six and ten EPSes, respectively), two of the EPSes served as both Coda server and Coda client, and the remaining EPSes served as Coda clients only.

Figure 4. Test Network Topology at eTesting Lab

Testing Results

The results of the e-commerce testing showed perfect or near-linear scaling. When the number of EPS units was increased from two to three, 1.5 times more requests were processed per second. When the number was doubled from two to four, 1.99 times more requests were processed per second. With 10 EPS units rather than two, the improvement was 4.65 times. Considering the additional overhead, these results are good. With 10 EPS units, nearly 4,600 requests per second were serviced at 18 million bytes per second. At a rate of 4,600 requests per second, more than 397 million requests could be serviced in 24 hours.
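The figures above can be checked with some back-of-the-envelope arithmetic (our own verification of the reported numbers, not additional data):

```python
# Sanity-check the reported e-commerce results.

requests_per_sec = 4600
per_day = requests_per_sec * 24 * 60 * 60   # requests in 24 hours
# 4,600 req/s * 86,400 s = 397,440,000 -- "more than 397 million"

# Speedups relative to the two-EPS baseline:
speedup_3 = 1.5      # 2 -> 3 EPS units (ideal: 1.5)
speedup_4 = 1.99     # 2 -> 4 EPS units (ideal: 2.0)
speedup_10 = 4.65    # 2 -> 10 EPS units (ideal: 5.0)

ideal_10 = 10 / 2
efficiency = speedup_10 / ideal_10   # fraction of perfect linear scaling
```

At ten EPS units the cluster retains about 93% of perfectly linear scaling relative to the two-unit baseline, consistent with the claim of near-linear behavior.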

Figure 5 and Figure 6 show the e-commerce testing results in terms of requests per second and throughput, respectively.

Figure 5: Requests Per Second for E-commerce Testing

Figure 6: Throughput for E-commerce Testing

Both the EDS and EPS fail-over tests worked as expected. First, the master load balancer was turned off. As a result, the stand-by load balancer took over, and no service interruption was observed. When the master was turned back on, it resumed the load-balancing function without any interruption. When one of the EPS units was turned off, it was removed from the available server pool, but there was no service interruption. When that EPS unit was turned back on, it sent a signal to the EDS to put itself back in the available server pool, again with no visible service interruption.
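The EPS fail-over behavior can be sketched as a server pool driven by health updates (our illustration of the mechanism, not the shipped code; the class and server names are hypothetical). A unit that goes down is dropped from the pool so no requests reach it; when it signals recovery, it rejoins and receives traffic again.

```python
# Sketch of director-side server-pool management for EPS fail-over.

class ServerPool:
    def __init__(self, servers):
        self.all = list(servers)
        self.alive = set(servers)
        self._next = 0

    def health_update(self, server: str, healthy: bool) -> None:
        if healthy:
            self.alive.add(server)       # unit signals it is back up
        else:
            self.alive.discard(server)   # removed from available pool

    def pick(self) -> str:
        """Round-robin over live servers; dead units are skipped."""
        for _ in range(len(self.all)):
            server = self.all[self._next % len(self.all)]
            self._next += 1
            if server in self.alive:
                return server
        raise RuntimeError("no servers available")

pool = ServerPool(["eps1", "eps2", "eps3"])
pool.health_update("eps2", healthy=False)    # eps2 turned off
picks = {pool.pick() for _ in range(6)}      # only eps1 and eps3 chosen
pool.health_update("eps2", healthy=True)     # eps2 rejoins the pool
```

Because requests are only ever dispatched to members of the live set, clients see no interruption when a unit leaves or rejoins; in-flight requests to the failed unit are the one case this simple model does not cover, which relates to the stateful fail-over item below.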

With up to 10 web servers, there was no significant performance degradation. However, dedicated Coda servers would probably be required at much higher loads; otherwise, access to Coda would become a bottleneck.

Although we added a number of improvements over LVS, a few areas remain that we would like to address:

  • Load balancer bottleneck

  • Stateful fail-over

  • Lost packets

  • Integration with backend servers

  • Load balancing other servers