High Availability Linux Web Servers

If a web server goes down, here's one way to save time and minimize traffic loss by configuring multiple hosts to serve the same IP address.

Imagine yourself as the System Administrator for a fairly large web site. It's 5:00 AM on a Monday morning. You're awakened by a page from Big Brother. One of three web servers has just dropped off the network. Suddenly, a third of your traffic is going unanswered. What can you do? The commute to the office isn't a short one, and by the time you get there, you'll already have dropped thousands of hits, which could mean lost revenue, decreased productivity or a missed deadline. Whatever the case may be, someone is going to be affected. As you begin your journey into work, you wonder how this problem could have been prevented.

In fact, a number of solutions are available, many of which require expensive hardware or software. This article outlines a simple and effective method of achieving the same functionality in a cost-effective manner. This method uses a router and the loopback interfaces of your Linux web servers. We achieve high availability by configuring multiple hosts to be capable of serving traffic for the same IP addresses at any given time. Conventionally, we think of virtual IP addresses as being assigned to Ethernet interfaces. However, no two Ethernet interfaces can share the same IP address. We're able to assign the same IP addresses to multiple hosts by binding them to loopback interfaces instead. For instance, a SYN packet, destined for one of these loopback interfaces, travels across the wire to a router that decides the next packet hop based on its routing table. The packet is then forwarded to the next hop—the Ethernet interface on one of many redundant web servers. Then, the packet is forwarded from the Ethernet interface to one of the configured loopbacks on the system. An ACK (acknowledgement) will travel along the same path in reverse. The packet originates on the loopback interface, is forwarded to the Ethernet interface, then back to the router to be sent on its journey back to the original host that sent the SYN packet. Again, the beauty of this scheme is the ability to configure multiple hosts with the same IP address bound to loopback interfaces. By doing so, we've enabled ourselves to redirect traffic for a particular IP address or even an entire subnet by simply changing a route in that last hop router. This saves time and minimizes traffic loss. The process can even be automated using simple shell scripts.

Configuring the Linux Kernel

The kernel must be configured to support IP aliasing. IP aliasing is the process of binding multiple IP addresses to a given network interface, thus creating “virtual” interfaces. Under Linux, interface names are assigned linearly. For example, the first loopback interface is called lo, the second lo:1, the third lo:2 and so on. You can see which interfaces are configured on your system by typing:

/sbin/ifconfig

Configure the kernel with support for TCP/IP, network aliasing and IP aliasing. Under Linux 2.0.x, this is accomplished by answering “yes” to the following kernel configuration options:

Network aliasing (CONFIG_NET_ALIAS) [Y/n/?] y
TCP/IP networking (CONFIG_INET) [Y/n/?] y
IP: aliasing support (CONFIG_IP_ALIAS) [Y/m/n/?] y

Our Network

Our fictitious network will consist of four machines, although you could support the same functionality with as few as two boxes or as many as you anticipate needing. Four boxes will allow us to serve a hefty amount of traffic and still allow plenty of room for growth. Having all four machines handling traffic for a single web site will provide some load balancing as well, using “round robin” DNS. If you ever exceed the capacity of your web servers, adding additional machines is a simple task.

We'll take the class C address 192.168.1.0 and apply a 27-bit subnet mask which will yield 8 subnets and 240 usable hosts.

Note that according to the RFC, the upper and lower subnets will not be usable. Some operating systems will not allow you to configure an interface using an address that falls into one of these subnets. Some routers require you to enable this feature implicitly. For example, Cisco requires that the router be configured with the command ip subnet-zero. This is implementation-dependent, although I have yet to see a UNIX or Microsoft-based host that had a problem utilizing all subnets. If you are unable to use all eight subnets or you are an RFC compliancy fanatic, this configuration will yield 6 subnets and 180 unique hosts.

Traffic can be spread across our 4 hosts for up to 30 different web servers quite easily. It also leaves us with four free subnets for future expansion. Using subnets allows traffic to be redirected from one machine to another with a few simple commands. However, your requirements may not call for an implementation as large as the one in our example. The same functionality can be achieved using host routes, so instead of the routing table having an entry for an entire subnet, the entry is for a single IP address using a 32-bit subnet mask. I'll try to explain the differences where applicable.

While here, we can use a subnet for the Ethernet interfaces of our web servers from our class C; namely, 192.168.1.1 for our router and 192.168.1.2, 192.168.1.3 and 192.168.1.4 for our web servers. Under Red Hat, this is done by editing the /etc/sysconfig/network-scripts/eth0 file to look something like this:

DEVICE=eth0
IPADDR=192.168.1.2
NETMASK=255.255.255.224
NETWORK=192.168.1.0
BROADCAST=192.168.1.31
ONBOOT=yes

You'll also want to edit the /etc/sysconfig/network file to configure the appropriate default route. Mine looks like this:

NETWORKING=yes
HOSTNAME=foohost.foo.com
DOMAINNAME=foo.com
GATEWAY=192.168.1.1
GATEWAYDEV=eth0
Interface configuration varies from distribution to distribution, so your mileage may vary.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState