High Availability Cluster Checklist
One of the greatest benefits of a high-availability cluster, which is ironically overlooked, is the ability to cleanly migrate services off a cluster member so you can perform routine maintenance without disrupting service to client systems. For example, this allows you to upgrade your software to the latest release or add memory to your system while keeping your site operational. Virtually all high-availability cluster offerings accommodate planned maintenance.
If you believe that a particular operating system is crash proof, give me a call and I'll sell you the Brooklyn Bridge to go along with that OS. Let's face it, system crashes are facts of life; it is merely a matter of minimizing their frequency. In response to a system crash, the other cluster members will conclude that a server has become nonresponsive and commence a take over of the services formerly provided by the failed node.
In the event of a system crash, virtually all fail-over cluster implementations will correctly takeover the services of a failed node. So far so good—it looks like just about any fail-over cluster product will suit you. Not so fast; the following points separate the credible offerings from the not so credible.
Typical high-availability cluster implementations consist of a set of cluster members, each monitoring the other's health over a variety of “cluster interconnects”. Historically, many proprietary cluster vendors have depended on custom hardware for their cluster interconnects. While this provides a solid cluster implementation, by nature it tends to be very expensive and locks you into a single vendor. To provide a cost-effective alternative, other cluster implementations monitor system health over commonly available network interconnects (commonly Ethernet) and serial port connections. In these configurations, the cluster members periodically exchange messages, and based on the response (or lack thereof) conclude whether the other members are up or down. This exchange of system health-monitoring messages is commonly referred to as a “heartbeat”.
A common problem with “heartbeat” based clusters is communication partitions. This is when cluster members (or a set of members) are up but are unable to communicate with one another. Take, for example, the diagram in Figure 2 depicting a two-node cluster with an Ethernet and Serial connection between the nodes over which heartbeat messages are exchanged.
Let us suppose you had set up your high-availability cluster and gone off to Las Vegas for the weekend, lulled into complacency with your company's new on-line ordering system deployed in this configuration. Further imagine the cleaning person accidentally knocking out the Ethernet connection with a broom. Now your two cluster members' cluster software running on each node must decide how to respond to this scenario in the interest of preserving high availability. Since the members can't communicate, they have to make the call in isolation. Here's some policy options commonly used by some cluster products:
Pessimistic assumption—Node A knows that it's serving the database but is unaware of node B's state, so node A continues to serve the database. Node B can't communicate with node A and assumes that node A is down. Node B then commences serving the database resulting in two cluster members serving the same database further resulting in database corruption and possibly a system crash. (As weak as this sounds, this policy is employed in some offerings!)
Optimistic assumption—After a site wide power outage, node A and node B both boot up at the same time. Neither node can ascertain the state of the other node and, just to be safe, they each assume that the other node is up so they do not start serving the database (to avoid data corruption). This results in a scenario where neither cluster member is serving the database. So much for spending money for a redundant cluster server! Actually, you're better off having your database unavailable than to have it corrupted. There are other failure scenarios that manifest themselves as a communication failure. For example:
An Ethernet adapter fails
The systems are connected to a common hub or switch that fails
The Ethernet cable fails
To avoid these forms of communication partition, a common clustering practice is to employ multiple communication interconnects. For example, you may have the systems monitor each other's health by heartbeating over multiple Ethernets or a combination of both Ethernet and serial connections. Similarly, you may have each of the network connections go through separate hubs/switches or be point-to-point links.
Most cluster implementations allow you to configure multiple communication interconnects to eliminate the possibility of a communication partition. (If they do not, you should probably quickly move on to another vendor.)
- My Childhood in a Cigar Box
- Papa's Got a Brand New NAS
- Applied Expert Systems, Inc.'s CleverView for TCP/IP on Linux
- Panther MPC, Inc.'s Panther Alpha
- Rogue Wave Software's TotalView for HPC and CodeDynamics
- Simplenote, Simply Awesome!
- Debugging Democracy
- Returning Values from Bash Functions
- NethServer: Linux without All That Linux Stuff
- Tech Tip: Really Simple HTTP Server with Python