A High-Availability Cluster for Linux

Mr. Lewis tells us how he designed and implemented a simple high-availability solution for his company.

Resynchronization of Files on Node Recovery

A major design factor is resynchronization (mirroring back) of the files once a failed node has recovered. A reliable procedure must be employed so that data which has changed on the failover node during the failure period is mirrored back to the original node, rather than lost when the original node overwrites or deletes it during restoration. The resynchronization procedure should be implemented so that a node cannot perform any mirroring while another node has taken over its services. Also, before the services can be restarted on the original node, all files associated with them must be completely mirrored back to the original node. This must be done while the services are off-line on both nodes, to prevent the services from writing to the files being restored. Failure to prevent this could result in data corruption and loss.
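The ordering constraints above can be illustrated with a short sketch. It is not the author's implementation: the function names, the step labels and the idea of tracking takeovers in a set are assumptions made purely for illustration, using the serv1/serv2 node names from this article.

```python
# Illustrative sketch only: names and step labels are hypothetical,
# not taken from the author's scripts.

def may_mirror(node, taken_over):
    """A node may mirror its files out only if no peer has taken over its services."""
    return node not in taken_over


def resync_plan(recovered, failover, running):
    """Return the ordered steps for restoring `recovered` from `failover`.

    `running` maps a node name to True if cluster services are currently up there.
    """
    steps = []
    # 1. Services must be off-line on both nodes before any file is restored.
    for node in (recovered, failover):
        if running.get(node, False):
            steps.append(("stop_services", node))
    # 2. Mirror the files changed during the failure back to the original node.
    steps.append(("mirror", failover, recovered))
    # 3. Only after the mirror has completed may services be restarted.
    steps.append(("start_services", recovered))
    steps.append(("start_services", failover))
    return steps


if __name__ == "__main__":
    # serv2 has failed and serv1 is running its services.
    print(may_mirror("serv2", {"serv2"}))
    for step in resync_plan("serv2", "serv1", {"serv1": True, "serv2": False}):
        print(step)
```

Returning the plan as data rather than executing it keeps the ordering easy to inspect and test before any real stop, mirror or start commands are attached to it.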

Mirroring Warnings

The main problem when using this solution was with IMAP4 and POP3 mail spools. If an e-mail message is received and delivered on serv2, and serv2 fails before mirroring can take place, serv1 will take over the mail services. Subsequent mail messages will then arrive in serv1's mail spool. When serv2 recovers, any e-mail it received just before the failure will be overwritten by the new mail received on serv1. The best way to solve this is to configure Sendmail to queue a copy of its mail for delivery to the takeover node. If the takeover node is off-line, the mail remains in the Sendmail queue; once the failed node recovers, the queued messages are delivered. This method requires no mirroring of the mail spools and queues. However, two Sendmail configurations would have to be available on both nodes: one for normal operation and the other for node takeover operation. This prevents mail from bouncing between the two servers.

I am not a Sendmail expert. If you know how to configure dual-queuing Sendmail delivery, please let me know. This part is still a work in progress. As a temporary measure, I create backup files when the mail spool is resynchronized and check them manually on node recovery, which is quite time-consuming. I also reduce the risk by mirroring the mail spool as frequently as possible. This has the unfortunate temporary side effect of making my hard disks work overtime. Similar problems would be encountered when clustering a database service. However, a few large UNIX database vendors are now providing parallel versions of their products, which enable concurrent operation across several nodes in a cluster.
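The temporary measure of keeping a backup whenever a spool file is overwritten during resynchronization might look roughly like the sketch below. The function name, the backup directory and the timestamp suffix are all assumptions for illustration; the author's actual scripts are not shown in this part of the article.

```python
import shutil
import time
from pathlib import Path

# Illustrative sketch only: function name, backup layout and suffix format
# are hypothetical, not the author's.

def backup_then_replace(spool_file, incoming, backup_dir, stamp=None):
    """Save the existing spool file, then overwrite it with the mirrored copy.

    Returns the path of the backup, or None if there was nothing to back up.
    """
    spool_file, incoming, backup_dir = Path(spool_file), Path(incoming), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = None
    if spool_file.exists():
        # Keep the copy that is about to be overwritten for manual checking.
        stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
        backup = backup_dir / f"{spool_file.name}.{stamp}"
        shutil.copy2(spool_file, backup)
    # Replace the spool file with the copy mirrored from the other node.
    shutil.copy2(incoming, spool_file)
    return backup
```

The backup only preserves the overwritten messages; deciding which of them are missing from the new spool is still the manual step described above.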

The Node Recovery Procedure

A node could fail for various reasons ranging from an operating system crash, which would result in a hang or reboot, to a hardware failure, which could result in the node going into standby mode. If the system is in standby mode, it will not automatically recover. The administrator must manually remove a standby lock file and start run-level 5 on the failed node to confirm to the rest of the cluster that the problem has been resolved. If the OS hangs, this would have the same effect as a standby run level; however, if the reset button is pressed or the system reboots, the node will try to rejoin the cluster, as no standby lock file will exist. When a node attempts to rejoin the cluster, the other node will detect the recovery and stop all cluster services while the resynchronization of the disks takes place. Once this has completed, the cluster services will be restarted and the cluster will once again be in full operation.
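The role of the standby lock file in this procedure can be sketched as follows. The lock file path and the function names are hypothetical, since the article does not say where the file lives, and the change to run-level 5 that the administrator also performs is deliberately left out.

```python
from pathlib import Path

# Hypothetical location; the article does not say where the lock file lives.
STANDBY_LOCK = Path("/var/lock/cluster-standby.lock")


def boot_action(lock_file=STANDBY_LOCK):
    """Decide what a node does when it comes up.

    A node that went into standby keeps its lock file and must not rejoin
    until an administrator has removed it; a node that simply crashed and
    rebooted has no lock file and tries to rejoin the cluster.
    """
    return "stay_in_standby" if Path(lock_file).exists() else "rejoin_cluster"


def clear_standby(lock_file=STANDBY_LOCK):
    """Manual administrator step: remove the lock. Returns True if one was removed."""
    try:
        Path(lock_file).unlink()
        return True
    except FileNotFoundError:
        return False
```

Using a file on disk as the flag means the standby decision survives a reboot, which matches the behaviour described above: only a deliberate manual step returns the node to the cluster.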

Implementation Platform

My choice of Linux distribution is Red Hat 5.1 on the Intel platform. There is, however, no reason why this could not be adapted for another Linux distribution. The implementation is purely in user space; no special drivers are required. Some basic prerequisites are necessary in order to deploy this system effectively:

  • Two similarly equipped servers, especially in terms of data storage space, are needed.

  • Three network interface cards per server are recommended, although two might work at the expense of some modifications and extra LAN traffic.

  • Sufficient network bandwidth is needed between the cluster nodes.

My system consists of two Dell PowerEdge 2300 Servers, each complete with:

  • three 3C905B 100BaseTX Ethernet cards

  • two 9GB Ultra SCSI 2 hard disks

  • one Pentium II 350 MHz CPU

Figure 3. Photograph of the Two-node Cluster



High-availability clusters

Emil Koutanov

The problem with using Linux-based (or an OS-specific) clustering software is that you'll always be tied to the operating system.

The folks at Obsidian Dynamics have built a Java-based application-level clustering solution that isn't tied to the operating system.

I think this is the way forward, particularly seeing that many organisations are running a mixed bag of Windows and Linux servers - being able to cluster Windows and Linux machines together can be a real advantage. It also makes installation and configuration easier, since you're not supporting a dozen different operating systems and hardware configurations.

The other neat thing about Gridlock is that it doesn't use quorum and doesn't rely on NIC bonding/teaming to achieve multipath configurations - instead it combines redundant networks at the application level, which means it works on any network card and doesn't require specialised switchgear.

In connection with his article on A High-Availability Cluster

Steve Thompson

I am trying to get in touch with Mr. Phil (Philip) Lewis by e-mail, but I have the impression there is something wrong with the e-mail address. Can you confirm it? I have: lewispj@e-mail.com
Thanks in advance

Updated email

Anonymous

You can contact me at:

linuxjournal (at sign) linuxcentre.net