A High-Availability Cluster for Linux
The administrator can configure groups of files which are mirrored together by creating small mirror description files in the configuration directory. Note directory entries must end with a forward slash (/). Below is the description file for my lpd mirroring, /etc/cluster.d/serv1/Flpd:
The frequency of the mirroring is then controlled by an entry in the /etc/crontab file. Each entry executes the sync-app program, which examines the specified service mirror description file and mirrors the contents to the specified server IP address. In this example, the specified server address is the cross-over cable IP address on serv2. Mirroring of the lpd system is done every hour. These crontab entries are from serv1:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * root \ /usr/local/bin/sync-app /etc/cluster.d/serv1/Fsmbd \ serv2-hb 0 * * * * root /usr/local/bin/sync-app \ /etc/cluster.d/serv1/Flpd serv2-hb
The brains of the system lie in the cluster daemon, clusterd. This was written in the Bourne shell and will soon be rewritten in C. The algorithm outline is shown as a flow chart in Figure 2.
clusterd continuously monitors the ICMP (Internet control message protocol) reachability of the other node in the cluster, as well as a list of hosts which are normally reachable from each node. It does this using a simple ping mechanism with a timeout. If the other node becomes even partially unreachable, clusterd will decide which node actually has the failure by counting the number of hosts in the list which each node can reach. The node which can reach the fewest hosts is the one which gets put into standby mode. clusterd will then start the failover and takeover procedures on the working node. This node then continues to monitor whether the failed node recovers. When it does recover, clusterd controls the resynchronization procedure. clusterd is invoked on each node as:
clusterd <local-nodename> <remote-nodename>
It has to know which applications and services are running on each node so that it knows which ones to start and stop at failover and takeover time. This is defined in the same configuration directories as the service mirror description files discussed earlier. The configuration directories in each node are identical and mirrored across the whole cluster. This makes life easier for the cluster administrator as he can configure the cluster from a single designated node. Within the /etc/cluster.d/ directory, a nodename.conf file and a nodename directory exist for each node in the cluster. The reachlist file contains a list of reachable external hosts on the LAN. The contents of my /etc/cluster.d directory are shown here:
[root@serv1 /root]# ls -al /etc/cluster.d/ total 8 drwxr-xr-x 4 root root 1024 Nov 15 22:39 . drwxr-xr-x 23 root root 3072 Nov 22 14:27 .. drwxr-xr-x 2 root root 1024 Nov 4 20:30 serv1 -rw-r--r-- 1 root root 213 Nov 5 18:49 serv1.conf drwxr-xr-x 2 root root 1024 Nov 8 20:29 serv2 -rw-r--r-- 1 root root 222 Nov 22 22:39 serv2.conf -rw-r--r-- 1 root root 40 Nov 12 22:19 reachlistAs you can see, the two nodes are called serv1 and serv2. The configuration directory for serv1 has the following files: Fauth, Fclusterd, Fdhcpd, Flpd, Fnamed, Fradiusd, Fsmbd, K10radiusd, K30httpd, K40smb, K60lpd, K70dhcpd, K80named, S20named, S30dhcpd, S40lpd, S50smb, S60httpd and S90radiusd.
Files beginning with the letter F are service mirror description files. Those starting with S and K are linked to the SysVinit start/stop scripts and behave in a similar way to the files in the SysVinit run levels. The S services are started when node serv1 is in normal operation. The K services are killed when node serv1 goes out of service. The number following the S and K determine the order of starting and stopping the services. clusterd, running on node serv2, uses this same /etc/cluster.d/serv1/ directory to decide which services to start on serv2 when node serv1 has failed. It also uses the serv1 service mirror description files (those files starting with F) to determine which files and directories need to be mirrored back (resynchronized) to serv1 after it has recovered.
The configuration directory for node serv2 contains Fftpd, Fhttpd, Fimapd, Fsendmail, K60sendmail, K80httpd, K85named, K90inetd, S10inetd, S15named, S20httpd and S40sendmail. As you can see, the serv2 node normally runs Sendmail, named, httpd, IMAP4 and ftpd.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- The Qt Company's Qt Start-Up
- Devuan Beta Release
- May 2016 Issue of Linux Journal
- EnterpriseDB's EDB Postgres Advanced Server and EDB Postgres Enterprise Manager
- Open-Source Project Secretly Funded by CIA
- The US Government and Open-Source Software
- The Death of RoboVM
- The Humble Hacker?
- New Container Image Standard Promises More Portable Apps
- BitTorrent Inc.'s Sync
In modern computer systems, privacy and security are mandatory. However, connections from the outside over public networks automatically imply risks. One easily available solution to avoid eavesdroppers’ attempts is SSH. But, its wide adoption during the past 21 years has made it a target for attackers, so hardening your system properly is a must.
Additionally, in highly regulated markets, you must comply with specific operational requirements, proving that you conform to standards and even that you have included new mandatory authentication methods, such as two-factor authentication. In this ebook, I discuss SSH and how to configure and manage it to guarantee that your network is safe, your data is secure and that you comply with relevant regulations.Get the Guide