Building a Two-Node Linux Cluster with Heartbeat
According to the accompanying documentation, you need to install a second NIC on both nodes and connect them with a cross overcable. Besides the second NIC, a null modem cable connecting the serial (com) ports of each node is mandatory (according to the documentation). I followed the instructions in the documentation and installed everything. However, as I did more tests on the cluster, I found that the null modem cable, crossover cable and the second NIC are optional; they are nice to have but definitely not mandatory.
Configuring Heartbeat is the most important part of the whole installation and must be set up correctly to get your cluster working. Moreover, it should be identical on both nodes. There are three configuration files, all stored under /etc/ha.d: ha.cf, haresource and aythkeys.
debugfile /var/log/ha-debug # # File to write other messages to # logfile /var/log/ha-log # # Facility to use for syslog()/logger # logfacility local0 # # keepalive: how many seconds between heartbeats # keepalive 2 # # deadtime: seconds-to-declare-host-dead # deadtime 10 udpport 694 # # What interfaces to heartbeat over? # udp eth0 # node atm1 node cluster1 # # ------> end of ha.cf
Whatever is not shown above, you can simply leave as it was (all commented out by the #). The last three options are most important:
udp eth0 # node atm1 node cluster1
Unless you have a cross cable, you should use your eth0 (your only NIC) for udp; the two nodes at the end of the above files must be the same as returned by uname -n from each node.
atm1 IPaddr::192.168.1.4 httpd smb dhcpd
This is the only line you need; in the above example, I included httpd, smb and dhcpd. You may add as many dæmons as you want, provided they have the exact same spelling as those dæmons under /etc/rc.d/init.d
You don't need to add anything to this file, but you have to issue the command
chmod 600 /etc/ha.d/authkeys
You may start the dæmon with
service heartbeat start
Once heartbeat is started on both nodes, you will find that the ifconfig from the primary server will return something like:
node1 ifconfig for node1 eth0 Link encap:Ethernet HWaddr 00:60:97:9C:52:28 inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:18617 errors:0 dropped:0 overruns:0 frame:0 TX packets:14682 errors:0 dropped:0 overruns:0 carrier:0 collisions:3 txqueuelen:100 Interrupt:10 Base address:0x6800 eth0:0 Link encap:Ethernet HWaddr 00:60:97:9C:52:28 inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:10 Base address:0x6800 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:3924 Metric:1 RX packets:38 errors:0 dropped:0 overruns:0 frame:0 TX packets:38 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
When you see the line eth0:0, heartbeat is working, and you can try to access the server by using http://192.168.1.4 and check the log files /var/log/ha-log. Also, check the log file on node2 (192.168.1.3) and try
ps -A | grep dhcpd
and you should find no running dhcpd on node2.
Now, the real HA test. Reboot, and then shut down the primary server (node1: 192.168.1.2). Don't just power down the server; make sure you issue reboot or press CTL-ALT-DEL and wait until everything is shut down properly before you turn off your PC.
Within ten seconds, go to node2 and try ifconfig. If you can get the IP aliasing eth0:0, you are in business and have a working HA two-node cluster.
eth0 Link encap:Ethernet HWaddr 00:60:08:26:B2:A4 inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:15673 errors:0 dropped:0 overruns:0 frame:0 TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0 collisions:2 txqueuelen:100 Interrupt:10 Base address:0x6700 eth0:0 Link encap:Ethernet HWaddr 00:60:08:26:B2:A4 inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:15673 errors:0 dropped:0 overruns:0 frame:0 TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0 collisions:2 txqueuelen:100 Interrupt:10 Base address:0x6700
You can try
ps -A | grep dhcpd
or you can try to release and renew the IP info on your Win9x workstation, and you should see the new address for the dhcpd server.
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
|Introduction to MapReduce with Hadoop on Linux||Jun 05, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Linux Systems Administrator
- Validate an E-Mail Address with PHP, the Right Way
- Introduction to MapReduce with Hadoop on Linux
- RSS Feeds
- Weechat, Irssi's Little Brother
- New Products
- Tech Tip: Really Simple HTTP Server with Python
- Poul-Henning Kamp: welcome to
1 hour 55 min ago
- This has already been done
1 hour 56 min ago
- Reply to comment | Linux Journal
2 hours 42 min ago
- Welcome to 1998
3 hours 30 min ago
- notifier shortcomings
3 hours 54 min ago
5 hours 31 min ago
- Android User
5 hours 32 min ago
- Reply to comment | Linux Journal
7 hours 26 min ago
10 hours 15 min ago
- This is a good post. This
15 hours 28 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?