Building a Two-Node Linux Cluster with Heartbeat
According to the accompanying documentation, you need to install a second NIC on both nodes and connect them with a crossover cable. The documentation also calls for a null modem cable connecting the serial (COM) ports of the two nodes. I followed the instructions and installed everything. However, as I ran more tests on the cluster, I found that the null modem cable, crossover cable and second NIC are optional; they are nice to have but definitely not mandatory.
Configuring Heartbeat is the most important part of the whole installation, and it must be done correctly to get your cluster working. Moreover, the configuration should be identical on both nodes. There are three configuration files, all stored under /etc/ha.d: ha.cf, haresources and authkeys.
My /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
#
# File to write other messages to
#
logfile /var/log/ha-log
#
# Facility to use for syslog()/logger
#
logfacility local0
#
# keepalive: how many seconds between heartbeats
#
keepalive 2
#
# deadtime: seconds-to-declare-host-dead
#
deadtime 10
udpport 694
#
# What interfaces to heartbeat over?
#
udp eth0
#
node atm1
node cluster1
#
# ------> end of ha.cf
Whatever is not shown above, you can simply leave as it was (all commented out by the #). The last three options are most important:
udp eth0
#
node atm1
node cluster1
Unless you have a crossover cable and a second NIC, use eth0 (your only NIC) for udp. The two node names at the end of the file must match what uname -n returns on each node.
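The node-name requirement is easy to get wrong, so here is a minimal sketch of a check you could run on each node. The helper names ha_nodes and ha_check_node are my own and are not part of heartbeat; the sketch assumes node entries look like the ones in the ha.cf above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with heartbeat).

# Print each node name declared in the given ha.cf, one per line.
ha_nodes() {
    # Only lines whose first field is exactly "node" count;
    # commented-out lines start with "#" and are skipped.
    awk '$1 == "node" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break    # ignore a trailing comment
            print $i
        }
    }' "$1"
}

# Succeed if the given hostname (default: output of uname -n)
# is declared as a node in the given ha.cf.
ha_check_node() {
    conf=$1
    name=${2:-$(uname -n)}
    ha_nodes "$conf" | grep -qxF -- "$name"
}
```

For example, `ha_check_node /etc/ha.d/ha.cf || echo "uname -n is not listed in ha.cf"` would flag a mismatch before you start the dæmon.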
My /etc/ha.d/haresources
atm1 IPaddr::192.168.1.4 httpd smb dhcpd
This is the only line you need; in the above example, I included httpd, smb and dhcpd. You may add as many dæmons as you want, provided each name is spelled exactly like the corresponding script under /etc/rc.d/init.d.
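Since a misspelled dæmon name is the likely failure here, the following sketch lists any resource in haresources that has no executable script in the init directory. The function name ha_missing_scripts is hypothetical, and the sketch only looks in the directory named above; entries with "::" arguments such as IPaddr::192.168.1.4 are skipped on the assumption that they are not init scripts.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with heartbeat).

# Print every plain resource name in haresources that has no
# executable script in the init directory.
ha_missing_scripts() {
    resfile=$1
    initdir=${2:-/etc/rc.d/init.d}
    # Skip blank lines and comments; field 1 is the node name,
    # the remaining fields are resources.
    awk 'NF > 0 && $1 !~ /^#/ {
        for (i = 2; i <= NF; i++) print $i
    }' "$resfile" |
    while read -r res; do
        case $res in
            *::*) continue ;;   # assumed not to be an init script
        esac
        [ -x "$initdir/$res" ] || echo "$res"
    done
}
```

Running `ha_missing_scripts /etc/ha.d/haresources` should print nothing when every name matches a script.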
My /etc/ha.d/authkeys
You don't need to add anything to this file, but you have to issue the command
chmod 600 /etc/ha.d/authkeys
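A reader comment on this article reports that heartbeat will not start unless authkeys is the same on both nodes and contains an auth line plus a key line. If you run into that, a minimal sketch along the lines of that comment might look like this. The function name write_authkeys is hypothetical and "mysecret" is only a placeholder key; pick your own.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with heartbeat).

# Write a two-line authkeys file and restrict it to the owner.
write_authkeys() {
    file=${1:-/etc/ha.d/authkeys}
    secret=${2:-mysecret}       # placeholder; use your own key
    # Create the file under a restrictive umask so it is never
    # world-readable, even briefly.
    ( umask 077
      printf 'auth 1\n1 md5 %s\n' "$secret" > "$file" )
    # Same permissions the article asks for.
    chmod 600 "$file"
}
```

Run it with the same key on both nodes so the two files end up identical.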
You may start the dæmon with
service heartbeat start
or
/etc/rc.d/init.d/heartbeat start
Once heartbeat is started on both nodes, ifconfig on the primary server (node1) returns something like:
eth0 Link encap:Ethernet HWaddr 00:60:97:9C:52:28
inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:18617 errors:0 dropped:0 overruns:0 frame:0
TX packets:14682 errors:0 dropped:0 overruns:0 carrier:0
collisions:3 txqueuelen:100
Interrupt:10 Base address:0x6800
eth0:0 Link encap:Ethernet HWaddr 00:60:97:9C:52:28
inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:10 Base address:0x6800
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:3924 Metric:1
RX packets:38 errors:0 dropped:0 overruns:0 frame:0
TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
When you see the line eth0:0, heartbeat is working, and you can try to access the server by using http://192.168.1.4 and check the log files /var/log/ha-log. Also, check the log file on node2 (192.168.1.3) and try
ps -A | grep dhcpd
and you should find no running dhcpd on node2.
Now, the real HA test. Reboot, and then shut down the primary server (node1: 192.168.1.2). Don't just power down the server; make sure you issue reboot or press Ctrl-Alt-Del and wait until everything has shut down properly before you turn off your PC.
Within ten seconds, go to node2 and run ifconfig. If you see the IP alias eth0:0, you are in business and have a working two-node HA cluster.
eth0 Link encap:Ethernet HWaddr 00:60:08:26:B2:A4
inet addr:192.168.1.3 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15673 errors:0 dropped:0 overruns:0 frame:0
TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0
collisions:2 txqueuelen:100
Interrupt:10 Base address:0x6700
eth0:0 Link encap:Ethernet HWaddr 00:60:08:26:B2:A4
inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15673 errors:0 dropped:0 overruns:0 frame:0
TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0
collisions:2 txqueuelen:100
Interrupt:10 Base address:0x6700
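If you would rather script the eth0:0 check than read the listing by eye, here is a rough sketch. The function name has_alias is my own, and it assumes ifconfig prints the "Link encap:" and "inet addr:" format shown above; other ifconfig versions print a different layout and would need a different pattern.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with heartbeat).

# Succeed if ifconfig-style text on stdin shows the given alias
# interface carrying the given IP address.
has_alias() {
    alias_if=$1
    addr=$2
    awk -v ifc="$alias_if" -v ip="$addr" '
        # A "Link encap:" line opens a new interface stanza.
        $2 == "Link" && $3 ~ /^encap:/ { current = $1 }
        # The trailing space keeps 192.168.1.4 from matching 192.168.1.40.
        current == ifc && index($0, "inet addr:" ip " ") { found = 1 }
        END { exit !found }
    '
}
```

On node2 after the failover, `ifconfig | has_alias eth0:0 192.168.1.4 && echo "node2 has taken over"` would confirm the takeover.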
You can try
ps -A | grep dhcpd
or you can try to release and renew the IP info on your Win9x workstation, and you should see the new address for the dhcpd server.
Comments
Re: defrag?
I have never even heard of running defrag for a windows disk problem and no one I know has ever run defrag for that purpose. Defrag is for consolidating data not fixing errors. Why would you even suggest doing this???? Scandisk is the standard tool that everyone uses and runs automatically most times you have a lockup or improper shutdown.
Please for anyone reading the above message do NOT run defrag if you are getting disk errors. Run scandisk. If you continue to have problems your disk may be going bad.
Re: defrag?
I try not to use Windows, but the last time I ran defrag (win98), scandisk was first invoked automatically.
I also thought this article would be about clusters (a la Beowulf), at least based on the title. Anyway, it was easy to read and I learned something new!
(linux.com wouldn't let me log in)
apache cluster on san
Hi,
OK, heartbeat works great - really appreciated! I am amazed at how perfectly it worked.
How do we mount the SAN partition for that IP service in an active-active Apache cluster?
Do I make an entry in /etc/fstab, or is that a sin?
I mean, I need to mount /dev/sdc1 on /var/www/html, so do I make an entry in the fstab file?
gops
No Separate Storage?
Hi,
So it means that if I have two Linux computers, I can have a cluster installed. So I do not need any separate storage device, right?
How does shared storage work in this case?
Thanks,
Data Sheet
Mistaken config
Why do you assign "node1" and "node2" in the hosts file if you then use "atm1" and "cluster1"?
If you do it this way, "auth errors" will happen and all clusters will be in the master role, which is a conflict, so assign as below:
in ha.cf on both clusters
node node1
node node2
in haresources on both clusters (if you want node1 to be master)
node1 virtualaddress httpd
And you must make the authkeys files identical on all clusters; otherwise it will not start.
auth 1
1 md5 "mysecret"
And for a professional HA cluster we need a null modem cable; from the NIC perspective, SPOFs will load extra bandwidth. For storage networks that is not permissible.
Thanks,