Building a Two-Node Linux Cluster with Heartbeat
Commercial products from Red Hat, TurboLinux and PolyServe use the same concept of IP aliasing. When the primary server goes down, the backup server picks up the same aliased IP address, so high availability is achieved.
The cluster product from PolyServe is very sophisticated. It supports SAN (server area network) and can handle more than two nodes. It is easy to install and configure; I successfully configured the trial version through a Windows monitoring client without reading any documentation. However, sophistication comes with a price tag: the software costs more than a thousand dollars for a two-node cluster. In the 30-day trial version, the cluster stops after two hours, which is not much fun for testing.
The cluster product from TurboLinux needs some fine-tuning. The installation documentation is confusing (or maybe they simply don't want people to do it themselves). The web configuration tool is unstable; the CGI script crashes whenever the user clicks the reload or refresh button. And of course, as a commercial product, it comes with a high price tag.
Linux is very stable and reliable, and it is quite common for our servers to stay up and running for a few hundred days at a time. Heartbeat worked fine in my tests, and if you are looking for a higher-availability product for a small business or educational institution, Heartbeat is definitely a good option.
email: leung@uwinnipeg.ca
Comments
Linux Cluster Manager:
Linux Cluster Manager is a graphical tool for managing multiple Linux systems from a central location. I have a problem installing this tool. I think it is an interesting monitoring tool, but the installation document is not well organised. Has anyone installed this tool who can share clear step-by-step installation details?
This is the link to the tool:
http://linuxcm.sourceforge.net/
Heartbeat is not failing over when I stop the application.
Greetings:
I have configured MQ HA with heartbeat, and we are running on a Red Hat server. When I stop heartbeat, failover works fine, but when I stop MQ or httpd, which are in the resource group, the node does not fail over. I mean there is no response from heartbeat; the application simply stops.
I am starting heartbeat with /etc/rc.d/init.d/heartbeat start
How do I monitor the application's health, so that if there is any problem with the application it fails over to the next node?
httpd is the default application with Red Hat, and we can check its status with /etc/rc.d/init.d/httpd status.
When I stop it, it shows as stopped or not running.
Do I need to do any OS configuration so that heartbeat always checks the applications in the resources? I am new to Linux admin.
Thanks,
Chandra.
our problem is the same;
Our problem is the same; maybe the server itself works fine, but what if the Apache or MySQL server does not?
Will heartbeat sense it and move the service over to the other node?
If yes, please tell me how, or send me a link that explains how to configure it.
thanks.
Hi, Is it possible to have a
Hi,
Is it possible for 2 nodes running different versions of Heartbeat (1.x/2.x) and different RHEL versions (RHEL3/RHEL4) to work well in tandem? Do failover and the other heartbeat functions work fine in such a Linux cluster?
Thanks in advance!!
Works
Works for us and for many many others :-)
so far it has been working
so far it has been working for us with a few glitches.
Hello, I'm developing a
Hello,
I'm developing a system for high availability and load balancing under Linux, with heartbeat, ldirectord, glusterfs, mon, MySQL Cluster, ... You can see the results in my blog:
http://redes-privadas-virtuales.blogspot.com/2008/12/alta-disponibilidad...
hi, i'm sure that you have
hi,
I'm sure that you have explained it very well, but there is a big problem (for me): your blog is in Spanish :) so I cannot understand it.
Well, you may say "that is your problem", and yes, that is true :)
Can you please write a good "how to" in English?
thanks
ha.cf
This is the main ha.cf file.
ha.cf:
-------
logfile /var/log/ha-log
keepalive 2
deadtime 10
warntime 5
initdead 30
bcast eth1
auto_failback off
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
node A
node B
ping 192.168.1.1 ( this is gateway IP)
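For readers comparing the timing directives in a file like the one above, here is a minimal Python sketch (not part of Heartbeat) that pulls keepalive, warntime, deadtime and initdead out of ha.cf text and flags orderings that are usually considered mistakes. The function names are hypothetical, and the rules it applies (keepalive shorter than warntime, warntime shorter than deadtime, initdead at least twice deadtime) are assumptions based on common guidance, not something stated by the poster.

```python
TIMING_KEYS = ("keepalive", "warntime", "deadtime", "initdead")


def parse_timings(ha_cf_text):
    """Return the timing directives found in ha.cf text as {name: seconds}."""
    timings = {}
    for line in ha_cf_text.splitlines():
        fields = line.split()
        # Only whole-second values are handled; anything else is skipped.
        if len(fields) >= 2 and fields[0] in TIMING_KEYS and fields[1].isdigit():
            timings[fields[0]] = int(fields[1])
    return timings


def check_timings(timings):
    """Return a list of warnings; an empty list means nothing looked off."""
    keepalive, warntime, deadtime, initdead = (
        timings.get(key) for key in TIMING_KEYS
    )
    warnings = []
    if keepalive is not None and warntime is not None and keepalive >= warntime:
        warnings.append("keepalive should be shorter than warntime")
    if warntime is not None and deadtime is not None and warntime >= deadtime:
        warnings.append("warntime should be shorter than deadtime")
    # Assumed rule of thumb: give a booting peer at least 2x deadtime.
    if deadtime is not None and initdead is not None and initdead < 2 * deadtime:
        warnings.append("initdead should be at least twice deadtime")
    return warnings


# The values posted in this comment.
POSTED = """
keepalive 2
deadtime 10
warntime 5
initdead 30
"""

print(check_timings(parse_timings(POSTED)))
```

With the values posted above, the list of warnings comes back empty.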
heartbeat VIP failover is not happening.
Hi,
Thanks for providing such a good article.
I am trying to set up heartbeat with the following configuration.
However, I am having an issue with VIP failover once node1 is down.
Heartbeat is not allocating the shared VIP on node2 when node1 is down.
A:Master
eth1 = 192.168.100.2
B:Slave
eth1 = 192.168.100.4
on A eth1:2 = 192.168.100.24 (virtual interface for heartbeat)
ha.cf:
-------
logfile /var/log/ha-log
keepalive 2
deadtime 10
warntime 5
initdead 30
ucast eth1
auto_failback on
respawn hacluster /usr/lib/heartbeat/ipfail
node A
node B
ping 192.168.1.1 ( this is a gateway IP)
haresources:
============
B 192.168.223.24 mysql
authkeys:
==========
auth 2
2 sha1 Test_HB!
Node: A (3f36b1d6-90c0-4f61-9d76-f5bedee43c12): online
Node: B (7caa9321-001d-473c-b505-7081e4ec4d7f): online
Can someone please help in resolving the issue?
I am unable to find the reason for the failure.
Thank you.
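One thing that stands out in the configuration above is that haresources names 192.168.223.24, while the eth1 addresses and the eth1:2 alias are all in 192.168.100.x. The following Python sketch shows one way to check whether a virtual IP falls inside the network of any interface on a node. It is only an illustration: the function name is hypothetical, and the /24 prefix length is an assumption because the post does not give a netmask.

```python
import ipaddress


def vip_matches_interface(vip, interface_addrs, prefix_len=24):
    """Return the names of interfaces whose network contains the virtual IP.

    prefix_len is an assumption; the post does not state a netmask.
    """
    vip_addr = ipaddress.ip_address(vip)
    matches = []
    for name, addr in interface_addrs.items():
        # strict=False lets us build the network from a host address.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        if vip_addr in network:
            matches.append(name)
    return matches


# Node B as described in the post.
node_b = {"eth1": "192.168.100.4"}

print(vip_matches_interface("192.168.223.24", node_b))  # address from haresources
print(vip_matches_interface("192.168.100.24", node_b))  # address from eth1:2
```

Under the /24 assumption, the first call finds no matching interface and the second finds eth1, which suggests checking the address in haresources first.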
eth0:0
How did you set up eth0:0? Is that a bridged eth0 interface? In any case, how did you accomplish that?
Thanks in advance.
Heartbeat with Tcp based Application
Hi, my name is Cantek. I am from Turkey. I am trying to build a server application without any application server software. My server will only use TCP sockets to receive and send bytes, nothing more. I need a clustering and failover mechanism for the server side. I am using two Ubuntu servers, both running the server application, and during my research I came across heartbeat many times.
What I need is:
When server 1 fails, server 2 must keep doing the job exactly at the point where server 1 failed.
Session and state information must be kept.
If you can help me with this, I will be very happy.
Good Days.
When my server one fails
When my server one fails, server 2 must keep doing the job exactly at the point where server 1 failed.
Building 2node clus !!!
Hi,
Thanks a lot for the notes !!!
I had a query though...
What's the configuration for the second set of NICs on both nodes?
Secondly, what kind of installation should I go for?
The cluster package in Red Hat or a normal Red Hat server installation?
I have RHEL 5; will that work fine?
Regards,
Johny:)
Article has Incorrect IP Address for Node 2
The ifconfig listings for node 2 I believe are incorrect. The inet address for eth0 should be 192.168.1.3 *not* 192.168.1.2.
Sharing the same address as node 1 would create a conflict.
Re: Article has Incorrect IP Address for Node 2
I think so too. Can anyone confirm?
HA or Cluster not DHCP
heartbeat is excellent. If you do this for Apache, use ldirectord for load balancing and redundancy then use heartbeat to have ldirectord on two servers for fail-over redundancy.
ISC DHCP already supports multiple nodes in fail-over mode and is somewhat load balanced.
Defrag does not recover lost clusters.
SAN stands for Storage Area Network not Server Area Network.
The discussion about the MAC/ARP on a switch is incorrect and there is no need to run this set up on a hub.
Installation Head Acks!
Hi All,
I am a student and I am working on a project for my class. I have no prior experience with Linux (which is one of the requirements for the project). I have two computers with Red Hat 9, and I have tried to start the installation process of Heartbeat on one of them.
I have installed several RPMs but have run into a snag.
When trying to install heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1.i386.rpm I receive this error:
error: Failed dependencies:
perl-Mail-IMAPClient is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1
Perl-Net-DNS is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1
perl-ldap is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1
I understand that this means prerequisite RPMs are missing (or at least I think I do), but I have the "Mail" and "ldap" RPMs and get similar results when I try to install them. I don't know where to find "Net-DNS". Any help would be greatly appreciated. Thank you.
Installing Perl Packages
RPMs are the devil's work.
Load up cpan:
> cpan
install Net::DNS
...
Lather, rinse, repeat as needed.
Re: Building a Two-Node Linux Cluster with Heartbeat
no comment !
Re: Building a Two-Node Linux Cluster with Heartbeat
When the secondary node takes over in this type of failover, shouldn't the virtual IP on the new node have the same physical (MAC) address as the old one? If not, won't this confuse the hell out of the switch that it's attached to?
Re: Building a Two-Node Linux Cluster with Heartbeat
That's why an ARP change broadcast is issued by heartbeat :)
Re: Building a Two-Node Linux Cluster with Heartbeat
I thought that it would confuse the switch, but it sends an arp command to release the ip address before taking over.
Re: Building a Two-Node Linux Cluster with Heartbeat
You should only use hubs on this kind of setup and NOT a switch.
Heartbeat just sends out gratuitous ARPs to perform the IP takeover, so all the other machines just renew their routing tables.
Obviously using a switch would cause the drastic problems that you are mentioning here.
No, you can also use a switch.
No,
you can also use a switch. The main concern is that everything should be physically connected.
regards,
Vipul Ramani
Re: Building a Two-Node Linux Cluster with Heartbeat
The HUB should be used, of course!
Re: Building a Two-Node Linux Cluster with Heartbeat
At any time, there will be just one virtual IP. If the
master node is up, virtual IP is tied to the
master node; however, when the master node is
down, the secondary node will take over the virtual
IP. The process is just like assigning an IP
address to different servers at DIFFERENT times.
As far as the switch is concerned, it will only
keep the MAC address for a certain period of
time, after checking that virtual IP has been
reassigned to another server with a different
MAC address, it will simply update its own
MAC lookup table. In this case, I don't think it
will confuse anyone or anything such as the
switch.
Re: Building a Two-Node Linux Cluster with Heartbeat
Depends on your network equipment. Heartbeat (or fake) can be set up to send a gratuitous ARP for the address that has just moved. Most equipment will simply update its ARP cache and work with the new NIC. Of course you have to check whether your switch and router are happy with that.
This is the most common system for failover clusters, since in this scenario network outages usually happen anyway.
BTW: heartbeat and fake work best if you have static applications; for replication you may be better off with shared storage clusters like kimberlite. For some applications, especially web application servers, it is much better to run them in a load balancing configuration, because the failover time is shorter and the hardware is better used.
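To make the gratuitous ARP discussed in this thread more concrete, here is a rough Python sketch that only builds the bytes of such a frame; it does not send anything and is not how Heartbeat itself is implemented. The function name and the example MAC address are hypothetical. The request opcode and broadcast target MAC are one common form, and implementations differ on both.

```python
import socket
import struct


def build_gratuitous_arp(mac, ip):
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", broadcast, mac_bytes, 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,          # hardware type: Ethernet
        0x0800,     # protocol type: IPv4
        6,          # hardware address length
        4,          # protocol address length
        1,          # opcode: request
        mac_bytes,  # sender MAC: the node that now owns the address
        ip_bytes,   # sender IP: the virtual IP
        broadcast,  # target MAC (all zeros is also seen in practice)
        ip_bytes,   # target IP: same as sender IP, hence "gratuitous"
    )
    return ethernet_header + arp_payload


# Hypothetical MAC; the IP is the virtual address mentioned earlier in the thread.
frame = build_gratuitous_arp("00:11:22:33:44:55", "192.168.100.24")
print(len(frame))
```

Neighbours that receive such a frame can update their ARP cache entry for the virtual IP to the new MAC address, and a switch learns the new port from the frame's source MAC. Actually transmitting it would need a raw socket and root privileges, which this sketch leaves out.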
Re: Building a Two-Node Linux Cluster with Heartbeat
If you want to see a small sampling of the kinds of mission-critical applications people are running on heartbeat, you might want to look here:
http://linux-ha.org/heartbeat/users.html
Heartbeat has been in production for several years, and is in use in hundreds of mission-critical sites across the world.
Re: Building a Two-Node Linux Cluster with Heartbeat
My thanks to CT on the excellent article and to the replies adding useful information. Question --- Using Samba, Pentium 3's, 1GB RAM, 1GHz Processor, how many users can this safely handle? I/we initially plan to use it to serve approx six MS apps that require frequent upgrades/patches to approx 300 users. Thanks again, art557@pacbell.net
Re: Building a Two-Node Linux Cluster with Heartbeat
I've been using Heartbeat for over a year to provide high-availability to a mission critical application.
The application employs a replicated (2-way) database, a servlet container, and several batch processes.
I've augmented Heartbeat with a program that monitors all critical resources needed by the application. It initiates a Heartbeat failover should any of these resources become unusable.
This combination has worked flawlessly.
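The commenter's monitoring program is not shown, so the following is only a guess at its general shape, written as a minimal Python sketch. It polls each resource with a status command and, after several consecutive failures, runs a failover command. The function names, the polling interval and the failure threshold are all hypothetical. The commands in the usage note come from the earlier question in this thread, where stopping heartbeat was reported to trigger failover, and the sketch assumes the init script's status action exits with 0 when the service is running.

```python
import subprocess
import time


def resource_is_healthy(status_cmd, run=subprocess.call):
    """Treat exit status 0 from the status command as healthy (assumption)."""
    return run(status_cmd) == 0


def monitor(status_cmds, failover_cmd, interval=10, max_failures=3,
            run=subprocess.call, sleep=time.sleep):
    """Poll every resource; run failover_cmd after repeated failed rounds."""
    failures = 0
    while True:
        if all(resource_is_healthy(cmd, run) for cmd in status_cmds):
            failures = 0  # a healthy round resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                run(failover_cmd)
                return
        sleep(interval)


# Example invocation (commented out because it would stop heartbeat on a
# real machine):
# monitor(
#     status_cmds=[["/etc/rc.d/init.d/httpd", "status"]],
#     failover_cmd=["/etc/rc.d/init.d/heartbeat", "stop"],
# )
```

Requiring several consecutive failures before acting is a design choice to avoid failing over on a single slow status check; whether three rounds at ten seconds is appropriate depends on the application.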
Re: Building a Two-Node Linux Cluster with Heartbeat
Interesting, but without replication or completely static data, not much use other than a toy with which to play. Reasons for alternate NICs with a crossover cable OR a com cable: if the network device both nodes are connected to has a problem and comes back after those 10 seconds (say some bozo kicks the power plug out of the switch), you've now got 2 nodes responding to the same IP address, with unpredictable results to say the least. The crossover or com connection provides a non-DoS-able failsafe.
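The argument for a second heartbeat path can be sketched in a few lines of Python: the peer is treated as dead only when every link has been silent for longer than the dead time, so losing the shared switch alone does not trigger a takeover. This is an illustration of the idea rather than Heartbeat's actual logic. The function name and link labels are hypothetical, and the 10-second default mirrors the deadtime value quoted in this thread.

```python
def peer_is_dead(last_seen_by_link, now, deadtime=10):
    """Return True only if every heartbeat link has been silent too long.

    last_seen_by_link maps a link label to the time (in seconds) at which
    the last heartbeat arrived on that link.
    """
    if not last_seen_by_link:
        raise ValueError("at least one heartbeat link is required")
    return all(now - seen > deadtime for seen in last_seen_by_link.values())


# Switch path silent for 20 s, crossover cable heard 2 s ago: peer is alive.
print(peer_is_dead({"eth0": 100.0, "eth1": 118.0}, now=120.0))

# With only the switch path configured, the same outage looks like a dead peer.
print(peer_is_dead({"eth0": 100.0}, now=120.0))
```

The first call reports the peer as alive because the crossover link is still delivering heartbeats; the second, with a single link, reports it as dead, which is the situation that leads to two nodes claiming the same address.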
Re: Building a Two-Node Linux Cluster with Heartbeat
"a toy with which to play"? Never installed it, have you?
Oh, and split clusters are the reason why we use separate switches for each machine AND pass a heartbeat signal over the same network that the clients see the service over...
So, no SPOF.
Works for us and for many many others :-)
Re: Building a Two-Node Linux Cluster with Heartbeat
Many people use heartbeat in combination with shared storage, a general-purpose replication mechanism like DRBD, or an application-specific replication mechanism like those that come with LDAP or DNS. With replication, even very cheap hardware can be made to avoid all SPOFs. Heartbeat integrates well with DRBD to provide general partition-layer replication of data.
Certainly you want more than one heartbeat connection for lots of different reasons.
There are hundreds of production users of heartbeat all over the world. For a few examples see:
http://linux-ha.org/heartbeat/users.html
Re: Building a Two-Node Linux Cluster with Heartbeat
It is true that replication of dynamic data is an important issue. However, this example serves the purpose of creating the lowest layer of an HA system, namely knowing when to switch to node2. The next layer would be straight software-driven replication built on top of this example failover design.
An alternative to replication could be storage shared between the two nodes. I wonder if someone out there would have an inexpensive shared storage design?
Re: Building a Two-Node Linux Cluster with Heartbeat
The CVS version of heartbeat has just added support for the IBM ServeRAID RAID controllers, so that two machines can share a SCSI string and fail over (correctly) between the two controllers. These RAID controllers guarantee that only one side at a time will access any given logical volume. I think the retail price on these devices starts at around $600 USD. It's a lot cheaper than an FC box.
how come the ServeRAID can
How can the ServeRAID controller guarantee that only one node accesses the shared storage, when the other node, on first boot, can't identify the logical volume of the enclosure?
If you have any experience, would you share how to set up this hardware? I would like to set up Oracle 10g RAC using IBM x346, ServeRAID 6M and EXP400 on Linux + heartbeat.
Warm Regards,
Re: Building a Two-Node Linux Cluster with Heartbeat
This tool has worked almost flawlessly at our site for the last year. It turned our Samba and FTP servers into highly available services for the cost of a couple of obsolete PCs.
Now we're just left with Heartbeat Clusters backended by EMC Celerras.
Re: Building a Two-Node Linux Cluster with Heartbeat
Be careful running any process like DHCP in this scenario! If the DHCP database from the master server is not being replicated to the HA backup, then whenever the HA backup takes over it will begin assigning IP addresses to machines regardless of what the master had already assigned (i.e., a station that gets a DHCP address after a restart will probably get an IP address that duplicates one on a system already running).
Re: Building a Two-Node Linux Cluster with Heartbeat
You would probably not run a high-availability DHCP server. The DHCP protocol has redundancy built in; clients are expected to handle DHCPOFFERs from multiple servers on the same subnet. If you're going to throw together the hardware for a failover node, you might as well make both active with distinct scopes.
DHCP RFC: http://www.ietf.org/rfc/rfc2131.txt?number=2131
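As a small illustration of "distinct scopes", this Python sketch splits one address pool into two non-overlapping halves, one per DHCP server. The function name and the example pool are hypothetical, and a real deployment would still need each half entered into the respective server's own configuration.

```python
import ipaddress


def split_pool(first, last):
    """Split an inclusive address range into two non-overlapping halves."""
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    if end <= start:
        raise ValueError("range must contain at least two addresses")
    middle = start + (end - start) // 2

    def to_text(value):
        return str(ipaddress.ip_address(value))

    return (to_text(start), to_text(middle)), (to_text(middle + 1), to_text(end))


# Hypothetical pool on the 192.168.1.x network used in the article's examples.
scope_a, scope_b = split_pool("192.168.1.100", "192.168.1.199")
print(scope_a)
print(scope_b)
```

Here the first server would hand out 192.168.1.100 through 192.168.1.149 and the second 192.168.1.150 through 192.168.1.199, so neither can issue an address the other has already leased.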
Re: Building a Two-Node Linux Cluster with Heartbeat
This is why getting a SCSI system going on each host, linked to a common array, is really needed. Data can be stored on the array so it can be shared.
To be really useful, "we" need something like this.
regards
Re: Building a Two-Node Linux Cluster with Heartbeat
If you make /var/lib/dhcp a mount point, and mount it on a DRBD volume, then you should be in pretty good shape. Then the secondary machine will have access to the DHCP leases when it takes over. Of course, you want to make /var/lib/dhcp a journalling filesystem.
Re: Building a Two-Node Linux Cluster with Heartbeat
This sounds more like a backup solution, right? When I first saw the title, I thought it was talking about Beowulf. I guess I was wrong.
Re: Building a Two-Node Linux Cluster with Heartbeat
It's about a high-availability cluster.
In this case it is in its minimal form, since the second node is no more than a hot standby (or hot backup if you prefer).
However, without too much complication you can add load balancing capabilities to this setup and gain a lot more.
Re: Building a Two-Node Linux Cluster with Heartbeat
Most Windows users are probably familiar with lost clusters--something that can be rectified by running the defrag utility.
Now... Running defrag to recover lost clusters isn't a good idea. I'd rather run scandisk, or in the good old days, chkdsk /f. But then again, I don't use Microsoft products any longer ;P