Building a Two-Node Linux Cluster with Heartbeat

March 11th, 2002 by C T Leung

C T shows you how to set up a two-node Linux cluster with Heartbeat.

The term "cluster" is not very well defined and can mean different things to different people. According to Webopedia, a cluster is a group of disk sectors. Most Windows users are probably familiar with lost clusters--something that can be rectified by running the ScanDisk utility.

However, at a more advanced level in the computer industry, cluster usually refers to a group of computers connected together so that more computing power, e.g., more MIPS (millions of instructions per second), or higher availability (HA) can be obtained.

Beowulf, Super Computer for the "Poor" Approach

Most super computers in the world are built on the concept of parallel processing--high-speed computing power is achieved by pooling the power of each individual computer. Made by IBM, "Deep Blue", the super computer that played chess with the world champion Garry Kasparov, was a cluster built from IBM RS/6000 SP nodes. In fact, many big-time Hollywood movie animation companies, such as Pixar and Industrial Light and Magic, use computer clusters extensively for rendering (a process that translates all the information such as color, movement, physical properties, etc., into a single frame of picture).

In the past, a super computer was an expensive deluxe item that only a few universities or research centers could afford. Started at NASA, Beowulf is a project for building clusters at very low cost from "off-the-shelf" hardware (e.g., Pentium PCs) running Linux.

In the last several years, many universities world-wide have set up Beowulf clusters for the purpose of scientific research or simply for exploration of the frontier of super computer building.

High Availability (HA) Cluster

Clusters in this category use various technologies to gain an extra level of reliability for a service. Companies such as Red Hat, TurboLinux and PolyServe have cluster products that would allow a group of computers to monitor each other; when a master server (e.g., a web server) goes down, a secondary server will take over the services, similar to "disk mirroring" among servers.

Simple Theory

Because I do not have access to more than one real (or public) IP address, I set up my two-node cluster in a private network environment with some Linux servers and some Win9x workstations.

If you have access to three or more real/public IP addresses, you can certainly set up the Linux cluster with real IP addresses.

In the above network diagram (fig1.gif), the Linux router is the gateway to the Internet, and it has two IP addresses. The real IP, 24.32.114.35, is attached to a network card (eth1) in the Linux router, which should be connected to either an ADSL modem or a cable modem for Internet access.

The two-node Linux cluster consists of node1 (192.168.1.2) and node2 (192.168.1.3). Depending on your setup, either node1 or node2 can be your primary server, and the other will be your backup server. In this example, I will choose node1 as my primary and node2 as my backup. Once the cluster is set up, with IP aliasing (see the IP aliasing Linux Mini HOWTO for more detail), the primary server will be running with an extra IP address (192.168.1.4). As long as the primary server is up and running, services (e.g., DHCP, DNS, HTTP, FTP, etc.) on node1 can be accessed at either 192.168.1.2 or 192.168.1.4. In fact, IP aliasing is the key concept for setting up this two-node Linux cluster.

When node1 (the primary server) goes down, node2 will take over all services from node1 by starting the same IP alias (192.168.1.4) and then all the services that depend on it. Some services can co-exist on node1 and node2 (e.g., FTP, HTTP, Samba, etc.); however, a service such as DHCP can have only a single running copy on the same physical segment. Likewise, we can never have two identical IP addresses running on two different nodes in the same network.

In fact, the underlying principle of a two-node, high-availability cluster is quite simple, and people with some basic shell programming skills could probably write a shell script to build the cluster. We can set up an infinite loop within which the backup server (node2) simply keeps pinging the primary server and, if the ping fails, starts the floating IP (192.168.1.4) as well as the necessary dæmons (programs running in the background).
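As a rough illustration of that do-it-yourself idea (not of how Heartbeat itself is implemented), the loop could be sketched in shell as follows. The addresses come from the example network; the alias name eth0:0, the failure threshold, the polling interval and the choice of httpd as the only service are assumptions made for this sketch, and the -w deadline option assumes the Linux (iputils) version of ping.

```shell
#!/bin/sh
# Sketch of a home-made failover loop for the backup server (node2).
PRIMARY=192.168.1.2        # node1, from the example network
FLOATING_IP=192.168.1.4    # the IP alias that follows the active node
MAX_FAILS=3                # assumption: missed pings in a row before takeover
INTERVAL=2                 # assumption: seconds between checks

# Succeed if host $1 answers one ping within two seconds.
primary_alive() {
    ping -c 1 -w 2 "$1" >/dev/null 2>&1
}

# Bring up the floating IP as an alias and start the services.
take_over() {
    ifconfig eth0:0 "$FLOATING_IP" netmask 255.255.255.0 up
    /etc/rc.d/init.d/httpd start    # assumption: httpd is the only service
}

# Run the probe command given as arguments until it has failed
# MAX_FAILS times in a row, then return.
watch_primary() {
    fails=0
    while [ "$fails" -lt "$MAX_FAILS" ]; do
        if "$@"; then
            fails=0
        else
            fails=$((fails + 1))
        fi
        sleep "$INTERVAL"
    done
    return 0
}

# Typical use on node2, as root:
#   watch_primary primary_alive "$PRIMARY" && take_over
```

Counting several missed pings in a row avoids a takeover caused by a single dropped packet. The sketch does nothing about handing the address back when node1 returns, which is one of the jobs a tool such as Heartbeat takes care of.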

A Two-Node Linux Cluster HOWTO with "Heartbeat"

You need two Pentium-class PCs with a minimum specification of a 100MHz CPU, 32MB RAM, one NIC (network interface card) and a 1GB hard drive. The two PCs need not be identical. In my experiment, I used an AMD K6 350MHz and a Pentium 200 MMX. I chose the AMD as my primary server as it can complete a reboot (you need to do a few reboots for testing) faster than the Pentium 200. With the great support of CFSL (Computers for Schools and Libraries) in Winnipeg, I got some 4GB SCSI hard drives as well as some Adaptec 2940 PCI SCSI controllers. The old and almost obsolete equipment is in good working condition and is perfect for this experiment.

node1

  • AMD K6 350MHz cpu

  • 4G SCSI hard drive (you certainly can use IDE hard drive)

  • 128MB RAM

  • 1.44 Floppy drive

  • 24x CD-ROM (not needed after installation)

  • 3COM 905 NIC

node2

  • Pentium 200 MMX

  • 4G SCSI hard drive

  • 96MB RAM

  • 1.44 Floppy

  • 24x CD-ROM

  • 3COM 905 NIC

The Necessary Software

Both node1 and node2 must have Linux installed. I chose Red Hat and installed Red Hat 7.2 on node1 and Red Hat 6.2 on node2 (I simply wanted to find out if we could build a cluster with different versions of Linux installed on different nodes). Make sure you have installed all dæmons that you want to support. Here is my installation detail:

Hard disk partitions: 128MB for swap and the rest mounted for "/" (so that you don't need to worry about whether there is too much or not enough for a certain subdirectory).

Installed Packages:

  • Apache

  • FTP

  • Samba

  • DNS

  • dhcpd (server)

  • Squid

Heartbeat

Heartbeat comes from the Linux-HA project and is packaged as part of Ultra Monkey; the RPM can be downloaded from www.UltraMonkey.org.

The download is small, and the RPM installation is smooth and simple. However, the documentation or HOWTO for configuration is hard to find and confusing. In fact, that is the reason I decided to write this HOWTO, so that hopefully you can get your cluster set up with fewer problems.

Setting Up the Primary Server (node1) and the Backup Server (node2)

It is not the purpose of this article to show you how to install Red Hat; a lot of excellent documentation can be found at either www.linuxdoc.org or www.redhat.com. I will simply include some of the most important configuration files for your reference:

/etc/hosts
127.0.0.1       localhost
192.168.1.1     router
192.168.1.2     node1
192.168.1.3     node2

This file should be the same on both node1 and node2; you may add any other nodes as you see fit.

Check HOSTNAME (cat /etc/HOSTNAME) and make sure it returns either node1 or node2. If not, you can use this command (uname -n > /etc/HOSTNAME) to fix the hostname problem.

ifconfig for node1

eth0      Link encap:Ethernet  HWaddr 00:60:97:9C:52:28  
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18617 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14682 errors:0 dropped:0 overruns:0 carrier:0
          collisions:3 txqueuelen:100 
          Interrupt:10 Base address:0x6800 
eth0:0    Link encap:Ethernet  HWaddr 00:60:97:9C:52:28  
          inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x6800 
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:38 errors:0 dropped:0 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

Please notice that eth0:0 shows the IP aliasing with IP 192.168.1.4.

ifconfig for node2

eth0      Link encap:Ethernet  HWaddr 00:60:08:26:B2:A4  
          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15673 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:2 txqueuelen:100 
          Interrupt:10 Base address:0x6700 
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:142 errors:0 dropped:0 overruns:0 frame:0
          TX packets:142 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

Install the Heartbeat RPM

If you are using Internet Explorer on Windows, you might have problems accessing FTP (Netscape works much better). I suggest you use either a command-line FTP client or a graphical FTP client for Windows or the X Window System (e.g., WS_FTP) to access the Ultra Monkey FTP site (ftp.UltraMonkey.org).

Once you log in to the Ultra Monkey FTP server, go to pub, then UltraMonkey and then the latest version, 1.0.2 (not the beta). The only package is heartbeat-0.4.9-1.um.1.i386.rpm; save it on your Linux box, log in as root and install it with

rpm -ivh heartbeat-0.4.9-1.um.1.i386.rpm

Null Modem Cable, Crossover Cable, Second NIC

According to the accompanying documentation, you need to install a second NIC on both nodes and connect them with a crossover cable. Besides the second NIC, a null modem cable connecting the serial (com) ports of the two nodes is mandatory (according to the documentation). I followed the instructions in the documentation and installed everything. However, as I did more tests on the cluster, I found that the null modem cable, crossover cable and the second NIC are optional; they are nice to have but definitely not mandatory.

Configuring Heartbeat is the most important part of the whole installation and must be done correctly to get your cluster working. Moreover, the configuration should be identical on both nodes. There are three configuration files, all stored under /etc/ha.d: ha.cf, haresources and authkeys.

My /etc/ha.d/ha.cf

debugfile /var/log/ha-debug
#
#       File to write other messages to
#
logfile /var/log/ha-log
#
#       Facility to use for syslog()/logger 
#
logfacility     local0
#
#       keepalive: how many seconds between heartbeats
#
keepalive 2
#
#       deadtime: seconds-to-declare-host-dead
#
deadtime 10
udpport 694
#
#       What interfaces to heartbeat over?
#
udp     eth0
#
node    node1
node    node2
#
# ------> end of ha.cf

Whatever is not shown above, you can simply leave as it was (all commented out by the #). The last three options are most important:

udp     eth0
#
node    node1
node    node2

Unless you have a crossover cable, you should use your eth0 (your only NIC) for udp; the two node names at the end of the above file must be the same as those returned by uname -n on each node.

My /etc/ha.d/haresources

node1 IPaddr::192.168.1.4 httpd smb dhcpd

This is the only line you need; in the above example, I included httpd, smb and dhcpd. You may add as many dæmons as you want, provided they are spelled exactly the same as the scripts under /etc/rc.d/init.d.
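Two of the requirements above are easy to get wrong: the node names in ha.cf must match what uname -n returns, and every dæmon named in haresources must match a script under /etc/rc.d/init.d. The following is a minimal sketch of a check for both. The function names are made up for this example, and the sketch assumes that entries containing "::" (such as IPaddr) are Heartbeat's own resource scripts and can be skipped.

```shell
#!/bin/sh
# Succeed if the ha.cf file ($1) has a "node" line naming host $2.
node_listed() {
    awk -v host="$2" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
        END { exit !found }
    ' "$1"
}

# Print each service named in the haresources file ($1) that has no
# executable script in the init directory ($2). Comment lines, the node
# name in field 1 and parameterised entries such as IPaddr::... are skipped.
missing_scripts() {
    awk '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i !~ /::/) print $i }' "$1" |
    while read -r name; do
        [ -x "$2/$name" ] || echo "$name"
    done
}

# Typical use on each node:
#   node_listed /etc/ha.d/ha.cf "$(uname -n)" || echo "this host is not in ha.cf"
#   missing_scripts /etc/ha.d/haresources /etc/rc.d/init.d
```

If the second command prints nothing, every service named in haresources has a matching init script.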

My /etc/ha.d/authkeys

You don't need to add anything to this file, but you have to issue the command

chmod 600 /etc/ha.d/authkeys
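The packaged file worked unchanged in my setup, but a reader comment below reports that Heartbeat would not start until authkeys named an authentication method and was identical on both nodes. If you run into that, a minimal file can be written as sketched here. The function name and the secret are placeholders invented for this example; the two-line layout (an auth line selecting a key number, then that key's method and secret) is the format described in the comments of the sample authkeys file.

```shell
#!/bin/sh
# Write a minimal authkeys file and make it readable by its owner only.
# $1 = target path (normally /etc/ha.d/authkeys)
# $2 = shared secret (placeholder; use the same value on both nodes)
write_authkeys() {
    keyfile=$1
    secret=$2
    # The subshell keeps the stricter umask from leaking into the caller.
    ( umask 077; printf 'auth 1\n1 md5 %s\n' "$secret" > "$keyfile" ) &&
    chmod 600 "$keyfile"
}

# Typical use, as root, on node1 and again on node2:
#   write_authkeys /etc/ha.d/authkeys mysecret
```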

Start the Heartbeat Daemon

You may start the dæmon with

service heartbeat start

or

/etc/rc.d/init.d/heartbeat start

Once heartbeat is started on both nodes, you will find that the ifconfig from the primary server will return something like:

node1
ifconfig for node1
eth0      Link encap:Ethernet  HWaddr 00:60:97:9C:52:28  
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18617 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14682 errors:0 dropped:0 overruns:0 carrier:0
          collisions:3 txqueuelen:100 
          Interrupt:10 Base address:0x6800 
eth0:0    Link encap:Ethernet  HWaddr 00:60:97:9C:52:28  
          inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x6800 
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:38 errors:0 dropped:0 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

When you see the line eth0:0, heartbeat is working, and you can try to access the server by using http://192.168.1.4 and check the log files /var/log/ha-log. Also, check the log file on node2 (192.168.1.3) and try

ps -A | grep dhcpd

and you should find no running dhcpd on node2.
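If you would rather script the check than read the ifconfig output by eye, a small helper along these lines can tell you whether a node currently holds the floating IP. The function name is invented for this sketch, and it assumes the "inet addr:" output format shown in the listings above; other versions of ifconfig print addresses differently.

```shell
#!/bin/sh
# Succeed if the ifconfig output read from stdin contains address $1.
# The trailing space keeps 192.168.1.4 from also matching 192.168.1.40,
# and -F makes grep treat the dots literally.
has_address() {
    grep -qF "inet addr:$1 "
}

# Typical use on either node:
#   ifconfig | has_address 192.168.1.4 && echo "floating IP is active here"
```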

Now for the real HA test: shut down the primary server (node1: 192.168.1.2). Don't just cut the power; issue reboot or press Ctrl-Alt-Del and wait until everything has shut down properly before you turn off the PC.

Within ten seconds (the deadtime set in ha.cf), go to node2 and run ifconfig. If you see the IP alias eth0:0, you are in business and have a working HA two-node cluster.

eth0      Link encap:Ethernet  HWaddr 00:60:08:26:B2:A4  
          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15673 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:2 txqueuelen:100 
          Interrupt:10 Base address:0x6700 
eth0:0    Link encap:Ethernet  HWaddr 00:60:08:26:B2:A4  
          inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15673 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:2 txqueuelen:100 
          Interrupt:10 Base address:0x6700 

You can try

ps -A | grep dhcpd

or you can try to release and renew the IP info on your Win9x workstation, and you should see the new address for the dhcpd server.

Commercial Products

Commercial products from Red Hat, TurboLinux and PolyServe use the same concept of IP aliasing. When the primary server goes down, the backup server will pick up the same aliasing IP so that high availability can be achieved.

The cluster product from PolyServe is very sophisticated. It supports SAN (storage area network) and can handle more than two nodes. It is very easy to install and configure. I successfully configured the trial version through a Windows monitoring client without reading any documentation. However, sophistication comes with a price tag, and the software costs more than a thousand dollars for a two-node cluster. A cluster running the 30-day trial version stops after two hours, which makes it not much fun for testing.

The cluster product from TurboLinux needs some fine-tuning. The installation documentation is confusing (or maybe they simply don't want people to do it themselves). The web configuration tool is unstable; the CGI script will crash whenever the user clicks the reload or refresh button. And of course, as a commercial product, it comes with a high price tag.

Linux is very stable and reliable, and it is quite common to have our servers up and running for a few hundred days at a time. Heartbeat worked fine in my tests, and if you are looking for a way to get higher availability for a small business or educational institution, Heartbeat is definitely a good option.

__________________________


Linux Journal: delivering readers the advice and inspiration they need to get the most out of their Linux systems since 1994.

Building 2node clus !!!

On April 23rd, 2008 Johny Anthony (not verified) says:

Hi,

Thanks a lot for the notes !!!
Had a query though...

What's the configuration for the second set of NICs on both nodes?

Secondly, what kind of installation should I go for: the cluster package in Red Hat or a normal Red Hat server installation?

I have RHEL 5; will that work?

Regards,
Johny:)

Article has Incorrect IP Address for Node 2

On November 8th, 2006 Marcus (not verified) says:

The ifconfig listings for node 2 I believe are incorrect. The inet address for eth0 should be 192.168.1.3 *not* 192.168.1.2.

Sharing the same address as node 1 would create a conflict.

HA or Cluster not DHCP

On September 22nd, 2006 Jim Balcomb (not verified) says:

heartbeat is excellent. If you do this for Apache, use ldirectord for load balancing and redundancy then use heartbeat to have ldirectord on two servers for fail-over redundancy.

ISC DHCP already supports multiple nodes in fail-over mode and is somewhat load balanced.

Defrag does not recover lost clusters.

SAN stands for Storage Area Network not Server Area Network.

The discussion about the MAC/ARP on a switch is incorrect and there is no need to run this set up on a hub.

Installation Head Acks!

On August 5th, 2005 Anonymous (not verified) says:

Hi All,

I am a student and I am working on a project for my class. I have no prior experience with Linux (which is one of the requirements for the project). I have two computers with Red Hat 9 and I have tried to start the installation process of Heartbeat on one of them.
I have installed several RPMs but have run into a snag.
When trying to install heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1.i386.rpm I receive this error:

error: Failed dependencies:
perl-Mail-IMAPClient is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1
perl-Net-DNS is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1
perl-ldap is needed by heartbeat-ldirectord-1.2.3.cvs.20050404-1.rh.rl.um.1

I understand that this is because prerequisite RPMs are missing (or at least I think I do), but I have the "Mail" and "ldap" RPMs and get similar results when I try to install them. I don't know where the "Net-DNS" one is. Any help would be greatly appreciated. Thank you.

Installing Perl Packages

On November 10th, 2005 Anonymous (not verified) says:

RPMs are the devil's work.

Load up cpan

> cpan

install Net::DNS

...

lather, rinse repeat as you need.

Re: Building a Two-Node Linux Cluster with Heartbeat

On August 13th, 2004 Anonymous says:

no comment !

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 17th, 2002 Anonymous says:

When the secondary node in this type of failover happens, shouldn't the virtual IP on the new node have the same physical (MAC) address of the old one? If not, won't this confuse the hell out of the switch that it's attached to?

Re: Building a Two-Node Linux Cluster with Heartbeat

On July 28th, 2004 Anonymous says:

That's why an ARP change broadcast is issued by heartbeat :)

Re: Building a Two-Node Linux Cluster with Heartbeat

On November 17th, 2003 Anonymous says:

I thought that it would confuse the switch, but it sends an arp command to release the ip address before taking over.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 3rd, 2003 Anonymous says:

You should only use hubs on this kind of setup and NOT a switch.

Heartbeat just sends out gratuitous arps to get the IP takeover so all the other machines just renew their routing tables.

Obviously using a switch would 'cause drastic problems that you are mentioning here.

No, you can also use a switch.

On February 9th, 2005 Vipul (not verified) says:

No,
you can also use a switch. The main concern is that everything should be physically connected.

regards,
Vipul Ramani

Re: Building a Two-Node Linux Cluster with Heartbeat

On October 25th, 2002 Anonymous says:

The HUB should be used, of course!

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 19th, 2002 Anonymous says:

At any time, there will be just one virtual IP. If the master node is up, the virtual IP is tied to the master node; however, when the master node is down, the secondary node will take over the virtual IP. The process is just like assigning an IP address to different servers at DIFFERENT times.

As far as the switch is concerned, it will only keep the MAC address for a certain period of time; after seeing that the virtual IP has been reassigned to another server with a different MAC address, it will simply update its own MAC lookup table. In this case, I don't think it will confuse anyone or anything such as the switch.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 17th, 2002 eckes (not verified) says:

Depends on your network equipment. Heartbeat (or fake) can be set up to do gratuitous ARP for the address which has just moved. Most equipment will simply update the ARP cache and work with the new NIC. Of course you have to check if your switch and router are happy with that.

This is the most common system for failover clusters, since in this scenario network outages usually happen anyway.

BTW: heartbeat and fake work best if you have static applications; for replication you may be better off using shared storage clusters like kimberlite. For some applications, especially web application servers, it is much better to run them in a load balancing configuration, because the failover time is smaller and the hardware is better used.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

If you want to see a small sampling of the kinds of mission-critical applications people are running on heartbeat, then you might want to look here:


http://linux-ha.org/heartbeat/users.html

Heartbeat has been in production for several years, and is in use in hundreds of mission-critical sites across the world.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

My thanks to CT on the excellent article and to the replies adding useful information. Question --- Using Samba, Pentium 3's, 1GB RAM, 1GHz Processor, how many users can this safely handle? I/we initially plan to use it to serve approx six MS apps that require frequent upgrades/patches to approx 300 users. Thanks again, art557@pacbell.net

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

I've been using Heartbeat for over a year to provide high-availability to a mission critical application.

The application employs a replicated (2-way) database, a servlet container, and several batch processes.

I've augmented Heartbeat with a program that monitors all critical resources needed by the application. It initiates a Heartbeat failover should any of these resources become unuseable.

This combination has worked flawlessly.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

Interesting, but without replication or completely static data, not much use other than a toy with which to play.

Reasons for alternate NICs with a crossover cable OR a com cable: if the network device both nodes are connected to has a problem, and comes back after that 10 seconds (say some bozo kicks the power plug out of the switch), now you've got 2 nodes reacting to the same IP address and unpredictable results to say the least. The crossover or com connection provides a non-DOS-able failsafe.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 14th, 2002 Anonymous says:

"a toy with which to play" ? Never installed it have you ?

Oh, and split clusters are the reason why we use separate switches for each machine AND pass a heartbeat signal over the same network that the clients see the service over .....

So, no SPOF.

Works for us and for many many others :-)

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

Many people use heartbeat in combination with shared storage, a general-purpose replication mechanism like DRBD, or an application-specific replication mechanism like those that come with LDAP or DNS. With replication, even very cheap hardware can be made to avoid all SPOFs. Heartbeat integrates well with DRBD to provide general partition-layer replication of data.

Certainly you want more than one heartbeat connection for lots of different reasons.

There are hundreds of production users of heartbeat all over the world. For a few examples see:

http://linux-ha.org/heartbeat/users.html

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 jsw (not verified) says:

It is true that replication of dynamic data is an important issue. However, this example serves the purpose of creating the lowest layer of an HA system, namely knowing when to switch to node2. The next layer would be straight software-driven replication built on top of this example failover design.

An alternative to replication could be storage shared between the two nodes. I wonder if someone out there would have an inexpensive shared storage design?

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

The CVS version of heartbeat has just added support for the IBM ServeRAID RAID controllers so that two machines can share a SCSI string and fail over (correctly) between the two controllers. These RAID controllers guarantee that only one side at a time will access any given logical volume. I think the retail on these devices starts at around $600 USD. It's a lot cheaper than an FC box.

how come the ServeRAID can

On March 17th, 2006 Afif (not verified) says:

How can the ServeRAID controller guarantee that only one node accesses the shared storage, when the other node cannot identify the logical volume of the enclosure while booting for the first time?
If you have any experience, would you share how to set up this hardware? I would like to set up Oracle 10g RAC using IBM x346, ServeRAID 6M and EXP400 on Linux with heartbeat.
Warm Regards,

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

This tool has worked almost flawlessly at our site for the last year. It turned our Samba and FTP servers into highly-available services for the cost of a couple of obsolete PCs.

Now we're just left with Heartbeat Clusters backended by EMC Celerras.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

Be careful running any process like DHCP in this scenario! If the DHCP database from the master server is not being replicated to the HA backup, whenever the HA backup takes over it will begin assigning IP Addresses to machines regardless of what the master was already assigning (ie: A station that gets a DHCP address after restart will probably get a duplicate IP address to a system already running).

Re: Building a Two-Node Linux Cluster with Heartbeat

On July 21st, 2003 Anonymous says:

You would probably not run a high-availability DHCP server. The DHCP protocol has redundancy built in; clients are expected to handle DHCPOFFERs from multiple servers on the same subnet. If you're going to throw together the hardware for a failover node, you might as well make both active with distinct scopes.

DHCP RFC: http://www.ietf.org/rfc/rfc2131.txt?number=2131

Re: Building a Two-Node Linux Cluster with Heartbeat

On May 3rd, 2002 Anonymous says:

This is why getting a SCSI system going on each host, linked to a common array, is really needed. Data can be stored on the array so it can be shared.

To be really useful "we" need something like this.

regards

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 13th, 2002 Anonymous says:

If you make /var/lib/dhcp a mount point, and mount it on a DRBD volume, then you should be in pretty good shape. Then the secondary machine will have access to the DHCP leases when it takes over. Of course, you want to make /var/lib/dhcp a journalling filesystem.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 12th, 2002 ghostdancer (not verified) says:

This sounds more like a backup solution, right? When I first saw the title, I thought it was talking about Beowulf; I guess I was wrong.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 18th, 2002 Anonymous says:

It's about a high availability cluster.

In this case it is in its most minimal form, since the second node is no more than a hot stand-by (or hot backup if you prefer).

However, without too much complication you can add load balancing capabilities to this setup and gain a lot more.

Re: Building a Two-Node Linux Cluster with Heartbeat

On March 12th, 2002 Anonymous says:

"Most Windows users are probably familiar with lost clusters--something that can be rectified by running the defrag utility."

Now... Running defrag to recover lost clusters isn't a good idea. I'd rather run scandisk, or in the good old days, chkdsk /f. But then again, I don't use Microsoft products any longer ;P

Re: defrag?

On March 13th, 2002 Anonymous says:

I have never even heard of running defrag for a windows disk problem and no one I know has ever run defrag for that purpose. Defrag is for consolidating data not fixing errors. Why would you even suggest doing this???? Scandisk is the standard tool that everyone uses and runs automatically most times you have a lockup or improper shutdown.

Please for anyone reading the above message do NOT run defrag if you are getting disk errors. Run scandisk. If you continue to have problems your disk may be going bad.

Re: defrag?

On March 19th, 2002 Anonymous says:

I try not to use Windows, but the last time I ran defrag (win98), scandisk was first invoked automatically.

I also thought this article would be about clusters (a la Beowulf), at least based on the title. Anyway, it was easy to read and I learned something new!

(linux.com wouldn't let me log in)

apache cluster on san

On April 18th, 2005 gops (not verified) says:

hi,

ok heartbeat works great - really appreciated ! and
i am amazed the way it worked perfectly.
how do we mount the san partition for that IP service in active-active apache cluster.
do i make an entry in /etc/fstab or is that a sin ?
I mean, I need to mount /dev/sdc1 on /var/www/html, so do I make an entry in the fstab file?

gops

No Separate Storage?

On April 26th, 2006 Data Sheet (not verified) says:

Hi,
so it means I have two Linux computers, I can have Cluster installed. So I do not need any separate storage device right?

How shared storage work in this case?

Thanks,
Data Sheet

Mistaken config

On July 29th, 2007 Anonymous (not verified) says:

Why do you assign in hosts file "node1" and "node2" if you will use "atm1" and "cluster1" ?

if you choose this way "auth errors" will happen, all clusters will be in master role, it is conflict, so assign as below

in ha.cf on both clusters
node node1
node node2

in haresources on both clusters (if you want node1 to be master)

node1 virtualaddress httpd

and you must set authkeys files the same on all clusters otherwise it will not start.

auth 1
1 md5 "mysecret"

and for professional HA cluster we need null modem cable, on NIC perspective SPOFs will load extra bandwidth. For storage networks that is not permittable.

Thanks,
