DRBD in a Heartbeat
About three years ago, I was planning a new server setup that would run our new portal as well as e-mail, databases, DNS and so forth. One of the most important goals was to create a redundant solution, so that if one of the servers failed, it wouldn't affect company operation.
I looked through a lot of the redundant solutions available for Linux at the time, and with most of them, I had trouble getting all the services we needed to run redundantly. After all, there is a very big difference in functionality between a Sendmail dæmon and a PostgreSQL dæmon.
In the end, though, I did find one solution that worked very well for our needs. It involves setting up a disk mirror between machines using the software DRBD and a high-availability monitor on those machines using Heartbeat.
DRBD mirrors a partition between two machines allowing only one of them to mount it at a time. Heartbeat then monitors the machines, and if it detects that one of the machines has died, it takes control by mounting the mirrored disk and starting all the services the other machine is running.
I've had this setup running for about three years now, and it has made the inevitable hardware failures unnoticeable to the company.
In this tutorial, I show you how to set up a redundant Sendmail system, because once you do that, you will be able to set up almost any service you need. We assume that your master server is called server1 and has an IP address of 192.168.1.1, and your slave server is called server2 and has an IP address of 192.168.1.2.
And, because you don't want to have to access your mail server on any of these addresses in case they are down, we will give it a virtual address of 192.168.1.5. You can, of course, change this to whatever address you want in the Heartbeat configuration that I discuss near the end of this article.
This high-availability solution works by replicating a disk partition in a master/slave mode. The server that is running as a master has full read/write access to that partition; whereas the server running as slave has absolutely no access to the partition but silently replicates all changes made by the master server.
Because of this, all the processes that need to access the replicated partition must be running on the master server. If the master server fails, the Heartbeat dæmon running on the slave server will tell DRBD that it is now the master, mount the replicated partition, and then start all the processes that have data stored on the replicated partition.
The first step for running a redundant system is having two machines ready to try it out. They don't need to have identical specs, but they should meet the following requirements:
Enough free space on both machines to create an equal-sized partition on each of them.
The same versions of the dæmons you want to run across both machines.
A network card with crossover cable or a hub/switch.
An optional serial port and serial port crossover cable for additional monitoring.
You also should think carefully about which services you want running on both machines, as this will affect the amount of hard disk you will need to dedicate to replication across them and how you will store the configuration and data files of these services.
It's very important that you have enough space on this shared partition, because it will be the main data storage location for all of these services. So, if you are going to be storing a large Sendmail spool or a database, you should make sure it has more than enough space to run for a long time before having to repartition and reconfigure DRBD for a larger disk size.
Once you've made sure your machines are ready, you can go ahead and create an equal-sized partition on both machines. At this stage, you do not need to create a filesystem on that partition, because you will do that only once it is running mirrored over DRBD.
For my servers, I have one DRBD replicated drive that looks like this on my partition tables:
/dev/sda5 7916 8853 7534453+ 83 Linux
Note: type fdisk -l at your command prompt to view a listing of your partitions in a format similar to that shown here. Also, in my case, the partition table is identical on both redundant machines.
The next step after partitioning is getting the packages for Heartbeat version 1.2+ and DRBD version 0.8+ installed and the DRBD kernel module compiled. If you can get these prepackaged for your distribution, it will probably be easier, but if not, you can download them from www.linux-ha.org/DownloadSoftware and www.drbd.org/download.html.
Now, go to your /etc/hosts file and add a couple lines, one for your primary and another for your secondary redundant server. Call one server1, the other server2, and finally, call one mail, and set the IP addresses appropriately. It should look something like this:
192.168.1.1 server1 192.168.1.2 server2 192.168.1.5 mail
Finally, on both your master and slave server, make a folder called /replicated, and add the following line to the /etc/fstab file:
/dev/drbd0 /replicated ext3 noauto 0 0
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- RSS Feeds
- What's the tweeting protocol?
- New Products
- Trying to Tame the Tablet
- Readers' Choice Awards
- IT industry leaders
2 hours 3 min ago - Reply to comment | Linux Journal
18 hours 51 min ago - Reply to comment | Linux Journal
21 hours 24 min ago - Reply to comment | Linux Journal
22 hours 41 min ago - great post
23 hours 16 min ago - Google Docs
23 hours 38 min ago - Reply to comment | Linux Journal
1 day 4 hours ago - Reply to comment | Linux Journal
1 day 5 hours ago - Web Hosting IQ
1 day 6 hours ago - Thanks for taking the time to
1 day 8 hours ago






Comments
Thanks
Thank You, check http://docs.homelinux.org for other tutorials about drbd. It's also good explained like Your articel.
DRBD.conf
Hye.!
IP of ur server1 is 192.168.1.1 instead of 192.168.1.3
is'nt it .
is it (192.168.1.3) taken here by some mistake or something else...
;-}
drbd after failure
I have done the above and setup drbd and heartbeat. I am having an issue where once say node-a looses it's network connection the failover happens as expected, node-b mounts the disk and everything is there; but when node-a comes backup it drbd is not being started as a slave and it is taking back over as primary and I loose all my new data that node-b created.
how can I get drbd to play nicer?
====DRBD.conf=====
global {
minor-count 1;
}
resource mysql {
# * for critical transactional data.
protocol C;
on server-1 {
device /dev/drbd0;
disk /dev/sdb1;
address 192.168.0.128:7788;
meta-disk internal;
}
on server-2 {
device /dev/drbd0;
disk /dev/sdb1;
address 192.168.0.129:7788;
meta-disk internal;
}
disk {
on-io-error detach;
}
net {
max-buffers 2048;
ko-count 4;
}
syncer {
rate 10M;
al-extents 257;
}
startup {
wfc-timeout 30;
degr-wfc-timeout 120;
}
}
=====END======
====ha.cf=====
logfacility local0
keepalive 500ms
deadtime 10
warntime 5
initdead 30
ucast eth0 192.168.0.129
#mcast eth0 225.0.0.1 694 2 0
auto_failback off
node server-1
node server-2
respawn hacluster /usr/lib/heartbeat/ipfail
use_logd yes
logfile /var/log/hb.log
debugfile /var/log/heartbeat-debug.log
====END=======
Is the an error in Listing 1 (drbd.conf)?
I'm in the midst of setting up my first HA cluster and want to be sure I didn't miss something. Shouldn't the address line in the on server section be 192.168.1.1 rather than .1.3? If not, what did I miss?
TIA,
Laker
mounting drbd drives with heartbeat
I can get drbd drive to startup with heartbeat, I can fail and have the primary and secondary change. The problem I am having is that I can not get heartbeat to mount the drives, I can mount them just fine with the mount command.
I am not sure in the article how the drives are being mounted.
Does anyone know how I would mount the /dev/drbd0 to /mnt/drbd0 with heartbeat?
mount drives with heartbeat
I have the same issue!
Did you find a way out? My HA-Cluster with DRBD works great if I manually mount my replicated drive. But it refuses to do automatically with heartbeat.
Any help/leads will be appreciated.
Thanks
Try mounting with resource
Try mounting with resource script Filesystem. This works for me and is mentioned in some other how to articles...
Filesystem::/dev/drdb0::/data::ext3
Try on commandline w/o the ::
NFS support is generally quite tricky
That should be at least cited in the article. Heartbeat based service with a floating IP address could be extremely tricky for locking, when used with NFS servers. Also, I would avoid to use mbox based mail spool dirs on drdb partitions, maildirs are much more safe. That's my 2 cents.
NFS support not that difficult
NFS works well - including locking. Dozens to hundreds of sites have it working quite nicely. You do have to set things up correctly, but the Linux-HA web site and several other articles (pointed to by the PressRoom page) explain how to do that in detail. There is a hole where an extremely active application can possibly fail to get a lock during a failover, but it's happens rarely.
However, you do have to set it up correctly.