High Availability Linux with Software RAID
Creating a bootable CD recovery disk can be done easily with the mkbootdisk utility. In order to include the /boot partition on the recovery CD, however, a small patch needs to be applied to mkbootdisk (Listing 1). Also, you must have the mkisofs package installed. The following commands, issued as root, take care of this:
cd /sbin
cp mkbootdisk mkbootdisk.orig
patch -p0 < mkbootdisk.patch
After the patch is applied, the following command creates the bootable recovery CD:
cd /tmp
mkbootdisk --device bootcd.iso --iso 2.4.18-14
When using the --iso option, the specified --device is expected to be a filename to which an ISO image will be written. The last parameter, 2.4.18-14, specifies which kernel to use.
We can check the ISO image by using the following commands:
cd /tmp
losetup /dev/loop1 bootcd.iso
mount /dev/loop1 /mnt
They create a loopback device on which the ISO image is then mounted. Upon inspection, you should see the complete /boot directory on the CD image.
For a physical machine, this image would be burned onto a CD. For the purposes of testing, VMware can use an ISO image directly as a virtual CD-ROM drive.
Now for the fun part. One of the advantages of using VMware for testing is the ability to fail hardware without having to worry about possible repercussions to physical hardware. In order to ensure that the system behaves as expected, I ran two failure tests: failing a pure RAID drive and failing a mixed native and RAID drive.
To fail a drive under VMware, I simply shut down the VM, move the files representing a particular virtual drive to a backup folder and re-create a fresh virtual drive. This process effectively creates a fresh unpartitioned drive--exactly what the situation would be if a drive had failed and been replaced.
For the first test, I "failed" the fourth drive in the array. After a successful boot in the VM, I looked at /proc/mdstat:
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdf1 sde1 sdd1 sdc1 sdb1
      1027584 blocks level 5, 64k chunk, algorithm 0 [5/4] [UU_UU]
md1 : active raid5 sdf2 sde2 sdd2 sdc2 sdb2 sda2
      44780800 blocks level 5, 64k chunk, algorithm 0 [6/5] [UUU_UU]
It is a little counterintuitive, but the status is indicated starting with the lower drive numbers from left to right. So, for md0, [UU_UU] indicates that drives 0 and 1 are up, drive 2 is down and drives 3 and 4 are up. These correlate to sdb1, sdc1, sdd1, sde1 and sdf1, respectively. For md1, [UUU_UU] indicates that drives 0 through 2 are up, drive 3 is down and drives 4 and 5 are up. These correlate to sda2, sdb2, sdc2, sdd2, sde2 and sdf2, respectively.
As we would expect, the sdd drive has failed. At this point the RAID is running in degraded mode. If another drive were to fail, there would be data loss.
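Reading the [UU_UU] strings by eye gets error-prone once several arrays are involved. As a convenience (a sketch of my own, not part of the setup above; the function name and file argument are illustrative), a small shell function can scan an mdstat-format file and flag any degraded array:

```shell
#!/bin/sh
# check_raid_status FILE: scan a /proc/mdstat-style file and report, for
# each md device, whether its member-status string (e.g. [UU_UU]) shows a
# down member ("_") or all members up.
check_raid_status() {
    awk '
        /^md/ { dev = $1 }              # remember the current md device
        $NF ~ /^\[[U_]+\]$/ {           # the status string ends the blocks line
            if ($NF ~ /_/) print dev, "degraded", $NF
            else           print dev, "ok", $NF
        }
    ' "$1"
}

# Example: check_raid_status /proc/mdstat
```

Run against the listing above, it would report md0 and md1 as degraded; after a successful resync, both lines switch to "ok".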
We can reintegrate the "new" drive into the array while the system is running. To do this, we need to partition the drive and use the raidhotadd utility. The drive should be partitioned exactly as it was originally. For this drive, both partitions are of type Linux raid autodetect (fd). After the drive is repartitioned, execute the following commands:
raidhotadd /dev/md0 /dev/sdd1
raidhotadd /dev/md1 /dev/sdd2
cat /proc/mdstat
Afterward, you should see output like the following:
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdf1 sde1 sdd1 sdc1 sdb1
      1027584 blocks level 5, 64k chunk, algorithm 0 [5/4] [UU_UU]
      [===>.................]  recovery = 18.3% (47816/256896) finish=0.5min speed=6830K/sec
md1 : active raid5 sdf2 sde2 sdd2 sdc2 sdb2 sda2
      44780800 blocks level 5, 64k chunk, algorithm 0 [6/5] [UUU_UU]
When the sync process is finished for md0, a similar process begins for md1. When completed, you should see that /proc/mdstat appears as it did earlier (with all the Us present) and that the array is no longer in degraded mode.
For the second test, I "failed" the first drive in the array. For this test, we must have the bootable CD-ROM created earlier. The image can either be burned onto a CD or referenced directly in the VMware configuration (Figure 4).
When you boot off the CD, the welcome screen created by the mkbootdisk script appears (Figure 5). The boot fails part way through when the system attempts to mount the /boot partition. This is because the drive /dev/sda1 is not available. Enter the root password to get to maintenance mode, and then edit the filesystem table file using the command vi /etc/fstab. For now, simply comment out the line that contains the /boot entry. On my installation, the fstab file had a label reference for the /boot entry. I prefer to reference the drive directly, so I changed this entry to /dev/sda1 and then commented it out. Type exit and the system reboots, again booting off the CD. This time, it is able to start up completely.
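If you would rather not hand-edit the file from maintenance mode, the same change can be scripted. The following helper is a sketch of my own; it assumes the /boot entry was changed to reference /dev/sda1 directly, as described above:

```shell
#!/bin/sh
# disable_boot_entry FILE: comment out the /boot line in an fstab-format
# file.  Assumes the entry begins with /dev/sda1, per the change above.
disable_boot_entry() {
    sed 's|^/dev/sda1|#/dev/sda1|' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example: disable_boot_entry /etc/fstab
```

Remember that in maintenance mode the root filesystem may be mounted read-only; remount it read-write first with mount -o remount,rw /.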
You should notice that the md1 RAID volume is running in degraded mode by inspecting /proc/mdstat, as before. The tasks to restore the failed first drive are as follows:
Partition the drive.
Use the raidhotadd utility to rebuild the md1 RAID.
Format the native partition on the drive.
Copy the /boot files from the CD to the drive.
Uncomment the /boot entry in /etc/fstab.
Install the GRUB boot loader in the MBR (master boot record) of the drive.
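Taken together, the six steps can be sketched as a single shell function. This is my own consolidation, not a script from the article: the device names, mount points and the grub --batch invocation all assume the layout described above, and every step is destructive, so treat it as an outline rather than something to run verbatim.

```shell
#!/bin/sh
# restore_first_drive: the six recovery steps as one script, under the
# article's assumptions (sda1 = native /boot, sda2 = md1 member, recovery
# CD mounted on /mnt/cdrom, /boot entry commented out as "#/dev/sda1").
restore_first_drive() {
    sfdisk -d /dev/sdb | sfdisk /dev/sda    # 1. partition like a healthy member
    raidhotadd /dev/md1 /dev/sda2           # 2. rebuild md1 (resyncs in background)
    mke2fs /dev/sda1                        # 3. format the native partition
    mount /dev/sda1 /boot                   # 4. restore /boot from the CD
    cp -p -r /mnt/cdrom/boot/* /boot
    sed 's|^#/dev/sda1|/dev/sda1|' /etc/fstab > /etc/fstab.tmp \
        && mv /etc/fstab.tmp /etc/fstab     # 5. uncomment the /boot entry
    printf 'root (hd0,0)\nsetup (hd0)\nquit\n' \
        | grub --batch                      # 6. reinstall GRUB in the MBR
}
```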
The drive should be partitioned exactly as it was originally. That is, the first 250MB partition should be type Linux (83), and the second 8750MB partition should be type Linux raid autodetect (fd). You can then enter the command:
raidhotadd /dev/md1 /dev/sda2
to rebuild the md1 RAID. Inspect /proc/mdstat as before to check on the status of the synchronization process.
The native partition should be formatted with the command mke2fs /dev/sda1. Assuming that the CD-ROM drive is mounted on /mnt/cdrom, the following commands restore the /boot partition:
mount /dev/sda1 /boot
cp -p -r /mnt/cdrom/boot/* /boot
Next, edit /etc/fstab and uncomment the line containing the /boot partition. Finally, use GRUB to install the boot loader on the drive's MBR. A thorough discussion of GRUB is outside the scope of this article, but the following commands use the original GRUB configuration defined when Red Hat 8.0 was installed:
grub
root (hd0,0)
setup (hd0)
quit
Once the md1 RAID is rebuilt, the system is ready to be rebooted without the recovery CD. Make sure the recovery CD is removed from the CD-ROM drive or that the image reference in the VMware configuration is removed, and reboot. The system should come up normally. A look at /proc/mdstat should show both RAID volumes, with all members up and running.