RAID-1, Part 1

What it is, when to use it, how to make a RAID-1 device for an ext2 partition.

Part 1 of this two-part series describes RAID, the cases in which RAID-1 is useful, the RAID-1 installation requirements and how to install RAID-1 when you have an existing ext2 filesystem. Part 2 covers how to set up RAID-1 with an existing and a new swap partition, how to boot from a RAID-1 device, how to use a RAID-1 array to facilitate backing up a busy filesystem or a database that cannot be taken off-line for long and how to set up monitoring scripts that will notify you of problems.

We wrote this article because we could not find a complete description of the setup process, and we wanted to document what we learned to make it easier for others to implement RAID-1. We learned how to implement RAID by reviewing the "Software-RAID HOWTO" and Usenet correspondence as well as through trial and error. This article includes information from the HOWTO and Usenet archives. Please read the HOWTO and the other resources called out in the references section for a lot more information about RAID.

This article focuses on RAID-1. The Linux RAID code supports five modes: linear mode, RAID-0, RAID-1, RAID-4 and RAID-5. RAID-1 maintains an exact mirror of the data on one disk on another disk. If one of the active RAID disks is removed (or fails), the data are still intact. If spare disks are available and the system survived the crash, reconstruction of the mirror begins immediately on one of the spare disks. If there is no spare disk, the system continues to run on the remaining good disk until you can obtain and install a replacement. RAID-1 is an effective, inexpensive way to help ensure that your system stays up when a hard disk fails. You also can use a RAID-1 device to facilitate backing up a busy filesystem.

A RAID-1 device (e.g., /dev/md0) maintains an exact copy (mirror) of the data in a given partition (e.g., /dev/hda2) on a separate partition (e.g., /dev/hdc2). The Linux RAID code mirrors partitions, not entire disks. The partitions that make up a RAID device set should be on separate hard disks; otherwise, a single disk failure takes out both copies. Write performance is slightly worse than on a single device, because identical copies of the data must be sent to every disk in the array, and a write is not complete until all of the disk writes have finished. Reads may be faster than without RAID-1, depending on the read-balancing strategy that is implemented. We did not benchmark our RAID setups; we use RAID to provide data redundancy, not to improve disk performance.

We suffered a hard disk failure on our production web server a few months after we installed RAID-1 on the system. During a routine review of our system logs, we noticed that a hard disk partition had failed. The RAID code is truly transparent: we didn't notice the failure until three days after it happened. If we had not been using RAID-1, we would have found out when the system crashed. We have about twelve staff members who work on the server and hundreds of users who access our web site on a daily basis. They would have lost time had the server gone down. Instead, we replaced the failed disk during scheduled downtime. A second (identical) disk failed a few weeks later with similar results. This was a much better and less stressful outcome for all concerned. We have since set up scripts that monitor the status of the RAID devices and send e-mail alerts when there are problems.

There are three requirements for RAID-1: the kernel must support RAID-1, the RAID device driver must be compiled into the kernel or be available as a module and raidtools must be installed. Technically, you can use RAID with just one hard disk, but that gives no protection against a disk failure, so most installations use more than one disk.

If /proc/mdstat exists, the running kernel has RAID support. The Personalities line indicates which RAID levels are available. For example:

     more /proc/mdstat
     Personalities : [RAID1]
     read_ahead not set
     unused devices: <none>

indicates that the kernel supports RAID and the RAID-1 device driver is loaded (RAID-1 may have been compiled into the kernel or have been loaded as a module).
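The check described above can be scripted. The following is a minimal sketch, not a definitive test: has_raid1 is a hypothetical helper name, and the optional file argument is an assumption added so the function can be tried against a saved copy of the listing instead of the live /proc/mdstat.

```shell
#!/bin/sh
# has_raid1 [FILE]
# Hypothetical helper: succeed if an mdstat-format file lists the RAID-1
# personality. FILE defaults to /proc/mdstat.
# Exit status: 0 = RAID-1 listed, 1 = not listed, 2 = file not readable.
has_raid1() {
    mdstat="${1:-/proc/mdstat}"

    # If the file is missing, the running kernel has no RAID support.
    [ -r "$mdstat" ] || return 2

    # Look for "[raid1]" on the Personalities line, ignoring case.
    grep -i '^Personalities' "$mdstat" | grep -qi '\[raid1]'
}

# Example use (left as a comment so that sourcing this file has no effect):
#   if has_raid1; then echo "RAID-1 driver is loaded"; fi
```

An exit status of 1 means only that the driver is not loaded at the moment; it still may be available as a module.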

We've set up RAID-1 arrays on Red Hat 7.0, 7.1 and 7.2 and Debian potato using the 2.4.4, 2.4.9, 2.4.12 and 2.4.17 kernels that include RAID support, and the 2.2.16-22 kernel with the md driver 0.90.0 patch. If you are using a kernel that does not include RAID support, you may use the RAID-patches and raidtools found at people.redhat.com/mingo.

If you're using IDE disks for your RAID devices, you should install them on different controllers. SCSI RAID setups can get away with using the same bus, but they then have a greater chance of a broken disk taking down the whole machine. Also consider using disks from different manufacturers (or at least from different lots) in your RAID array. Our initial array was created with two identical hard disks, and the disks failed within a month of each other.
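For IDE disks, the controller advice can be checked from the device names alone. The sketch below assumes the conventional Linux IDE naming, in which hda and hdb share the first channel (ide0), hdc and hdd the second (ide1), and so on. The function names ide_channel and on_separate_channels are hypothetical, and the check says nothing about SCSI devices.

```shell
#!/bin/sh
# ide_channel DEVICE
# Hypothetical helper: print the IDE channel for a device or partition name
# such as /dev/hdc2, assuming the conventional hdX naming.
ide_channel() {
    # Drop the directory and any trailing partition number: /dev/hdc2 -> hdc
    disk=$(basename "$1" | sed 's/[0-9]*$//')
    case "$disk" in
        hda|hdb) echo ide0 ;;
        hdc|hdd) echo ide1 ;;
        hde|hdf) echo ide2 ;;
        hdg|hdh) echo ide3 ;;
        *) return 1 ;;      # not an IDE name this sketch knows about
    esac
}

# on_separate_channels DEV1 DEV2
# Exit status: 0 = different channels, 1 = same channel, 2 = unknown device.
on_separate_channels() {
    a=$(ide_channel "$1") || return 2
    b=$(ide_channel "$2") || return 2
    [ "$a" != "$b" ]
}

# Example use (left as a comment so that sourcing this file has no effect):
#   on_separate_channels /dev/hda2 /dev/hdc2 && echo "different controllers"
```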

The partition that will be used for the RAID-1 array on the second disk should be about the same size as that on the first disk. It must be at least as large as the first disk's partition. If the second disk's partition is larger, the extra space will not be used by the RAID-1 device. The smallest partition determines the size of the RAID-1 device.
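The sizing rule can be written as a few lines of shell. This is an illustration only: raid1_size is a hypothetical helper, the sizes are assumed to be whole numbers in the same unit (for example, the block counts listed in /proc/partitions), and the small area the RAID code reserves on each partition for its own bookkeeping is ignored.

```shell
#!/bin/sh
# raid1_size SIZE...
# Hypothetical helper: print the approximate usable size of a RAID-1 device,
# given the sizes of its member partitions. The smallest member sets the limit.
raid1_size() {
    [ "$#" -ge 1 ] || return 1

    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}

# Example use (left as a comment so that sourcing this file has no effect):
#   raid1_size 4192933 4200000    # prints 4192933
```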

______________________

Comments


would be nice to also know how to uninstall

Anonymous

how to take a raid-1 system and return it to two separate disks?

how to unbuild what one has built?

would be nice to see this information, as well. (it's also missing from the linux software-raid howto and other similar articles online)

Useless Mirror

Anonymous

I echo the last comment. What's the use of RAID if the system still fails? I have done multiple Fedora installs attempting to RAID-1 the entire system. All installs go well and the system runs great. When a disk is removed, the system hangs. When you reboot and add the disk back, it rebuilds and works fine again. It seems some pertinent information is missing from all these articles and HOWTOs on Linux RAID.

Useless. How do we mirror

Anonymous

Useless. How do we mirror root then? Copy it to a temp location, unmount it and format???? In Solaris, metadevices can be created on the fly.

Chunk sizes, blocks sizes, groups?

warp9pnt9

Where is the discussion on the performance effect of raid chunk size, and ext2 (ext3?) filesystem properties?

RAID clarification

Srikumar

Hi,
I have read your article and need some clarification about software RAID-1.
I installed Red Hat Linux 9 using software RAID-1.
I am using only two disks, with no spares.
If one hard disk fails, I will set up another hard disk with the same partitions.
Can I join the two partitions into a single RAID device without formatting?
I also have a problem when I install some network drivers: the drivers do not install.
Where might the problem be?
Regards,
Srikumar

Try mdadm !

K M Ashraf

Try the new RAID for Linux management tool 'mdadm'

http://www.linuxdevcenter.com/pub/a/linux/2002/12/05/RAID.html

Re: RAID-1, Part 1

Anonymous

Good info I guess. LD
