RAID0 Implementation Under Linux

by Jay Munsterman

Most of us who use Linux at home don't have the same requirements as businesses that consider Linux a cost-effective, open alternative to expensive and proprietary Unices. Usually RAID devices aren't a requirement of the home user, although many users running a striped swap partition report a big improvement in speed. The multiple device (md) driver, written by Marc Zyngier, brings RAID to Linux.

md is a driver (included in the standard kernel distribution since 1.3.69) that allows you to group a number of disk partitions together so that they act as a single block device. md differs from the other drivers because it doesn't truly access the physical devices that compose it. md redirects requests from the upper layer to the devices involved and is interface independent, allowing IDE, SCSI and XT disks to be grouped as a single device.

There are three modes that md can use with its devices: linear, RAID0 and RAID1. In linear mode, the physical devices are appended to each other. When the first device reaches capacity, data is sent to the next device in the group. This mode allows for the creation of a device with a greater capacity but offers no real improvement in performance. RAID0 (or striped) devices spread the data evenly across all the devices in the group. Each write is broken into “chunks”, and the chunks are placed sequentially across the physical devices. RAID0 offers performance improvements, especially with concurrent disk access. RAID1 adds mirroring to RAID0. I feel that RAID0 is the most important of these modes; therefore, it is the focus for the remainder of this article.

Preparation

When planning your RAID0 implementation, there are two considerations to keep in mind: physical device layout and device size. If you use partitions on the same physical device, you will not see any real benefit. The best recommendation I can make is to use several SCSI disks with each partition having the same number of blocks. This seems to offer the best performance. md can deal with different size devices as long as there is a significant difference. Using a 1,000,000 block device and a 1,000,001 block device can lead to problems. If you were to create an md device with a 500MB, a 1000MB and a 1500MB partition, it would run fine; md would split the device into “stripe zones” of 500MB. Once 1500MB was written to the device, the first physical device would be full. The second stripe would then be used on the second and third device. After another 1000MB is written, all data would be placed on the last device. Performance decreases in this arrangement as disk usage increases.

Once you have set up the partitions to be used, the kernel will have to be recompiled with md support enabled. Run make config (menuconfig or xconfig) and select “Multiple Device Support” and either “Linear” or “RAID0” mode. Compile as usual. While rebooting with the new kernel, you should receive a message like this:

md driver 0.35 MAX_MD_DEV=4, MAX_REAL=8
raid0 personality registered

If it went by too fast and you think you may have missed it, use the following command:

dmesg | more
to receive a replay of the messages logged at boot time. The messages show that md version 0.35 is installed with support for up to four devices, each being made up of up to eight physical devices with RAID0 support. If you think you either will need more than md0 to md3 or will be using more than eight physical devices in an md, the md.h file must be edited prior to compilation; it is usually located in /usr/src/linux/include/linux. Change the value defined for MAX_REAL or MAX_MD_DEV to fit your requirements.

You now have md support in your kernel, or as a loadable module if you went that way. Next you need to obtain the tools to manage your md devices. Although md is supported in the kernel, it appears that most distributions don't include the tools. They are available from ftp://sweet-smoke.ufr-info-p7.ibp.fr/public/Linux or from the mirror in the U.S. at ftp://linux.nrao.edu/pub/linux/packages/MD-driver. Red Hat software has an RPM distribution available at ftp://ftp.redhat.com/pub/contrib/RPMS. The file md-035-3.i386.rpm contains the needed binaries. Once you have downloaded and unpacked the source, become root and run make install. The compilation is straightforward, and I've never had a problem with it. If your Linux source code tree is not located in /usr/src/linux, you will need to edit the Makefile; otherwise, it should compile out of the box.

Creating an MD Device

Now you're ready to actually create a RAID0 device. The compilation created several tools for the task: mdadd, mdrun and mdstop. mdadd is used to add block devices to an md device. If you want to use sda1, sdb1 and sdc1, you issue the command:

/sbin/mdadd /dev/md0 /dev/sda1 /dev/sdb1 \
        /dev/sdc1

This command adds sda1, sdb1 and sdc1 to md0. This same result can also be accomplished by giving these commands:

/sbin/mdadd /dev/md0 /dev/sda1
/sbin/mdadd /dev/md0 /dev/sdb1
/sbin/mdadd /dev/md0 /dev/sdc1
Remember that the order in which the devices are added is significant. If you change the order, any data previously written will be lost. I recommend adding the devices in what seems like a logical order and then sticking to it.

Now we must start the device. mdrun has the following command syntax:

/sbin/mdrun -p

where x indicates the mode: -l for linear, 0 for RAID0 and 1 for RAID1. To start the device we just made, the command would be:

/sbin/mdrun -p0 /dev/md0
When using RAID devices, another option you can use is -cnk to specify chunk size, where n is the chunk size in KB (n must be a power of two). For example, -c6k indicates a 6KB chunk size. The default value is the value of your PAGE_SIZE. The best value for chunk size would be the average request size, so chances are two requests will write to different physical disks. If you plan to use the md for swap space, stick with the default.

Once the device is running, you can create a file system and mount it. For example:

/sbin/mkfs.ext2 /dev/md0
mount /dev/md0 /var/spool/news

This will create an ext2 file system and then mount it as the news spool. Your RAID0 device is now ready for data. To check its status, type:

cat /proc/mdstat
and receive the following output:
Personalities : [2 raid0]
read_ahead 120 sectors
md0 : active raid0 sda1 sdb1 sdc1 168588 blocks 4k chunks
md1 : inactive
md2 : inactive
md3 : inactive
This report tells you which modes are supported, the current read_ahead value, the state of each md device, its mode, physical parts, total size and chunk size.
Managing Your MD Device

At this point we have our RAID device running and mounted; as soon as the machine is rebooted, we will have to rerun mdadd, mdrun and mount. All of this can easily be added to your rc.local file, but there is a better way. mdcreate automatically creates an /etc/mdtab file. The mdtab file serves a function similar to the /etc/fstab file, informing the system of the component devices, modes and mount points. The syntax is:

mdcreate [-cxk] mode md_dev dev0 dev1 ...

To create an mdtab file for our example device we would use:

/sbin/mdcreate raid0 /dev/md0 /dev/sda1\
        /dev/sdb1 /dev/sdc1
cat /etc/mdtab
# mdtab entry for /dev/md0:
# /dev/md0  raid0,4k,0,fe8a9ffb  /dev/sda1 /dev/sdb1 /dev/sdc1
With this file in place, we can reduce the mdadd command to mdadd -a or mdadd -ar to automatically add the devices and run them. This also ensures that the devices will always be added in the correct order.

If there is ever a need to stop the device, first unmount it and then use mdstop. mdstop will free the physical devices and flush the buffers. For our example device, we would first stop the news server if it was running with the command:

/sbin/mdstop /dev/md0

Then, we could unmount it using:

umount /var/spool/news
md0 is now inactive, and the physical partitions can be used elsewhere. Remember, if the device is stopped, none of the data that was written to the md device is accessible.

With md, the implementation and management of RAID devices is made easy. As development continues, we will see RAID1 and the tools necessary for mirror management and recovery. To stay current on the development process, join the Linux-raid mailing list. To subscribe send an email to Majordomo@vger.rutgers.edu with a one line body that says:

subscribe linux-raid <

Be sure to look at the documentation that comes with the md package. It's tools like this one that are helping Linux find a place in the business world.

MD at Work

Jay Munsterman has just relocated to Atlanta, GA from Washington DC, where he works with a variety of Unix platforms, Linux being his favorite. In his spare time he likes to spend time with his soon-to-be wife, Denessa, and their dog Melman. Jay can be reached at jmunster@mindspring.com.

Load Disqus comments

Firstwave Cloud