Increase Performance, Reliability and Capacity with Software RAID
In the late 1980s, processing power and memory performance were increasing by more than 40% each year. However, due to mechanical limitations, hard drive performance was not able to keep up. To prepare for a “pending I/O crisis”, some researchers at Berkeley proposed a solution called “Redundant Arrays of Inexpensive Disks”. The basic idea was to combine several drives so they appear as one larger, faster and/or more-reliable drive. RAID was, and still is, an effective way for working around the limitations of individual disk drives. Although RAID is typically implemented using hardware, Linux software RAID has become an excellent alternative. It does not require any expensive hardware or special drivers, and its performance is on par with high-end RAID controllers. Software RAID will work with any block device and supports nearly all levels of RAID, including its own unique RAID level (see the RAID Levels sidebar).
Most Linux distributions have built-in support for software RAID. This article uses the server edition of Ubuntu 8.04 (Hardy). Run the following commands as root to install the software RAID management tool (mdadm) and load the RAID kernel module:
# apt-get install mdadm # cat /proc/mdstat
Once you create an array, /proc/mdstat will show you many details about your software RAID configuration. Right now, you just want to make sure it exists to confirm that everything is working.
Many people like to add a couple drives to their computer for extra file storage, and mirroring (RAID 1) is an excellent way to protect that data. Here, you are going to create a RAID 1 array using two additional disks, /dev/sda and /dev/sdb.
Before you can create your first RAID array, you need to partition your disks. Use fdisk to create one partition on /dev/sda, and set its type to “Linux RAID autodetect”. If you are just testing RAID, you can create a smaller partition, so the creation process does not take as long:
# fdisk /dev/sda > n > p > 1 > <RETURN> > <RETURN> > t > fd > w
Now, you need to create an identical partition on /dev/sdb. You could create the partition manually using fdisk, but it's easier to copy it using sfdisk. This is especially true if you are creating an array using more than two disks. Use sfdisk to copy the partition table from /dev/sda to /dev/sdb, then verify that the partition tables are identical:
# sfdisk -d /dev/sda | sfdisk /dev/sdb # fdisk -l
Now, you can use your newly created partitions (/dev/sda1 and /dev/sdb1) to create a RAID 1 array called /dev/md0. The md stands for multiple disks, and /dev/mdX is the standard naming convention for software RAID devices. Use the following command to create the /dev/md0 array from /dev/sda1 and /dev/sdb1:
# mdadm -C /dev/md0 -l1 -n2 /dev/sda1 /dev/sdb1
When an array is first created, it automatically will begin initializing (also known as rebuilding). In your case, that means making sure the two drives are in sync. This process takes a long time, and it varies based on the size and type of array you created. The /proc/mdstat file will show you the progress and provide an estimated time of completion. Use the following command to verify that the array was created and to monitor its progress:
# watch cat /proc/mdstat # ctrl-c to exit
It is safe to use the array while it's rebuilding, so go ahead and create the filesystem and mount the drive. Don't forget to add /dev/md0 to your /etc/fstab file, so the array will be mounted automatically when the system boots:
# mkfs.ext3 /dev/md0 # mkdir /mnt/md0 # mount /dev/md0 /mnt/md0 # echo "/dev/md0 /mnt/md0 ext3 defaults 0 2" >> /etc/fstab
Once the array is finished rebuilding, you need to add it to the mdadm configuration file. This will make it easier to manage the array in the future. Each time you create or modify an array, update the mdadm configuration file using the following command:
# mdadm --detail --scan >> /etc/mdadm/mdadm.conf # cat /etc/mdadm/mdadm.conf
That's it. You successfully created a two-disk RAID 1 array using software RAID.
The entire point of a RAID 1 array is to protect against a drive failure, so you are going to simulate a drive failure for /dev/sdb and rebuild the array. To do this, mark the drive as failed, and then remove it from the array. If the drive actually failed, the kernel automatically would mark the drive as failed. However, it is up to you to remove the disk from the array before replacing it. Run the following commands to fail and remove the drive:
# mdadm /dev/md0 -f /dev/sdb1 # cat /proc/mdstat # mdadm /dev/md0 -r /dev/sdb1 # cat /proc/mdstat
Notice how /dev/sdb is no longer part of the array, yet the array is functional and all your data is still there. It is safe to continue using the array as long as /dev/sda does not fail. You now are free to shut down the system and replace /dev/sdb when it's convenient. In this case, pretend you did just that. Now that your new drive is in the system, format it and add it to the array:
# sfdisk -d /dev/sda | sfdisk /dev/sdb # mdadm /dev/md0 -a /dev/sdb1 # watch cat /proc/mdstat
The array automatically will begin rebuilding itself, and /proc/mdstat should indicate how long that process will take.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- SUSE LLC's SUSE Manager
- My +1 Sword of Productivity
- Non-Linux FOSS: Caffeine!
- Managing Linux Using Puppet
- Control Your Linux Desktop with D-Bus
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Doing for User Space What We Did for Kernel Space
- SuperTuxKart 0.9.2 Released
- Google's SwiftShader Released
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide