Increase Performance, Reliability and Capacity with Software RAID
Replace /dev/sda1 with /dev/md0.
Replace /dev/sda2 with /dev/md1.
Make sure GRUB is installed properly on both disks, and reboot the server:
# grub > device (hd0) /dev/sda > root (hd0,0) > setup (hd0) > device (hd0) /dev/sdb > root (hd0,0) > setup (hd0) > quit # reboot
When your server comes back on-line, it will be running on a RAID 1 array with only one drive in the array. To complete the process, you need to repartition the first drive, add it to the array, and make a few changes to GRUB. Make sure your server is functioning normally and all your data is intact before proceeding. If not, you run the risk of losing data when you repartition your disk or rebuild the array.
Use sfdisk to repartition the first drive to match the second drive. The --no-reread option is needed; otherwise, sfdisk will complain about not being able to reload the partition table and fail to run:
# sfdisk -d /dev/sdb | sfdisk /dev/sda --no-reread
Now that your first drive has the correct partition types, add it to both arrays. The arrays will start the rebuild process automatically, which you can monitor with /proc/mdstat:
# mdadm /dev/md0 -a /dev/sda1 # mdadm /dev/md1 -a /dev/sda2 # watch cat /proc/mdstat
Once the arrays have completed rebuilding, you safely can reconfigure GRUB to boot from the first drive. Although it is not required, you can reboot to make sure your server still will boot from the first drive:
# vim /boot/grub/menu.lst
Next, copy first kernel entry and change (hd1,0) to (hd0,0). Then:
That completes the process. Your server should be running on a RAID 1 array protecting your data from a drive failure.
As you can see, Linux software RAID is very flexible and easy to use. It can protect your data, increase server performance and provide additional capacity for storing data. Software RAID is a high-performance, low-cost alternative to hardware RAID and is a viable option for both desktops and servers.
RAID is extremely versatile and uses a variety of techniques to increase capacity, performance and reliability as well as reduce cost. Unfortunately, you can't quite have all those at once, which is why there are many different implementations of RAID. Each implementation, or level, is designed for different needs.
Mirroring (RAID 1):
Mirroring creates an identical copy of the data on each disk in the array. The array has the capacity of only a single disk, but the array will remain functional as long as at least one drive is still good. Read and write performance is the same or slightly slower than a single drive. However, read performance will increase if there are multiple read requests at the same time, because each drive in the array can handle a read request (parallel reads). Mirroring offers the highest level of reliability.
Striping (RAID 0):
Striping spreads the data evenly over all disks. This makes reads and writes very fast, because it uses all the disks at the same time, and the capacity of the array is equal to the total capacity of all the drives in the array. However, striping does not offer any redundancy, and the array will fail if it loses a single drive. Striping provides the best performance and capacity.
Striping with Parity (RAID 4, 5, 6):
In addition to striping, these levels add information that is used to restore any missing data after a drive failure. The additional information is called parity, and it is computed from the actual data. RAID 4 stores all the parity on a single disk; RAID 5 stripes the parity across all the disks, and RAID 6 stripes two different types of parity. Each copy of parity uses one disk worth of capacity in the array, so RAID 4 and 5 have the capacity of N-1 drives, and RAID 6 has the capacity of N-2 drives. RAID 4 and 5 require at least three drives and can lose any single disk in the array; whereas RAID 6 requires at least four drives and can withstand losing up to two disks. Write performance is relatively slow because of the parity calculations, but read performance is usually very fast as the data is striped across all of the drives. These RAID levels provide a nice blend of capacity, reliability and performance.
Nesting RAID Levels (RAID 10):
It is possible to create an array where the “drives” actually are other arrays. One common example is RAID 10, which is a RAID 0 array created from multiple RAID 1 arrays. Like RAID 5, it offers a nice blend of capacity, reliability and performance. However, RAID 10 gives up some capacity for increased performance and reliability, and it requires a minimum of four drives. The capacity of RAID 10 is half the total capacity of all the drives, because it uses a mirror for each “drive” in the stripe. It offers more reliability, because you can lose up to half the disks as long as you don't lose both drives in one of the mirrors. Read performance is excellent, because it uses striping and parallel reads. Write performance also is very good, because it uses striping and does not have to calculate parity like RAID 5. This RAID level typically is used for database servers because of its performance and reliability.
MD RAID 10:
This is a special RAID implementation that is unique to Linux software RAID. It combines striping and mirroring and is very similar to RAID 10 with regard to capacity, performance and reliability. However, it will work with two or more drives (including odd numbers of drives) and is managed as a single array. In addition, it has a mode (raid10,f2) that offers RAID 0 performance for reads and very fast random writes.
Spanning (Linear or JBOD):
Although spanning is not technically RAID, it is available on nearly every RAID card and software RAID implementation. Spanning appends disks together, so the data fills up one disk then moves on to the next. It does not increase reliability or performance, but it does increase capacity, and it typically is used to create a larger drive out of a bunch of smaller drives of varying sizes.
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
|Introduction to MapReduce with Hadoop on Linux||Jun 05, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Linux Systems Administrator
- Introduction to MapReduce with Hadoop on Linux
- RSS Feeds
- Validate an E-Mail Address with PHP, the Right Way
- New Products
- Weechat, Irssi's Little Brother
- Tech Tip: Really Simple HTTP Server with Python
- Poul-Henning Kamp: welcome to
54 min 25 sec ago
- This has already been done
55 min 25 sec ago
- Reply to comment | Linux Journal
1 hour 40 min ago
- Welcome to 1998
2 hours 29 min ago
- notifier shortcomings
2 hours 52 min ago
4 hours 29 min ago
- Android User
4 hours 31 min ago
- Reply to comment | Linux Journal
6 hours 24 min ago
9 hours 13 min ago
- This is a good post. This
14 hours 26 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?