Increase Performance, Reliability and Capacity with Software RAID
Replace /dev/sda1 with /dev/md0.
Replace /dev/sda2 with /dev/md1.
Make sure GRUB is installed properly on both disks, and reboot the server:
# grub > device (hd0) /dev/sda > root (hd0,0) > setup (hd0) > device (hd0) /dev/sdb > root (hd0,0) > setup (hd0) > quit # reboot
When your server comes back on-line, it will be running on a RAID 1 array with only one drive in the array. To complete the process, you need to repartition the first drive, add it to the array, and make a few changes to GRUB. Make sure your server is functioning normally and all your data is intact before proceeding. If not, you run the risk of losing data when you repartition your disk or rebuild the array.
Use sfdisk to repartition the first drive to match the second drive. The --no-reread option is needed; otherwise, sfdisk will complain about not being able to reload the partition table and fail to run:
# sfdisk -d /dev/sdb | sfdisk /dev/sda --no-reread
Now that your first drive has the correct partition types, add it to both arrays. The arrays will start the rebuild process automatically, which you can monitor with /proc/mdstat:
# mdadm /dev/md0 -a /dev/sda1 # mdadm /dev/md1 -a /dev/sda2 # watch cat /proc/mdstat
Once the arrays have completed rebuilding, you safely can reconfigure GRUB to boot from the first drive. Although it is not required, you can reboot to make sure your server still will boot from the first drive:
# vim /boot/grub/menu.lst
Next, copy first kernel entry and change (hd1,0) to (hd0,0). Then:
That completes the process. Your server should be running on a RAID 1 array protecting your data from a drive failure.
As you can see, Linux software RAID is very flexible and easy to use. It can protect your data, increase server performance and provide additional capacity for storing data. Software RAID is a high-performance, low-cost alternative to hardware RAID and is a viable option for both desktops and servers.
RAID is extremely versatile and uses a variety of techniques to increase capacity, performance and reliability as well as reduce cost. Unfortunately, you can't quite have all those at once, which is why there are many different implementations of RAID. Each implementation, or level, is designed for different needs.
Mirroring (RAID 1):
Mirroring creates an identical copy of the data on each disk in the array. The array has the capacity of only a single disk, but the array will remain functional as long as at least one drive is still good. Read and write performance is the same or slightly slower than a single drive. However, read performance will increase if there are multiple read requests at the same time, because each drive in the array can handle a read request (parallel reads). Mirroring offers the highest level of reliability.
Striping (RAID 0):
Striping spreads the data evenly over all disks. This makes reads and writes very fast, because it uses all the disks at the same time, and the capacity of the array is equal to the total capacity of all the drives in the array. However, striping does not offer any redundancy, and the array will fail if it loses a single drive. Striping provides the best performance and capacity.
Striping with Parity (RAID 4, 5, 6):
In addition to striping, these levels add information that is used to restore any missing data after a drive failure. The additional information is called parity, and it is computed from the actual data. RAID 4 stores all the parity on a single disk; RAID 5 stripes the parity across all the disks, and RAID 6 stripes two different types of parity. Each copy of parity uses one disk worth of capacity in the array, so RAID 4 and 5 have the capacity of N-1 drives, and RAID 6 has the capacity of N-2 drives. RAID 4 and 5 require at least three drives and can lose any single disk in the array; whereas RAID 6 requires at least four drives and can withstand losing up to two disks. Write performance is relatively slow because of the parity calculations, but read performance is usually very fast as the data is striped across all of the drives. These RAID levels provide a nice blend of capacity, reliability and performance.
Nesting RAID Levels (RAID 10):
It is possible to create an array where the “drives” actually are other arrays. One common example is RAID 10, which is a RAID 0 array created from multiple RAID 1 arrays. Like RAID 5, it offers a nice blend of capacity, reliability and performance. However, RAID 10 gives up some capacity for increased performance and reliability, and it requires a minimum of four drives. The capacity of RAID 10 is half the total capacity of all the drives, because it uses a mirror for each “drive” in the stripe. It offers more reliability, because you can lose up to half the disks as long as you don't lose both drives in one of the mirrors. Read performance is excellent, because it uses striping and parallel reads. Write performance also is very good, because it uses striping and does not have to calculate parity like RAID 5. This RAID level typically is used for database servers because of its performance and reliability.
MD RAID 10:
This is a special RAID implementation that is unique to Linux software RAID. It combines striping and mirroring and is very similar to RAID 10 with regard to capacity, performance and reliability. However, it will work with two or more drives (including odd numbers of drives) and is managed as a single array. In addition, it has a mode (raid10,f2) that offers RAID 0 performance for reads and very fast random writes.
Spanning (Linear or JBOD):
Although spanning is not technically RAID, it is available on nearly every RAID card and software RAID implementation. Spanning appends disks together, so the data fills up one disk then moves on to the next. It does not increase reliability or performance, but it does increase capacity, and it typically is used to create a larger drive out of a bunch of smaller drives of varying sizes.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- Managing Linux Using Puppet
- Linux in Government: GNU/Linux Clears Procurement Hurdles
- Tech Tip: Really Simple HTTP Server with Python
- Google's SwiftShader Released
- SUSE LLC's SUSE Manager
- Linux In Government: Interoperability
- Linux in Government: Winning in the Big Enterprise Space
- Linux in Government: Open Source Innovation within the DoD
- Linux in Government: An Interview with John Weathersby of OSSI
- Convert Filenames to Lowercase
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide