Mirrordir Can Make You a Hometown Hero!
What is the least attractive task of a system administrator's job? For most of us, it's performing backups of our data. Something about backups leaves us yawning and wishing for something far more exciting to do. Even with the advent of ATLs and backup software automation, we still need to watch log files and test the backup media occasionally by restoring files to assure the data is intact. Yep, backups are a necessary pain--but what if we could supplement our work with an extra layer of protection that, when disaster does strike, allows us to recover date much faster than if we went back to the tape?
If you haven't heard of the mirrordir utility, do yourself a favor and check it out. For one thing, it's extremely simple to use, but it's also powerful. It is a tool no system administrator should be without.
Let's take a look at what mirrordir can do for us. First, if you haven't already done so, download mirrordir from mirrordir.sourceforge.net and choose from either the RPM or tar version. I prefer to use RPMs when they are available, so let's go ahead and install it as the root user, from the directory you downloaded it to, with rpm -ihv mirrordir*.rpm. Provided no dependency issues halted the installation (as of yet, I have found none with either Red Hat 7.2 or 8.0 installations), we can proceed.
What you choose to do with mirrordir depends on you. It's purpose in life is to make an exact mirror of a directory or directories and copy all data to a safe mirrored location. This can be a location on the same disk, on the same server on a separate disk, on an FTP site or on any other network storage site. Obviously, mirroring data to the same disk is not the best idea should the disk itself fail, but it can be done if you desire.
For our example, we will set up mirrordir to copy the exact contents from one external Compaq SCSI RAID cabinet to another identical external cabinet (like to like). Also, we will write a simple script to automate the task of keeping the data up-to-date and add this script to our weekly cron jobs (or daily jobs, if you prefer). Each external RAID cabinet, /dev/sda and /dev/sdb, houses six SCSI drives set up as a RAID-5 array. Once formatted as an ext3 filesystem, we have approximately 90GB of storage per cabinet.
We begin by creating a mountpoint for the external cabinets. As the root user, create a directory called ext_raid and another called mirror (or whatever you wish to call them). Next set the permissions on these mountpoints as follows, chmod -R 755. Once the mountpoints are in place and the permissions are set up as you like, do a dry run with mirrordir. First, assure that the two external SCSI arrays are mounted by issuing the mount -a command (which assumes that the external arrays have been added to the /etc/fstab file for mounting). Once both devices are mounted, run a mirrordir test:
mirrordir -v --dry-run /dev/sda /mirror
Pretty simple command line, huh? Don't be fooled though, as this is a very useful program. In our first example we used the dry-run switch, which simply lists the output of what would have been done. Before actually running mirrordir on your data, familiarize yourself with the man page listing all the available options. And, of course, have a current (yep, here it comes) backup available in case of an unforeseen booboo.
A word of caution here: be very careful with the order of the parameters you give to mirrordir. Should you reverse the order, you could replace data you are trying to make a copy of with data that exists on the target device--or you could delete everything on the source. This is one reason why you should do a dry-run first and examine all of the output mirrordir produces. That way, you are sure you are sending the correct data to the correct location. How are your backups?
Now let's run mirrordir for real. Copy the source data from one external RAID cabinet, called ext_raid, to a second cabinet, called mirror. This data consists of tons of technical documents, MP3s, applications, utilities and video from my SnapStream PVS server. Mirrordir will then make an exact copy of all my directories and data on the target RAID array (/mirror), maintaining the original security settings for the files and directories.
From this point forward, mirrordir needs only to append any changes to the data rather than do a full copy each time it is run. This feature is especially nice if you are mirroring over an FTP link across the network. Again, be sure to check out all of the cool options available in mirrordir. To name a few, you can use gzip to compress the data and create a tarball of the data on the target destination.
Now that we know mirrordir is working as promised, we can write a simple script to mount the target drive array, mirror our data to it, then unmount the array. This way the mirror disks are off-line and cannot be accidentally erased or modified. Open up your favorite editor and add something similar to this:
#!/bin/sh # This is mirror.sh - copy data from ext_raid cabinet to the mirror cabinet. # # this is mirrordir.sh: mirror of /ext_raid to another external raid cabinets # we assume that the source /ext_raid is already mounted, so we just mount the mirror /bin/mount /dev/sdb /mirror mirrordir /ext_raid /mirror /bin/umount /mirror
Next save the script and run chmod to make it executable by issuing the following:
chmod u+x mirror.sh
Now you can add the script to an existing cron job by dropping the script into one of the cron directories in /etc/cron.d (daily, weekly and so on).
So what about recovering from pooched data or a dead storage device? Once your target device is back up and running, simply reverse the previous mirrordir command:
mount /dev/sdb /mirror mirrordir /mirror /ext_mirror
Watch and smile as an exact copy of your data is fully restored to your (now repaired) original RAID cabinet or to the appropriate target destination. Of course, you also can point users to the /mirror cabinet, until the failed device is ready. Do I smell another script here?.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.
Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.
Sponsored by ActiveState
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?
| Non-Linux FOSS: libnotify, OS X Style | Jun 18, 2013 |
| Containers—Not Virtual Machines—Are the Future Cloud | Jun 17, 2013 |
| Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer | Jun 12, 2013 |
| Weechat, Irssi's Little Brother | Jun 11, 2013 |
| One Tail Just Isn't Enough | Jun 07, 2013 |
| Introduction to MapReduce with Hadoop on Linux | Jun 05, 2013 |
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Linux Systems Administrator
- Validate an E-Mail Address with PHP, the Right Way
- Introduction to MapReduce with Hadoop on Linux
- RSS Feeds
- Weechat, Irssi's Little Brother
- New Products
- Tech Tip: Really Simple HTTP Server with Python
- Reply to comment | Linux Journal
20 min 8 sec ago - Didn't read
30 min 28 sec ago - Reply to comment | Linux Journal
35 min 28 sec ago - Poul-Henning Kamp: welcome to
2 hours 45 min ago - This has already been done
2 hours 46 min ago - Reply to comment | Linux Journal
3 hours 31 min ago - Welcome to 1998
4 hours 20 min ago - notifier shortcomings
4 hours 43 min ago - heroku?
6 hours 20 min ago - Android User
6 hours 22 min ago



Comments
Re: Mirrordir Can Make You a Hometown Hero!
Have you looked at rsync?
http://rsync.samba.org
The hardlink trick makes hourly/daily backups trivial, fast and space efficient. http://www.mikerubel.org/computers/rsync_snapshots/