The Skinny on Backups and Data Recovery, Part 2
This week, I will go out on a limb and make a gross generalization. If you're ready, here it is. An amazing number of Linux users and admins out there have a CD-rewritable (CD-RW) drive and no tape drive. You happily burn CDs to make collections of your favourite tunes, but have you thought about using that same drive to create backups of your data?
A quick step back. Here's the scoop. There are ways to do backups right, but I want to start from worst-case scenario to best case. Meaning, soon we will be talking about tape drives. Some people out there might argue that "tape drives" sounds like an archaic way to deal with data storage. I'll answer that challenge next week. For now, I'm going to pretend you have a CD-recordable unit on your system, and show you how to use it for backups.
The question of whether a CD-RW is a good backup choice is sometimes settled in this way. You can afford either a tape drive (sometimes more expensive than the CD-RW) or the CD-RW, but not both. When our machines get used for both business and pleasure, as is often the case with home offices, we tend to lean in the direction of "I want both".
Making collections of favorite songs and burning extra Slackware or Debian CDs is usually done with something called cdrecord. cdrecord is not specifically Linux software, and will compile and run on a number of different platforms. In case you don't already have cdrecord (maybe you just got a CD-RW last night), you can pick up the latest copy at this address:
The other thing you will need if you want to use your drive is a little package called mkisofs. Luckily, the latest version of cdrecord already includes mkisofs, but it is still available at this address:
I'm going to work on the premise that the reason we are having this discussion is you already have a working CD writer setup. You got a good deal; the writer makes great song collections, and now you think using that same device for backups is a great idea.
The problem with doing backups using CD writers and re-writers is that they were never intended for that. A raw image, based on a mirror of the data you intend to capture, is first written to disk. You then burn that image to your CD or CD-RW. Essentially, you need free space equal to twice the amount of data you are trying to back up: once for the mirror and once for the image. In the case of a full CD at something like 650 meg, you must have 1.3 gigabytes of space. It's not necessarily the friendliest way to do backups (and certainly not the space-friendliest). Luckily, there's a way to cut the necessary space in half and still get your backups done. With a sufficiently fast system, you can simply pass the iso9660 image data that is being created directly to the cdrecord command. That means we need room only for the backup tree, not for the tree plus an ISO image to burn onto the CD.
One way to do this is to beef up last week's identity backup script. If you remember, we created a temporary directory with a hierarchical "mirror" of our important data, then backed up that smaller mirror to a diskette. With that script, I gave you a very small list of files and suggested that your choice of what's "important" may be different from mine. After all, you can fit roughly 1.4 meg on a single diskette, and my example identity backup used only 37K.
With the CD-RW, we can increase that size to roughly 650 megabytes, which may be all you need for the things that change day to day. Remember the catch (or half-catch, now): you still need that spare 650 megabytes into which you can recreate the structure you want to back up. We don't get away that easy. If you plan on backing up only 300 meg of data, then you'll need the 300 meg of space. The reason it's good to remember last week's example is that it is a micro-example of what we are about to do here.
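Before building the mirror, it may be worth confirming that the spare space really is there. Here is a minimal sketch of such a check. The function names free_kb and has_room_for are my own inventions for illustration and are not part of the backup script; the sketch assumes a df that supports the POSIX -P and -k options.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original backup script).

# Print the free space, in kilobytes, on the filesystem holding a path.
free_kb() {
    # df -P forces the one-line-per-filesystem POSIX format;
    # the fourth field of the second line is the "Available" column.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least need_kb kilobytes are free under target.
has_room_for() {
    local need_kb=$1
    local target=$2
    local avail_kb
    avail_kb=$(free_kb "$target")
    [ -n "$avail_kb" ] && [ "$need_kb" -le "$avail_kb" ]
}

# Example use (paths are assumptions; adjust to your own layout):
#   need=$(du -sk /home | awk '{ print $1 }')
#   has_room_for "$need" /mnt/data1 || echo "Not enough room for the mirror"
```

The check is deliberately conservative: it says nothing about how well the data will pack into an ISO image, only whether the raw copy has somewhere to land.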
We start by creating our mirror. In this case, it is a directory called /mnt/data1/data_backup. On my system, /mnt/data1 is a separate drive with a fair amount of free space. It's where I keep images of Linux CDs, which I then push on my friends in an effort to get them off that other operating system. On your system, the mirror will most likely be in a different location. Just make sure the space is available. Here is that script.
#!/bin/bash
# script name : backup_to_cd
# This script does a backup of important files onto the CD-RW
# Marcel Gagne, 2000
#
# NOTE: my "data mirror" is /mnt/data1/data_backup
#
echo "Starting by Blanking the data_backup area"
rm -rf /mnt/data1/data_backup
echo "Recreating the data_backup mirror ..."
mkdir /mnt/data1/data_backup
mkdir /mnt/data1/data_backup/usr
mkdir /mnt/data1/data_backup/etc
#
echo "Backing up to data1 disk mirror area ..."
cd /
find home -print | cpio -pduvm /mnt/data1/data_backup
find root -print | cpio -pduvm /mnt/data1/data_backup
find usr/local -print | cpio -pduvm /mnt/data1/data_backup/usr
#
echo "Backing up system identity."
cd /etc
for ident_names in passwd group shadow profile bashrc sendmail.cw sendmail.cf hosts hosts.allow hosts.deny named.conf named.boot aliases
do
    cp -v $ident_names /mnt/data1/data_backup/etc
done
find nsdata -print | cpio -pduvm /mnt/data1/data_backup/etc
find sysconfig -print | cpio -pduvm /mnt/data1/data_backup/etc
find mgetty+sendfax -print | cpio -pduvm /mnt/data1/data_backup/etc
#
echo "All files saved. Ready to begin CD copy."
echo "Shall I blank the CD first?"
read the_answer
#
cdrecord -blank=fast dev=3,0
#
echo "Shall I start the CD burn now?"
read the_answer
#
mkisofs -R /mnt/data1/data_backup | cdrecord -v dev=3,0 -
Notice that in my /mnt/data1/data_backup mirror, I am capturing /home, /usr/local and /root, none of which I was paying much attention to with my original identity_backup script. After creating our mirror, we immediately burn the data to our CD.
Yes? Ah. The reader in the back has a good point. I'm not really doing anything at those prompts for blanking and copying the CD (where it says read the_answer) other than pausing. Since the amount of data in my mirror can be pretty dynamic, not to mention downright huge, I want an opportunity to do a du -sk /mnt/data1/data_backup to verify that I'm staying within that 650 meg limit.
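If you would rather have the script do that du check for you, something along these lines might serve. Treat it as a sketch only: the names CD_LIMIT_KB, mirror_size_kb and fits_on_cd are hypothetical, and the 650 meg figure is simply the limit discussed above, not something read from the disc itself.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original backup script).

# Assumed capacity of the disc, in kilobytes (650 MB).
CD_LIMIT_KB=$((650 * 1024))

# Print the size of a directory tree in kilobytes.
mirror_size_kb() {
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number
    du -sk "$1" | awk '{ print $1 }'
}

# Return 0 if the tree fits under the limit, 1 if it is too big,
# and 2 if the directory does not exist.
fits_on_cd() {
    local dir=$1
    local limit_kb=${2:-$CD_LIMIT_KB}
    local size_kb
    [ -d "$dir" ] || return 2
    size_kb=$(mirror_size_kb "$dir")
    [ "$size_kb" -le "$limit_kb" ]
}

# Example use with the mirror from the script:
#   fits_on_cd /mnt/data1/data_backup || echo "Mirror is too big for one CD"
```

Keep in mind that du measures the files as they sit on disk, so the finished ISO image will not be exactly the same size. Leave yourself a little headroom.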
Back to the script. Since I am using a re-writable CD, I blank my CDs before starting. This is done with the line
cdrecord -blank=fast dev=3,0
I use the -blank=fast option to quickly erase the table of contents from the disk. You have the option of blanking the entire disk, but that can take a long time.
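Since the prompts in the script only pause, the disc gets blanked no matter what you type. If you would like the answer to count, one possible approach is a small yes/no helper like the one below. The function name confirm is my own and purely illustrative; the cdrecord line in the usage comment is the same one the script already uses.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original backup script).

# Ask a yes/no question. Succeeds only when the answer starts with
# y or Y; anything else, including a bare Enter, counts as "no".
confirm() {
    local answer
    read -r -p "$1 [y/N] " answer
    case $answer in
        [yY]*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example use in place of the bare "read the_answer":
#   if confirm "Shall I blank the CD first?"; then
#       cdrecord -blank=fast dev=3,0
#   fi
```

Defaulting to "no" is a deliberate choice here: erasing a disc is the kind of thing you want to happen only when you clearly asked for it.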
The real magic happens at the end of the script with "mkisofs" and "cdrecord". The -R option on mkisofs means I want the Rock Ridge extensions to be used. In other words, I get a UNIX file system with user and group information, long filename support, etc. The trailing - on the cdrecord command tells it to read the image from standard input, which is what lets the pipe feed the output of mkisofs straight to the burner. That's about it. One other thing, though: just as in our last example, this script is meant to be a jumping-off point for your own CD backup. What I consider important in my backup may differ wildly from yours.
For the curious, here is the normal chain of events in creating a CD. You would have mkisofs write out an ISO9660 image, which would then get recorded on the CD. The final backup onto the CD would then happen in two passes. For instance, the commands would be:
# First we create the image based on a previously done "mirror" backup
mkisofs -o /another_dir/image.iso -R /mnt/data1/data_backup
# Now, we record the image to CD
cdrecord -v dev=3,0 /another_dir/image.iso
Incidentally, if you have a few ISO images hanging around on your disk (spare Linux CDs for your friends, etc.), you can mount those images and navigate them as you would any CD filesystem. Here's how. Pretend your debian.iso image is sitting out there on your disk, and you want to look at it. First, you create a mount point (mkdir /mnt/debdist). Next, using this command, you can mount the image:
mount debian.iso -r -t iso9660 -o loop /mnt/debdist
The -r option means read-only. As you can see, it is very much like a CD filesystem.
Next time, I'll take up that challenge from earlier, and show you why tapes are still where it's at, and just how flexible those "old-fashioned" devices can be. On that note, we wrap up another week here at the corner. Until we chat again, remember that when you're down, only a good backup will get you back up.