Scary Backup Stories
Backups. We all know the importance of making a backup of our most important systems. Unfortunately, some of us also know that the importance of performing backups is often a lesson learned the hard way. Everyone has their scary backup stories. Here are mine.
Like a lot of people, I started my professional career in technical support. In my case, I was part of a help-desk team for a large professional practice. Among other things, we were responsible for performing PC LAN backups for a number of systems used by other departments. For one especially important system, we acquired fancy new tape-backup equipment and a large collection of tapes. A procedure was put in place, and before-you-go-home-at-night backups became standard. Some months later, a crash brought down the system, and all the data was lost. Shortly thereafter, a call came in for the latest backup tape. It was located and dispatched, and a recovery was attempted. The recovery failed, however, because the tape was blank. A call came in for the next-to-last backup tape. It was nervously located and dispatched, and a recovery was attempted. That failed too, because this tape was also blank. Amid long silences and pink-slip glares, panic started to set in as the tape from three nights prior was called up. This attempt resulted in a lot of shouting.
All the tapes were then checked, and they were all blank. To add insult to injury, the problem wasn't only that the tapes were blank--they weren't even formatted! The fancy new backup equipment wasn't smart enough to realize the tapes were not formatted, so it allowed them to be used. Note: writing good data to an unformatted tape is never a good idea.
Now, don't get me wrong, the backup procedures themselves were good. The problem was that no one had ever tested the whole process--no one had ever attempted a recovery. Was it any wonder, then, that each recovery failed?
For backups to work, you need to do two things: (1) define and implement a good procedure and (2) test that it works.
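Step (2) can be as simple as a periodic trial recovery. What follows is a minimal sketch, not the procedure from the story: it assumes tar as the backup tool and uses throwaway temporary paths (src, restore and archive are hypothetical names) in place of a real data directory and tape device.

```bash
#!/bin/bash
# Hypothetical restore drill. Every path is a temporary stand-in:
# src for the data directory, archive for the tape, restore for scratch space.
src=$(mktemp -d)
restore=$(mktemp -d)
archive=$(mktemp)

echo "payroll records" > "$src/ledger.txt"
mkdir "$src/reports"
echo "q3 summary" > "$src/reports/q3.txt"

# Step 1: the backup procedure.
tar -cf "$archive" -C "$src" .

# Step 2: the part nobody did in the story: actually recover the data.
tar -xf "$archive" -C "$restore"

# diff -r exits 0 only when both trees hold identical files.
if diff -r "$src" "$restore" > /dev/null; then
    restore_status="restore OK"
else
    restore_status="restore FAILED"
fi
echo "$restore_status"

rm -rf "$src" "$restore" "$archive"
```

Against a real backup, the same idea means restoring to a scratch location, never over the live data, and comparing the result with the original.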
To this day, I can't fathom how my boss (who had overall responsibility for the backup procedures) managed not to get fired over this incident. And what happened there has always stayed with me.
When it comes to doing backups on Linux systems, a number of standard tools can help avoid the problems discussed above. Marcel Gagné's excellent book (see Resources) contains a simple yet useful script that not only performs the backup but verifies that things went well. Then, after each backup, the script sends an e-mail to root detailing what occurred.
I'll run through the guts of a modified version of Marcel's script here, to show you how easy this process actually is. This bash script starts by defining the location of a log and an error file. Two mv commands then rename the previous log and error files, so that the results of the next-to-last backup can still be examined (if required):
#! /bin/bash
backup_log=/usr/local/.Backups/backup.log
backup_err=/usr/local/.Backups/backup.err
mv $backup_log $backup_log.old
mv $backup_err $backup_err.old
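On the very first run there are no previous files to rename, so both mv commands print an error and the script simply carries on. A guarded variant might look like the following sketch; the rotate function and the temporary directory are illustrative assumptions, not part of Marcel's script.

```bash
#!/bin/bash
# Assumption: a temp directory stands in for /usr/local/.Backups.
backup_dir=$(mktemp -d)
backup_log=$backup_dir/backup.log
backup_err=$backup_dir/backup.err

# rotate is a hypothetical helper: keep the previous file as FILE.old
# (if there is one), then start a fresh, empty FILE.
rotate() {
    if [ -f "$1" ]; then
        mv "$1" "$1.old"
    fi
    : > "$1"
}

# First run: nothing to rename, so no error messages.
rotate "$backup_log"
rotate "$backup_err"
echo "entry from run 1" >> "$backup_log"

# Second run: the log from run 1 becomes backup.log.old.
rotate "$backup_log"
rotate "$backup_err"
previous=$(cat "$backup_log.old")
echo "Previous log contained: $previous"
```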
With the log and error files ready, a few echo commands append messages (note the use of >>) to each of the files. The messages include the current date and time (which is accessed using the back-ticked date command). The cd command then changes to the location of the directory to be backed up. In this example, that directory is /mnt/data, but it could be any location:
echo "Starting backup of /mnt/data: `date`." >> $backup_log
echo "Errors reported for backup/verify: `date`." >> $backup_err
cd /mnt/data
The backup then starts, using the tried and true tar command. The -cvf options request the creation of a new archive (c), verbose mode (v) and the name of the file/device to back up to (f). In this example, we back up to /dev/st0, the location of an attached SCSI tape drive:
tar -cvf /dev/st0 . 2>>$backup_err
Any errors produced by this command are sent to STDERR (standard error). The above command exploits this behaviour by appending anything sent to STDERR to the error file (using the 2>> directive).
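Capturing STDERR records what went wrong, but the script never checks whether tar itself reported failure. As a hedged sketch of how the two fit together (the temporary paths and the deliberately missing file no-such-file are assumptions made up for the demonstration), tar's exit status can be tested alongside the error file:

```bash
#!/bin/bash
# Assumption: temp paths stand in for /mnt/data, /dev/st0 and backup.err.
work=$(mktemp -d)
backup_err=$work/backup.err
mkdir "$work/data"
echo "hello" > "$work/data/file.txt"

# no-such-file does not exist, so tar complains on STDERR;
# 2>> appends that complaint to the error file.
tar -cvf "$work/archive.tar" -C "$work/data" file.txt no-such-file \
    > /dev/null 2>> "$backup_err"
tar_status=$?

# Keep a copy of the captured messages before cleaning up.
err_text=$(cat "$backup_err")

# tar exits with a non-zero status when something went wrong.
if [ "$tar_status" -ne 0 ]; then
    echo "tar failed with status $tar_status:"
    echo "$err_text"
fi

rm -rf "$work"
```

In the real script, a non-zero status could be written to the log so that the e-mailed report says plainly that the backup failed.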
When the backup completes, the script then rewinds the tape using the mt command, before listing the files on the tape with another tar command (the -t option lists the files in the named archive). This is a simple way of verifying the contents of the tape. As before, we append any errors reported during this tar command to the error file. Additionally, informational messages are added to the log file at appropriate times:
mt -f /dev/st0 rewind
echo "Verifying this backup: `date`" >>$backup_log
tar -tvf /dev/st0 2>>$backup_err
echo "Backup complete: `date`" >>$backup_log
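Listing the archive shows that the tape is readable, but not that its contents match the files on disk. GNU tar also offers a compare mode (-d, also spelled --diff or --compare) that checks archive members against the file system. The sketch below assumes GNU tar and, so that it can run without a tape drive, uses a temporary file in place of /dev/st0; the variable names are illustrative.

```bash
#!/bin/bash
# Assumptions: GNU tar; temp paths stand in for /mnt/data and /dev/st0.
data=$(mktemp -d)
archive=$(mktemp)
echo "version 1" > "$data/notes.txt"

tar -cf "$archive" -C "$data" .

# Compare the archive against the untouched directory.
if tar -df "$archive" -C "$data" > /dev/null 2>&1; then
    verify_clean="match"
else
    verify_clean="differ"
fi

# Change a file after the backup, then compare again.
echo "version 2, edited after the backup" > "$data/notes.txt"
if tar -df "$archive" -C "$data" > /dev/null 2>&1; then
    verify_changed="match"
else
    verify_changed="differ"
fi

echo "Right after the backup: $verify_clean"
echo "After editing a file:   $verify_changed"

rm -rf "$data" "$archive"
```

A compare run only makes sense while the data is still unchanged, so it belongs immediately after the backup; it complements, rather than replaces, an occasional full trial recovery.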
To conclude the script, we concatenate the error file to the log file (with cat), then e-mail the log file to root (where the -s option to the mail command allows the specification of an appropriate subject line):
cat $backup_err >> $backup_log
mail -s "Backup status report for /mnt/data" root < $backup_log
And there you have it: Marcel's deceptively simple solution to performing a verified backup and e-mailing the results to an interested party. If only we'd had something similar all those years ago.