Scary Backup Stories
Backups. We all know the importance of making a backup of our most important systems. Unfortunately, some of us also know that realizing the importance of performing backups often is a lesson learned the hard way. Everyone has their scary backup stories. Here are mine.
Like a lot of people, my professional career started out in technical support. In my case, I was part of a help-desk team for a large professional practice. Among other things, we were responsible for performing PC LAN backups for a number of systems used by other departments. For one especially important system, we acquired fancy new tape-backup equipment and a large collection of tapes. A procedure was put in place, and before-you-go-home-at-night backups became a standard. Some months later, a crash brought down the system, and all the data was lost. Shortly thereafter, a call came in for the latest backup tape. It was located and dispatched, and a recovery was attempted. The recovery failed, however, as the tape was blank. A call came in for the next-to-last backup tape. Nervously, it was located and dispatched, and a recovery was attempted. It also failed because this tape also was blank. Amid long silences and pink-slip glares, panic started to set in as the tape from three nights prior was called up. This attempt resulted in a lot of shouting.
All the tapes were then checked, and they were all blank. To add insult to injury, the problem wasn't only that the tapes were blank--they weren't even formatted! The fancy new backup equipment wasn't smart enough to realize the tapes were not formatted, so it allowed them to be used. Note: writing good data to an unformatted tape is never a good idea.
Now, don't get me wrong, the backup procedures themselves were good. The problem was that no one had ever tested the whole process--no one had ever attempted a recovery. Was it no small wonder then that each recovery failed?
For backups to work, you need to do two things: (1) define and implement a good procedure and (2) test that it works.
To this day, I can't fathom how my boss (who had overall responsibility for the backup procedures) managed not to get fired over this incident. And what happened there has always stayed with me.
When it comes to doing backups on Linux systems, a number of standard tools can help avoid the problems discussed above. Marcel Gagné's excellent book (see Resources) contains a simple yet useful script that not only performs the backup but verifies that things went well. Then, after each backup, the script sends an e-mail to root detailing what occurred.
I'll run through the guts of a modified version of Marcel's script here, to show you how easy this process actually is. This bash script starts by defining the location of a log and an error file. Two mv commands then copy the previous log and error files to allow for the examination of the next-to-last backup (if required):
#! /bin/bash backup_log=/usr/local/.Backups/backup.log backup_err=/usr/local/.Backups/backup.err mv $backup_log $backup_log.old mv $backup_err $backup_err.old
With the log and error files ready, a few echo commands append messages (note the use of >>) to each of the files. The messages include the current date and time (which is accessed using the back-ticked date command). The cd command then changes to the location of the directory to be backed up. In this example, that directory is /mnt/data, but it could be any location:
echo "Starting backup of /mnt/data: `date`." >> $backup_log echo "Errors reported for backup/verify: `date`." >> $backup_err cd /mnt/data
The backup then starts, using the tried and true tar command. The -cvf options request the creation of a new archive (c), verbose mode (v) and the name of the file/device to backup to (f). In this example, we backup to /dev/st0, the location of an attached SCSI tape drive:
tar -cvf /dev/st0 . 2>>$backup_err
Any errors produced by this command are sent to STDERR (standard error). The above command exploits this behaviour by appending anything sent to STDERR to the error file as well (using the 2>> directive).
When the backup completes, the script then rewinds the tape using the mt command, before listing the files on the tape with another tar command (the -t option lists the files in the named archive). This is a simple way of verifying the contents of the tape. As before, we append any errors reported during this tar command to the error file. Additionally, informational messages are added to the log file at appropriate times:
mt -f /dev/st0 rewind echo "Verifying this backup: `date`" >>$backup_log tar -tvf /dev/st0 2>>$backup_err echo "Backup complete: `date`" >>$backup_log
To conclude the script, we concatenate the error file to the log file (with cat), then e-mail the log file to root (where the -s option to the mail command allows the specification of an appropriate subject line):
cat $backup_err >> $backup_log mail -s "Backup status report for /mnt/data" root < $backup_log
And there you have it, Marcel's deceptively simple solution to performing a verified backup and e-mailing the results to an interested party. If only we'd had something similar all those years ago.
- Three More Lessons
- Django Models and Migrations
- August 2015 Issue of Linux Journal: Programming
- Hacking a Safe with Bash
- Secure Server Deployments in Hostile Territory, Part II
- The Controversy Behind Canonical's Intellectual Property Policy
- Huge Package Overhaul for Debian and Ubuntu
- Shashlik - a Tasty New Android Simulator
- Embed Linux in Monitoring and Control Systems
- General Relativity in Python