Scary Backup Stories

The dangers of not testing your backup procedures and some common pitfalls to avoid.

Backups. We all know the importance of making a backup of our most important systems. Unfortunately, some of us also know that the importance of performing backups is often a lesson learned the hard way. Everyone has their scary backup stories. Here are mine.

Scary Story #1

Like a lot of people, my professional career started out in technical support. In my case, I was part of a help-desk team for a large professional practice. Among other things, we were responsible for performing PC LAN backups for a number of systems used by other departments. For one especially important system, we acquired fancy new tape-backup equipment and a large collection of tapes. A procedure was put in place, and before-you-go-home-at-night backups became a standard. Some months later, a crash brought down the system, and all the data was lost. Shortly thereafter, a call came in for the latest backup tape. It was located and dispatched, and a recovery was attempted. The recovery failed, however, as the tape was blank. A call came in for the next-to-last backup tape. Nervously, it was located and dispatched, and a recovery was attempted. It also failed because this tape also was blank. Amid long silences and pink-slip glares, panic started to set in as the tape from three nights prior was called up. This attempt resulted in a lot of shouting.

All the tapes were then checked, and they were all blank. To add insult to injury, the problem wasn't only that the tapes were blank--they weren't even formatted! The fancy new backup equipment wasn't smart enough to realize the tapes were not formatted, so it allowed them to be used. Note: writing good data to an unformatted tape is never a good idea.

Now, don't get me wrong, the backup procedures themselves were good. The problem was that no one had ever tested the whole process--no one had ever attempted a recovery. Is it any wonder, then, that each recovery failed?

For backups to work, you need to do two things: (1) define and implement a good procedure and (2) test that it works.

To this day, I can't fathom how my boss (who had overall responsibility for the backup procedures) managed not to get fired over this incident. And what happened there has always stayed with me.

A Good Solution

When it comes to doing backups on Linux systems, a number of standard tools can help avoid the problems discussed above. Marcel Gagné's excellent book (see Resources) contains a simple yet useful script that not only performs the backup but verifies that things went well. Then, after each backup, the script sends an e-mail to root detailing what occurred.

I'll run through the guts of a modified version of Marcel's script here, to show you how easy this process actually is. This bash script starts by defining the location of a log and an error file. Two mv commands then rename the previous log and error files, keeping them available for the examination of the next-to-last backup (if required):

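    #!/bin/bash
    # Locations of the log and error files
    backup_log=/usr/local/.Backups/backup.log
    backup_err=/usr/local/.Backups/backup.err
    # Keep the previous run's files for later examination
    mv "$backup_log" "$backup_log.old"
    mv "$backup_err" "$backup_err.old"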
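
On the very first run there are no previous log files, so the two mv commands report errors. A minimal sketch of one way to guard against that, assuming a hypothetical helper named rotate_file that is not part of Marcel's script:

```shell
#!/bin/bash
# rotate_file is a hypothetical helper, not part of the original script.
# It renames FILE to FILE.old only when FILE already exists.
rotate_file() {
    local file=$1
    if [ -e "$file" ]; then
        mv -- "$file" "$file.old"
    fi
}

# Possible use with the paths from this article (commented out so the
# sketch has no side effects when sourced):
# rotate_file /usr/local/.Backups/backup.log
# rotate_file /usr/local/.Backups/backup.err
```

The guard keeps a fresh installation from starting its first report with two spurious mv errors.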

With the log and error files ready, a few echo commands append messages (note the use of >>) to each of the files. The messages include the current date and time (which is accessed using the back-ticked date command). The cd command then changes to the location of the directory to be backed up. In this example, that directory is /mnt/data, but it could be any location:

    echo "Starting backup of /mnt/data: `date`." >> $backup_log
    echo "Errors reported for backup/verify: `date`." >> $backup_err
    # Stop if the directory is missing, rather than archiving the wrong place
    cd /mnt/data || exit 1

The backup then starts, using the tried and true tar command. The -cvf options request the creation of a new archive (c), verbose mode (v) and the name of the file/device to back up to (f). In this example, we back up to /dev/st0, the location of an attached SCSI tape drive:

    tar -cvf /dev/st0 . 2>>$backup_err

Any errors produced by this command are sent to STDERR (standard error). The above command exploits this behaviour by appending anything sent to STDERR to the error file as well (using the 2>> directive).
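
Capturing STDERR records what went wrong, but the script does not check whether tar itself succeeded. A sketch of one way to do that, assuming a hypothetical run_backup function; it omits the verbose option for brevity and accepts a plain file in place of /dev/st0, so it can be tried without a tape drive:

```shell
#!/bin/bash
# run_backup is a hypothetical wrapper, not part of the original script.
# Arguments: source directory, archive file or device, error file.
run_backup() {
    local src=$1 dest=$2 err=$3 status=0
    # -C makes tar change into the source directory, like the cd in the
    # script. Anything tar writes to STDERR is appended to the error file.
    tar -cf "$dest" -C "$src" . 2>>"$err" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "tar exited with status $status" >>"$err"
    fi
    return "$status"
}

# Possible use with the names from this article:
# run_backup /mnt/data /dev/st0 "$backup_err"
```

A non-zero return value gives the rest of the script something to act on, for example skipping the verification step when the archive was never written.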

When the backup completes, the script then rewinds the tape using the mt command, before listing the files on the tape with another tar command (the -t option lists the files in the named archive). This is a simple way of verifying the contents of the tape. As before, we append any errors reported during this tar command to the error file. Additionally, informational messages are added to the log file at appropriate times:

    mt -f /dev/st0 rewind
    echo "Verifying this backup: `date`" >>$backup_log
    tar -tvf /dev/st0 2>>$backup_err
    echo "Backup complete: `date`" >>$backup_log
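
Listing an archive shows that it can be read, but not that a restore reproduces the data. A sketch of a fuller check, assuming a hypothetical verify_restore function: it extracts the archive into a scratch directory and compares the result with the source using diff -r. It needs enough free disk space to hold a full copy of the backup.

```shell
#!/bin/bash
# verify_restore is a hypothetical helper, not part of the original script.
# Arguments: source directory, archive file or device.
# Returns 0 only if the extracted tree matches the source directory.
verify_restore() {
    local src=$1 archive=$2 scratch status=0
    scratch=$(mktemp -d) || return 1
    # Extract into the scratch directory, then compare file contents
    # recursively. A failure of either step marks the check as failed.
    tar -xf "$archive" -C "$scratch" && diff -r "$src" "$scratch" >/dev/null || status=1
    rm -rf "$scratch"
    return "$status"
}

# Possible use after rewinding the tape:
# verify_restore /mnt/data /dev/st0 || echo "Restore test failed" >>"$backup_err"
```

Because the comparison is against the live directory, a blank or unreadable archive fails this check as long as the source directory is not itself empty. Files that change between the backup and the check also show up as differences, and diff -r compares contents only, not ownership or permissions.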

To conclude the script, we concatenate the error file to the log file (with cat), then e-mail the log file to root (where the -s option to the mail command allows the specification of an appropriate subject line):

    cat $backup_err >> $backup_log
    mail -s "Backup status report for /mnt/data" root < $backup_log
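
Because the subject line is the same every night, a failed backup is easy to overlook in a busy mailbox. A sketch of one way to flag failures, assuming a hypothetical report_subject function and relying on the fact that the error file always begins with the single header line written by echo:

```shell
#!/bin/bash
# report_subject is a hypothetical helper, not part of the original script.
# The error file always starts with one header line written by echo, so
# anything beyond that first line is treated as a reported error.
report_subject() {
    local err=$1 lines
    lines=$(wc -l < "$err" | tr -d ' ')
    if [ "$lines" -gt 1 ]; then
        echo "Backup FAILED for /mnt/data"
    else
        echo "Backup OK for /mnt/data"
    fi
}

# Possible use in place of the fixed subject line:
# mail -s "$(report_subject "$backup_err")" root < "$backup_log"
```

The line count is a crude test: a warning from tar counts as a failure just as a real error does, which errs on the side of getting a human to look.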

And there you have it, Marcel's deceptively simple solution to performing a verified backup and e-mailing the results to an interested party. If only we'd had something similar all those years ago.

______________________

Comments

Re: Scary Backup Stories

"I first ccrypt all my files, tar them, gzip them, and sftp them to multiple servers."

This seems kinda stupid to me. Once the files are encrypted, you won't be able to compress them much. You should make encryption the last step, or at least gzip your files before you encrypt them.

-Matt

Re: Scary Backup Stories

..when's the last time you did a test restore to make sure those tapes have something useful on them?..

Re: Scary Backup Stories

Forget the tape altogether. Get a remote (offsite) backup service to grab your backup file nightly, weekly, whatever. Then it's offsite. Insist on email reports confirming that each backup was done, and they'll back up your machine using cron so they don't forget!

Re: Scary Backup Stories

the point of tape is not to perform data restores

it gives you something to hand to someone else. it takes them ages to discover it's useless; by then you can copy the data from a mirror. in the first scary story, tapes bought him 3 days to look for a new job whilst getting paid

all cough 'more experienced' IT managers and pow'd'er users will expect tapes to be involved

keep people happy, don't try to confuse them with sensible new ideas