Painless Thumbdrive Backups
February 1st, 2007 by Andrew Fabbro
Raise your hand if you've ever lost (or worried you'd lost) a USB thumbdrive. You spent hours fruitlessly searching the house, and then as you opened the washing machine door, it suddenly dawned on you that perhaps you didn't check your pockets thoroughly when you put this load in.
Fortunately, you have a backup of all the data, right? You religiously mount the drive and copy the data to a backup directory on a regular schedule, no?
That sounds an awful lot like drudgery to me too, and I got into computers to avoid boring work. Naturally, it's a lot more fun to spend some time working out the perfect method for painless thumbdrive backups.
What do I mean by painless? How about a system where you can walk up to your Linux box, plug in the drive, wait for a “backup complete” sound, unplug and walk away? Perhaps a system that keeps its backups orderly (say, the last seven copies)? Oh, and it should handle encrypted thumbdrives as well. And, if you need to recover, it should do both whole-volume replacement and per-file restores.
Not a problem. The key to this system is using udev rules and a simple shell script. The tools already are on your system. In this example, I use a CentOS 4.3 system, though any Linux distribution with a 2.6 kernel should work.
udev is the modern device manager for Linux, replacing the 2.4 kernel's devfs. udev handles all device mapping, including hot plugging of devices. One of its coolest features is that it lets you write your own event rules. This article shows you how to craft a rule that fires automatically when you plug your USB thumbdrive into the system.
These rules are stored in /etc/udev/rules.d (if you're using a different Linux distribution, check /etc/udev/udev.conf for the udev_rules= line, which should point to the rules directory). You can place whatever udev rules you want as text files in this directory, and udev picks them up immediately for use without requiring a reboot.
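As a rough sketch of that check, the helper below reads the udev_rules= line from a udev.conf file and prints the directory it names. The function name udev_rules_dir is made up for this example, and falling back to /etc/udev/rules.d when no such line exists is an assumption, not documented udev behavior.

```shell
#!/bin/sh
# Hypothetical helper: print the rules directory named by the udev_rules=
# line of a udev.conf file (default path: /etc/udev/udev.conf).
# Assumption: if the file or the line is missing, fall back to
# /etc/udev/rules.d, the location used in this article.
udev_rules_dir() {
    conf=${1:-/etc/udev/udev.conf}
    dir=""
    if [ -r "$conf" ]; then
        # Keep the last udev_rules= assignment; strip the key and any quotes.
        dir=$(sed -n 's/^[[:space:]]*udev_rules=//p' "$conf" | tail -n 1 | tr -d '"')
    fi
    echo "${dir:-/etc/udev/rules.d}"
}
```

Calling udev_rules_dir with no argument inspects the system file; passing a path lets you try it against a copy first.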
To write a udev event rule, you first need a unique way to identify the USB device. Most thumbdrives have serial numbers, though not all. Fortunately, even with thumbdrives that do not have a serial number, you can craft udev rules for them.
I use two thumbdrives as examples: a JetFlash JF110, encrypted with TrueCrypt, and a Corsair Flash Voyager. The JetFlash has a serial number; the Corsair does not.
Plug your thumbdrive in, and cat /proc/scsi/usb-storage/*. You should find an entry for it similar to this:
Host scsi5: usb-storage
Vendor: Unknown
Product: USB Mass Storage Device
Serial Number: 85a5b1f2c96492
Protocol: Transparent SCSI
Transport: Bulk
Quirks:
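If several USB storage devices are attached, picking the serial numbers out by eye gets tedious. The filter below is a minimal sketch that prints only usable serial numbers from text in the format shown above; the function name usb_serials is invented for this example, and it assumes a drive without a serial reports the literal word None.

```shell
#!/bin/sh
# Hypothetical filter: read usb-storage status text on stdin and print each
# serial number, skipping devices that report "None".
usb_serials() {
    awk '/Serial Number:/ {
        sub(/^.*Serial Number:[ \t]*/, "")   # drop the label
        sub(/[ \t]+$/, "")                   # drop trailing blanks
        if ($0 != "" && $0 != "None") print
    }'
}
```

A typical call would be cat /proc/scsi/usb-storage/* | usb_serials. No output means you need the vendor/model approach.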
If you have a serial number, skip forward to the “Writing the Rule” section of this article. If you see “None” for the Serial Number, you still can identify the device by using udevinfo. Follow these steps:
1) Look at dmesg's output. Typical output is as follows:
usb-storage: waiting for device to settle before scanning
Vendor: Corsair Model: Flash Voyager Rev: 1.00
Type: Direct-Access ANSI SCSI
SCSI device sde: 2031616 512-byte hdwr sectors (1040 MB)
[...]
sde: assuming drive cache: write through
sde: sde1
Attached scsi removable disk sde at scsi12, channel 0, id 0, lun 0
Attached scsi generic sg4 at scsi12, channel 0, id 0, lun 0, type 0
This tells you that /dev/sde is the device assigned.
2) Now, run:
udevinfo -a -p $(udevinfo -q path -n /dev/sde)
and examine the output. Look for these lines:
BUS=="scsi"
SYSFS{model}=="Flash Voyager "
SYSFS{vendor}=="Corsair "
Writing the Rule
Now, with either the serial number or the vendor/model combo, you can write the rule. The rule creates a symlink for the device in the /dev tree, for example, /dev/corsair_drive, and then calls the script /usr/local/bin/backup-thumb.sh, which I'll get to in a moment.
Become root (su -), and create a text file in /etc/udev/rules.d called 95.backup.rules. You can use a number other than 95, but keep in mind that udev processes rules in alphanumeric order, and it's better to have local rules processed last.
If you have a serial number, type a rule like this (all on one line) into the file, and save it:
BUS=="usb", SYSFS{serial}=="85a5b1f2c96492", SYMLINK="jet_drive", RUN+="/usr/local/bin/backup-thumb.sh jet_drive"
If you're using vendor/model identification, your rule would look like this:
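BUS=="scsi", SYSFS{vendor}=="Corsair ", SYSFS{model}=="Flash Voyager ", SYMLINK="corsair_drive", RUN+="/usr/local/bin/backup-thumb.sh corsair_drive"
Keys that test a value (BUS, SYSFS{}) use ==, whereas SYMLINK and RUN assign with = and +=. Copy the SYSFS{} values exactly as udevinfo prints them, including any trailing spaces. Note that you can string as many SYSFS{} entries together as you need to identify the drive uniquely. Your rule now fires every time you plug in your thumbdrive.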
Note: if you have other rules for a device, udev executes the rules in sequence from top to bottom.
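Because the whole rule must sit on one line and the quoting is easy to get wrong, it can help to generate it. The two functions below are a sketch, with invented names, that print rules in the form used in this article: match keys use == (a reader points this out in the comments), and the symlink name doubles as the script argument.

```shell
#!/bin/sh
# Hypothetical helpers that print a one-line udev rule in this article's style.

# Match by serial number: serial_rule NAME SERIAL
serial_rule() {
    name=$1 serial=$2
    printf 'BUS=="usb", SYSFS{serial}=="%s", SYMLINK="%s", RUN+="/usr/local/bin/backup-thumb.sh %s"\n' \
        "$serial" "$name" "$name"
}

# Match by vendor and model: vendor_model_rule NAME VENDOR MODEL
# Pass VENDOR and MODEL exactly as udevinfo shows them, trailing spaces included.
vendor_model_rule() {
    name=$1 vendor=$2 model=$3
    printf 'BUS=="scsi", SYSFS{vendor}=="%s", SYSFS{model}=="%s", SYMLINK="%s", RUN+="/usr/local/bin/backup-thumb.sh %s"\n' \
        "$vendor" "$model" "$name" "$name"
}
```

As root, something like serial_rule jet_drive 85a5b1f2c96492 > /etc/udev/rules.d/95.backup.rules would write the first example rule.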
backup-thumb.sh is the engine that backs up your thumbdrive. Our rule calls it, giving the name of the device (the SYMLINK) as its only argument. Everything else is configured in the CONFIG section. The backup script is shown in Listing 1.
Put this script in /usr/local/bin/backup-thumb.sh, and remember to chmod +x it. Next, edit the CONFIG section—the parameters are as follows:
BACKUP_DIR: where you want the backups to go.
GENERATIONS: how many days of backups to keep. Backups will be numbered 0 (most recent) to the limit you enter (oldest). Keep in mind that you need to have enough storage space for this many backups. If you are backing up a 1GB fob and set GENERATIONS to 7, backups will consume 7GB of space.
BACKUP_ONCE_DAY: if you plug and unplug your fob multiple times a day, you probably won't want to back it up each time. backup-thumb.sh uses a tag file so that it backs up only once per day. If you want to change this so it runs a backup every time you plug in a thumbdrive, set BACKUP_ONCE_DAY to 0.
SOUND: in this example, I've chosen a sound from the KDE distribution, but any WAV file will work. You easily can modify the script to use madplay instead of aplay and use an MP3 file as your completion sound.
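To make the GENERATIONS numbering concrete, here is one way the rotation could work. This is a sketch, not the code from Listing 1: the function name rotate_backups is invented, the NAME.backup.N file naming follows the restore examples later in the article, and it assumes GENERATIONS=7 means seven files numbered 0 through 6.

```shell
#!/bin/sh
# Hypothetical rotation: rotate_backups DIR NAME GENERATIONS
# Shifts NAME.backup.0 -> .1, .1 -> .2, and so on, leaving slot 0 free
# for the next image. The oldest generation is deleted.
rotate_backups() {
    dir=$1 name=$2 generations=$3
    i=$((generations - 1))
    rm -f "$dir/$name.backup.$i"        # oldest image falls off the end
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        if [ -e "$dir/$name.backup.$prev" ]; then
            mv "$dir/$name.backup.$prev" "$dir/$name.backup.$i"
        fi
        i=$prev
    done
}
```

Working from the highest number down means no image is overwritten before it has been moved.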
backup-thumb.sh sleeps for ten seconds on startup, because it must wait for the kernel to finish scanning the thumbdrive. If you plug in a thumbdrive and type dmesg, you'll see a “waiting for device to settle” message while this happens. Ten seconds for the kernel scan should be sufficient even for older machines.
Next, backup-thumb.sh sets permissions tightly so that only root can read the backups. Otherwise, some nefarious person could copy your backup to a different machine and mount it there.
The script executes a simple dd (bit-for-bit copy) of your thumbdrive to a backup file. This works whether the device is encrypted or not. When it's finished, it plays a noise you will hear on your computer's speakers. On a USB 2.0 port, backing up a 1GB thumbdrive takes about one minute.
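The steps just described can be sketched in a few functions. This is not Listing 1 itself: the function names and the NAME.last-backup tag file are assumptions made for illustration, and the ten-second sleep and the completion sound are left to the caller.

```shell
#!/bin/sh
# Return success if the tag file already records today's date.
backed_up_today() {
    tag=$1
    [ -f "$tag" ] && [ "$(cat "$tag")" = "$(date +%Y-%m-%d)" ]
}

# Copy a device (or any file) bit for bit into an image file.
# umask 077 makes a newly created image readable by its owner only.
image_device() {
    src=$1 dest=$2
    (
        umask 077
        dd if="$src" of="$dest" bs=1024k 2>/dev/null
    )
}

# backup_once NAME BACKUP_DIR [DEVICE]
# Images /dev/NAME (or DEVICE) to BACKUP_DIR/NAME.backup.0, at most once a day.
backup_once() {
    name=$1 backup_dir=$2 dev=${3:-/dev/$1}
    tag=$backup_dir/$name.last-backup   # hypothetical tag file name
    backed_up_today "$tag" && return 0
    image_device "$dev" "$backup_dir/$name.backup.0" || return 1
    date +%Y-%m-%d > "$tag"
}
```

A caller would do something like sleep 10, then backup_once corsair_drive "$BACKUP_DIR", then aplay "$SOUND" on success. Because the copy is taken at the block level, it does not matter whether the contents are encrypted.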
If you lose your thumbdrive and want to restore your backup to its replacement, simply dd the backup image to the new thumbdrive, like so:
dd if=corsair_drive.backup.0 of=/dev/corsair_drive
Or, if you want to grab only some files from the backup, do the following:
mkdir /mnt/thumb
mount -o loop corsair_drive.backup.0 /mnt/thumb
You now can copy the files from /mnt/thumb.
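If the image was taken from the whole drive and the data lives in a partition, as a reader notes in the comments, the loop mount also needs an offset= option giving the partition's starting byte. That value is the partition's start sector multiplied by the sector size. The helper below is a small sketch of the arithmetic; the name offset_bytes is invented, and the 512-byte default matches the sector size in the dmesg output above.

```shell
#!/bin/sh
# Hypothetical helper: offset_bytes START_SECTOR [SECTOR_SIZE]
# Prints the byte offset of a partition inside a whole-drive image.
# The start sector can be read from the Start column of
#   fdisk -l -u corsair_drive.backup.0
offset_bytes() {
    start=$1 size=${2:-512}
    echo $((start * size))
}
```

For a partition starting at sector 63, the mount would then be mount -o ro,loop,offset=$(offset_bytes 63) corsair_drive.backup.0 /mnt/thumb.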
If you're using TrueCrypt to encrypt your thumbdrive, you can mount the backup image in much the same way:
truecrypt corsair_drive.backup.0 /mnt/thumb/
That's about as painless as we can make thumbdrive backups. If you're too lazy to plug your drive in and come back when it beeps...well, stay away from laundromats!
Please see my question above
On September 24th, 2008 Benjamin Cathey (not verified) says:
Please see my question above - no one ever responded with any help. I thought it was because no one read this - but there have been posts since.
Automatically backup any USB storage device
On September 3rd, 2008 Thomas Damgaard (not verified) says:
Hi
I have a server that only serves as backup server.
I've been trying to make a udev rule that would automatically backup any USB storage device connected.
This way, I can just plug in my USB devices to my server, and it is automatically backed up.
However, what I have made so far does not work. I hope you can help me.
Here is my udev rule:
BUS=="scsi" KERNEL=="sd?1" SUBSYSTEM=="block" ATTRS{removable}=="1" ACTION=="add" RUN+="/usr/local/bin/copy-drive.sh %k"
I hope you can help.
Cannot get this script working
On July 3rd, 2008 Benjamin Cathey (not verified) says:
I read your magazine regularly and was glad when I found this article. However I cannot get it working at all --
I am running Ubuntu 8.04
For starters, when I run the udev check I do not get any values that specify SYSFS, they are all ATTRS (although there is a line for serial and model) - also there is no BUS line at all.
I tried writing the rule using ATTRS and nothing, I also tried writing the rule using SYSFS (even though that parameter did not appear) and nothing -
The symlink is not even created.
HELP please
So I never heard back on this???
On September 24th, 2008 Benjamin Cathey (not verified) says:
Well, I asked for help and I never heard back on this - the output of udevinfo -a -p $(udevinfo -q path -n /dev/sdf) does not result in anything similar to what you are suggesting. There is no sysctl line or bus line - I see similar values in here but they are labelled attrs and the udev script just won't work
This is what I ended up making
root@lighthouse:/etc/udev/rules.d# cat 96-backuphome.rules
BUS=="usb", SYSFS{serial}=="0010101640150EE9W", SYMLINK=="tosh_ext", RUN+="/home/benito/scripts/homebackuponplugin.sh tosh_ext"
root@lighthouse:/etc/udev/rules.d#
Although usb is listed as
SUBSYSTEMS=="usb"
DRIVERS=="usb"
NOT the BUS (although I know that it is) ... and serial looks like ATTRS{serial}== not SYSFS{serial}== as suggested in this article. I figured the reason I hadn't heard back is that no one read this. I read your magazine monthly - maybe I shouldn't bother if I can't get a reply?
Thanks,
Benjamin
How to recover
On January 7th, 2007 Derk Tattersall (not verified) says:
At the end of your article, you state that you can recover files from the image using mount like so:
mkdir /mnt/thumb
mount -o loop corsair_drive.backup.0 /mnt/thumb
My own thumb drive (and most such drives, I think) has the data partition on a partition within the drive. You have to use a different mount command:
mount -o loop,offset=xxxxx corsair_drive.backup.0 /mnt/thumb
Determining the value of the offset is a pain. I found a script at http://www.number.ch/wiki/index.php/PartitionRecovery that makes it much easier:
#!/bin/sh
# Try successive byte offsets until the image mounts read-only.
# Arguments: START LIMIT IMAGE MOUNTPOINT
offset=$1; shift
limit=$1; shift
while [ "$offset" -le "$limit" ]
do
    if mount -o ro,loop,offset="$offset" "$@" 2> /dev/null
    then
        echo " Successfully mounted starting from offset $offset."
        exit 0
    fi
    offset=$((offset+1))
    [ $((offset % 1000)) -eq 0 ] && printf . # Progress indicator
done
echo "No filesystem found up to $limit."
exit 1
I found the article very useful. Thanks.
Derek Tattersall
usb key partitions
On February 16th, 2007 Jeff Pipkins (not verified) says:
I found it instructive to write the run rule like this:
RUN+="/usr/local/bin/usbkey.sh myserialnum %k"
Then in the script I added echo $0 $@ >>/tmp/log.txt
I found that the script was called several times, with different device names. Then I removed the echo and added an if [ "$2" = "sdb2" ]
so I could mount only the partition I wanted.
I added a mount line in /etc/fstab and used the uid= and gid= to set myself as the owner. I have the script mount the drive, and luckily enough, when I remove the key, the mount goes away.
BTW, I don't use the key for backup, but I've found that the "unison" utility is very useful for syncing the data on the key with the data on either of two systems.
What I'd really like to do is to pop up a window, like gnome-terminal or xterm or something, and then execute an optionally interactive script. Anybody know how to do that? I tried sudo -u jpipkins gnome-terminal, but that didn't work.
Correction/Diff that worked for me
On January 7th, 2007 will says:
Great article. This is something I've had to manually do and now I'm free of that task. Yahooo!
I struggled a little at first because it just didn't work straight away. I'm running Ubuntu Edgy. Then I followed the link to Daniel Drake's "Writing udev Rules" and noticed that his examples all used "==" instead of "=". Each declaration in the udev rule seems to need 2 math symbols. They should be "==" or "+=". Here is my rule and it worked great. Oh the joy!
BUS=="usb", SYSFS{serial}=="00176F962D19E", SYMLINK=="cruzer", RUN+="/usr/local/bin/backup-thumb.sh cruzer "
Now I just need a new laptop with USB2.0 as a gig thumbdrive takes 30 mins to backup.
rsync
On March 1st, 2007 Luis Sismeiro (not verified) says:
Why not use rsync to backup only the modified files? It isn't difficult if the flash isn't encrypted.
Regards,
Luis Sismeiro
generations
On March 21st, 2007 Bill Arlofski (not verified) says:
Rsync is a great solution for keeping files and/or directories in sync, and is much faster than copying the whole thing each time.
But, rsync is not so great if you sync, then realize that you need a specific version of a file from 2 days ago.
--
Bill Arlofski
Reverse Polarity
rsync snapshots
On March 21st, 2007 Chris (not verified) says:
You can get the best of both worlds though (speed of rsync + multiple versions), with the added bonus of consuming less space than multiple full copies.
http://www.mikerubel.org/computers/rsync_snapshots/
The solution described there gives the illusion of multiple full copies, while only requiring the space of one copy plus the sum of the deltas.