Tarsnap: On-line Backups for the Truly Paranoid

Your Current Account Balance Is $4.992238237884881224

Tarsnap works on a prepaid utility-metered model. Subscribers deposit a minimum of $5.00 and are charged only for the storage and bandwidth they consume. Although the cost is higher than plain Amazon S3 service, it reflects the encryption, compression and deduplication that Tarsnap adds. At the time of this writing, Tarsnap costs 30 cents per gigabyte-month for storage and 30 cents per gigabyte transmitted.

This cost may make Tarsnap impractical for large, whole-server, terabyte-size backups. However, it is ideal for critical, sensitive files that must remain durable, available and safe even if an attacker succeeds in compromising them. With no minimum charge or monthly fees, Tarsnap is very economical for small data sets or for data that compresses well. Some examples:

  • Backing up 100MB of files with a 10% daily change rate for a month would cost only 30 cents.

  • A gigabyte that is backed up weekly with a 20% change rate would cost $1.40 a month.
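For a rough feel of how the two rates combine, the sketch below multiplies stored and transferred gigabytes by the 30-cent rates quoted above. It is a simplified model, not Tarsnap's billing logic: the function name tarsnap_cost is made up for illustration, and the inputs are assumed to be sizes after compression and deduplication, so it will not necessarily reproduce the example figures in the list.

```shell
# Rough monthly cost model using the rates quoted in the article.
# Assumptions (not from Tarsnap itself): the function name, and that usage
# is given in gigabytes after compression and deduplication.
STORAGE_RATE=0.30     # dollars per gigabyte-month stored
BANDWIDTH_RATE=0.30   # dollars per gigabyte transmitted

tarsnap_cost() {
    # $1 = average gigabytes stored over the month
    # $2 = gigabytes transmitted during the month
    awk -v s="$1" -v t="$2" -v sr="$STORAGE_RATE" -v br="$BANDWIDTH_RATE" \
        'BEGIN { printf "%.4f\n", s * sr + t * br }'
}

# Example: 1GB kept for the month, plus 0.5GB of uploads.
tarsnap_cost 1 0.5
```

The result is in dollars. Changing the two rate variables is all that is needed if the prices differ from those quoted at the time of writing.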

Tarsnap bills in attodollars (quintillionths of a dollar) to avoid profiting through rounding, so your account balance is tracked to 18 decimal places. This is not just "pay by the drink" cloud pricing; it's practically "pay by the atom". Some users find that a small deposit lasts them months or years.
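To see why whole attodollars are convenient, here is a minimal sketch that treats the balance from the top of this article as an integer and subtracts it from a deposit. The $5.00 starting deposit and the helper name to_atto are assumptions for illustration; this is not how Tarsnap itself stores balances.

```shell
# Balance shown at the top of the article, in dollars with 18 decimal places.
balance="4.992238237884881224"
# Assumption: the account was opened with the $5.00 minimum deposit.
deposit="5.000000000000000000"

# Dropping the decimal point turns an 18-decimal dollar amount into a whole
# number of attodollars. This fits in a signed 64-bit integer only for
# amounts under about $9.22, and amounts under $1 would need their leading
# zero stripped first so the shell does not read them as octal.
to_atto() { printf '%s' "$1" | tr -d '.'; }

spent_atto=$(( $(to_atto "$deposit") - $(to_atto "$balance") ))
echo "spent so far: $spent_atto attodollars"
```

Because every step is integer arithmetic, nothing is rounded, which floating-point dollars could not guarantee at 18 decimal places.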

Important Flexibility

One of Tarsnap's best features is how easy it is to script. The ability to put a tarsnap cf command into a shell script makes use in cron jobs very straightforward, which encourages unattended, automated backups—the best kind.
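A cron-friendly wrapper might look something like the sketch below. The function names, the DRY_RUN switch, the script path and the crontab schedule are all illustrative assumptions; the only Tarsnap-specific part is the create command, written here with dashed options (-c -f) rather than the bundled cf form.

```shell
# Sketch of an unattended backup wrapper for cron.
# Assumptions: the function names, the DRY_RUN switch and the crontab line
# are illustrative; only "tarsnap -c -f <archive> <path>" is Tarsnap itself.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command; set to 0 to run it

archive_name() {
    # docs -> docs.20120701 (prefix plus today's date)
    printf '%s.%s\n' "$1" "$(date +%Y%m%d)"
}

run_backup() {
    # $1 = archive name prefix, $2 = directory to back up
    name=$(archive_name "$1")
    if [ "$DRY_RUN" = "1" ]; then
        echo "tarsnap -c -f $name $2"
    else
        tarsnap -c -f "$name" "$2"
    fi
}

run_backup docs /docs

# Hypothetical crontab entry, 02:30 every night:
# 30 2 * * * DRY_RUN=0 /usr/local/bin/tarsnap-docs-backup.sh
```

Defaulting to a dry run is a deliberate choice here: the script prints the command it would run until DRY_RUN=0 is set, which makes it safe to try before handing it to cron.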

Crucially, Tarsnap also supports a division of responsibilities. You can use the tarsnap-keymgmt tool to create keyfiles with limited authority. You may have one keyfile that lives on your server with permission to create archives, but not the authority to delete them. A master key with full privileges could be kept off-site, so that if attackers were to compromise your server, they would be unable to destroy your backups.
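As a sketch of that split, the function below derives a write-only key from an existing full-privilege key using tarsnap-keymgmt. The function name, the TARSNAP_KEYMGMT override variable and the /root/tarsnap-write.key path are assumptions for illustration.

```shell
# Sketch: derive a write-only key for the server from the full-privilege key.
# Assumptions: the function name, the TARSNAP_KEYMGMT override and the
# write-only key path are illustrative, not from the article.
TARSNAP_KEYMGMT=${TARSNAP_KEYMGMT:-tarsnap-keymgmt}

make_write_only_key() {
    # $1 = existing full-privilege key, $2 = new limited key to create
    # -w grants permission to write (create) archives; leaving out -r and -d
    # withholds the ability to read or delete them.
    "$TARSNAP_KEYMGMT" --outkeyfile "$2" -w "$1"
}

# Typical call, run once on a trusted machine:
# make_write_only_key /root/tarsnap.key /root/tarsnap-write.key
```

A key made this way cannot read archives either, so restores would need the master key. The server's tarsnap.conf would then point at the limited key while the full-privilege key is stored elsewhere.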

Using Tarsnap

To get started with Tarsnap, register at tarsnap.com, deposit some funds into your account, and download the client.

The client is available only as source, but the usual ./configure ; make install process is straightforward. The client is supported on all major Linux distributions (as well as BSD-based systems). Take a quick peek at the download page to make sure you have the required operating system packages, as some of the development packages are not installed in typical Linux configurations.

If you are using a firewall, be aware that Tarsnap communicates via TCP on port 9279.

There are only two critical configuration items: the location of your keyfile and the location of your Tarsnap cache. Both are set in /usr/local/etc/tarsnap.conf. A tarsnap.conf.example is provided, and you probably can just copy the example as is. It defines your Tarsnap key as /root/tarsnap.key and your cache directory as /usr/local/tarsnap-cache, which will be created if it doesn't exist. The cachedir is a small state-tracking directory that lets Tarsnap keep track of backups.
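The settings described above amount to only a few lines. The sketch below writes them to a scratch file so they can be inspected before being copied into place; the scratch path and the TARSNAP_CONF_SKETCH variable are assumptions, and print-stats is the optional directive enabled later in this article.

```shell
# Sketch: write the settings the article describes into a config file.
# Assumption: the output path is a scratch file for illustration; the real
# file lives at /usr/local/etc/tarsnap.conf.
conf=${TARSNAP_CONF_SKETCH:-./tarsnap.conf.sketch}

cat > "$conf" <<'EOF'
# Key generated by tarsnap-keygen
keyfile /root/tarsnap.key
# State-tracking directory; created if it doesn't exist
cachedir /usr/local/tarsnap-cache
# Print archive statistics after each backup
print-stats
EOF

cat "$conf"
```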

Next, register your machine as follows. In this case, I'm setting up Tarsnap service for a machine called helicarrier. The e-mail address and password are the ones I used when I signed up for service with Tarsnap:

# tarsnap-keygen --keyfile /root/tarsnap.key \
    --user andrew@fabbro.org --machine helicarrier
Enter tarsnap account password:

I have a directory I'd like to back up with Tarsnap:

# ls -l /docs
total 2092
-rw-rw---- 1 andrew 1833222 Jun 14 16:38 2011 Tax Return.pdf
-rw------- 1 andrew 48568 Jun 14 16:41 andrew_passwords.psafe3
-rw------- 1 tina   14271 Jun 14 16:42 tina_passwords.psafe3
-rw-rw-r-- 1 andrew 48128 Jun 14 16:41 vacation_hotels.doc
-rw-rw-r-- 1 andrew 46014 Jun 14 16:35 vacation_notes.doc
-rw-rw-r-- 1 andrew 134959 Jun 14 16:44 vacation_reservation.pdf

To back up, I just tell Tarsnap what name I want to call my archive ("docs.20120701" in this case) and which directory to back up. There's no requirement to use a date string in the archive name, but it makes versioning straightforward, as you'll see:

# tarsnap cf docs.20120701 /docs
tarsnap: Removing leading '/' from member names
                                 Total size  Compressed size
All archives                        2132325          1815898
  (unique data)                     2132325          1815898
This archive                        2132325          1815898
New data                            2132325          1815898

In my tarsnap.conf, I enabled the print-stats directive, which gives the account report shown. Note the compression, which reduces storage costs and improves cryptographic security. The "compressed size" of the "unique data" shows how much data is actually stored at Tarsnap, and you pay only for the compressed size.
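To put those numbers in perspective, this sketch computes the compression ratio and a rough monthly storage cost from the report above. The function name is illustrative, and two things are assumed: that a gigabyte means 10^9 bytes, and that the 30 cents per gigabyte-month rate quoted earlier applies.

```shell
# Sketch: summarize compression and storage cost from print-stats figures.
# Assumptions: 1GB = 10^9 bytes, and the 30 cents per gigabyte-month storage
# rate quoted earlier in the article. The function name is illustrative.
storage_report() {
    # $1 = total size in bytes, $2 = compressed size in bytes
    awk -v t="$1" -v c="$2" 'BEGIN {
        printf "compressed to %.1f%% of original size\n", 100 * c / t
        printf "storage cost: $%.6f per month\n", c * 0.30 / 1e9
    }'
}

# Figures from the "(unique data)" line of the first backup.
storage_report 2132325 1815898
```

For this small document set, the monthly storage charge works out to a small fraction of a cent.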

The next day, I back up docs again to "docs.20120702". If I haven't made many changes, the backup will proceed very quickly and use little additional space:

# tarsnap cf docs.20120702 /docs
tarsnap: Removing leading '/' from member names
                                 Total size  Compressed size
All archives                        4264650          3631796
  (unique data)                     2132770          1816935
This archive                        2132325          1815898
New data                                445             1037

As you can see, although the amount of data for "all archives" has grown, the actual amount of "unique data" has barely increased. Tarsnap is smart enough to avoid backing up data that has not changed.
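The effect of deduplication can be quantified straight from the report. The sketch below parses the second day's statistics and prints what percentage of the archive's compressed size was newly uploaded; the function name new_data_percent is an assumption for illustration.

```shell
# Sketch: work out how much of the second backup was actually new, by
# parsing the print-stats report. The function name is illustrative.
new_data_percent() {
    # Reads a report on stdin; the last field of each row is the
    # compressed size in bytes.
    awk '/^This archive/ { archive = $NF }
         /^New data/     { fresh = $NF }
         END { printf "%.2f\n", 100 * fresh / archive }'
}

new_data_percent <<'EOF'
                                 Total size  Compressed size
All archives                        4264650          3631796
  (unique data)                     2132770          1816935
This archive                        2132325          1815898
New data                                445             1037
EOF
```

For the report shown, well under a tenth of a percent of the archive had to be uploaded and stored as new data.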


Andrew Fabbro is a senior technologist living in the Portland, Oregon, area. He's used Linux since Slackware came on floppies and presently works for Con-way, a Fortune 500 transportation company.


great article

RSA Course Online's picture

I use duplicity too.


RonTrex's picture

I'm curious how Tarsnap measures up against encryption brute force on GPU clusters made to crack passwords. It wasn't that long ago a machine was used to crack any password within 5 hours through a large cluster of GPUs. I'm still a bit skeptical of cloud backups, but perhaps I should be more afraid of using Gmail for that matter. At least Tarsnap doesn't trap my info for eternity with no knowledge of who has access to it. - Ron @ bpl

yang's picture

http? It seems like they do not care about security anyway.

duplicity offers same features -- on own system

volker's picture

I have used duplicity (http://duplicity.nongnu.org/) for a year; it offers the same features, apart from the commercial backup space.

Duplicity uses GnuPG for encryption (key or passphrase based).

From the page, duplicity supports "local file storage, scp/ssh, ftp, rsync, HSI, WebDAV, Tahoe-LAFS, and Amazon S3".

Yes, you need your own backup space. But reliable 100GB storage is available for $5 to $7/month (along with plenty of other options, some smaller but free...).

Anonymous's picture

Thanks for the post. Nice solution for an on-line backup, liked the price scheme too. And of course: that it's a tar based solution.
