Compression Tools Compared
Filters are tools that can be chained together at the command line so that the output of one is piped elegantly into the input of the next. A common example is:
$ ls | more
Filtering is crucial for speeding up network transfers. Without it, you have to wait for all the data to be compressed before transferring any of it, and you need to wait for the whole transfer to complete before starting to decompress. Filters speed up network transfers by allowing data to be simultaneously compressed, transferred and decompressed. This happens with negligible latency if you're sending enough data. Filters also eliminate the need for an intermediate archive of your files.
Check whether the data compression tool that you want is installed on both computers. If it's not, you can see where to get it in the on-line Resources for this article. Remember to replace a/dir in the following examples with the real path of the data to back up.
Unless your data already is in one big file, be smart and consolidate it with a tool such as tar. Aggregated data has more redundancy to winnow out, so it's ultimately more compressible.
But be aware that the redundancy that saps your performance also may make it easier to recover from corruption. If you're worried about corruption, you might consider testing for it with the cksum command or adding a limited amount of redundancy back into your compressed data with a tool such as parchive or ras.
lzop often is the fastest tool. It finishes about three times faster than gzip but still compresses data almost as much. It finishes about a hundred times faster than lzma and 7za. Furthermore, lzop occasionally decompresses data even faster than simply copying it! Use lzop on the command line as a filter with the backup tool named tar:
$ tar c a/dir | lzop - > backup.tar.lzo
tar's c option tells it to create one big archive from the files in a/dir. The | is a shell command that automatically pipes tar's output into lzop's input. The - tells lzop to read from its standard input, and the > is a shell command that redirects lzop's output to a file named backup.tar.lzo.
You can restore with:
$ lzop -dc backup.tar.lzo | tar x
The d and c options tell lzop to decompress and write to standard output, respectively. tar's x option tells it to extract the original files from the archive.
Although lzop is impressive, you can get even higher compression ratios—much higher! Here's how. Combine a little-known data compression tool named lzma with tar to increase storage space effectively by 400%. Here's how you would use it to back up:
$ tar c a/dir | lzma -x -s26 > backup.tar.lzma
lzma's -x option tells it to compress more, and its -s option tells it how big of a dictionary to use.
You can restore with:
$ cat backup.tar.lzma | lzma -d | tar x
The -d option tells lzma to decompress. You need patience to increase storage by 400%; lzma takes about 40 times as long as gzip. In other words, that one-hour gzip backup might take all day with lzma.
This version of lzma is the hardest compressor to find. Make sure you get the one that acts as a filter. See Resources for its two locations.
The data compression tool with the best trade-off between speed and compression ratio is rzip. With compression level 0, rzip finishes about 400% faster than gzip and compacts data 70% more. rzip accomplishes this feat by using more working memory. Whereas gzip uses only 32 kilobytes of working memory during compression, rzip can use up to 900 megabytes, but that's okay because memory is getting cheaper and cheaper.
Here's the big but: rzip doesn't work as a filter—yet. Unless your data already is in one file, you temporarily need some extra disk space for a tar archive. If you want a good project to work on that would shake up the Linux world, enhance rzip to work as a filter. Until then, rzip is a particularly good option for squeezing a lot of data onto CDs or DVDs, because it performs well and you can use your hard drive for the temporary tar file.
Here's how to back up with rzip:
$ tar cf dir.tar a/dir $ rzip -0 dir.tar
The -0 option says to use compression level 0. Unless you use rzip's -k option, it automatically deletes the input file, which in this case is the tar archive. Make sure you use -k if you want to keep the original file.
rzipped tar archives can be restored with:
$ rzip -d dir.tar.rz $ tar xf dir.tar
rzip's default compression level is another top performer. It can increase your effective disk space by 375% but in only about a fifth of the time lzma can take. Using it is almost exactly the same as the example above; simply omit compression level -0.
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
|Trying to Tame the Tablet||May 08, 2013|
|Dart: a New Web Programming Experience||May 07, 2013|
- RSS Feeds
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- Readers' Choice Awards
- What's the tweeting protocol?
- Developer Poll
- New Products
- Reply to comment | Linux Journal
2 hours 53 min ago
- play with linux? i think you mean work-around linux
11 hours 20 min ago
- Where is Epistle?
11 hours 25 min ago
- You forgot OwnCloud
11 hours 55 min ago
- aplikasi free
15 hours 9 min ago
- Having a framework
15 hours 13 min ago
- Fix my computer
15 hours 53 min ago
20 hours 18 sec ago
- Missed one
20 hours 19 min ago
- web Host
20 hours 28 min ago
Enter to Win an Adafruit Prototyping Pi Plate Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Prototyping Pi Plate Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- Next winner announced on 5-21-13!
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.