Compression Tools Compared

Use top-performing but little-known lossless data compression tools to increase your storage and bandwidth by up to 400%.

Data compression works so well that popular backup and networking tools have some built in. Linux offers more than a dozen compression tools to choose from, and most of them let you pick a compression level too. To find out which perform best, I benchmarked 87 combinations of tools and levels. Read this article to learn which compressor is a hundred times faster than the others and which ones compress the most.

The most popular data compression tool for Linux is gzip, which lets you choose a compression level from one to nine. One is fast, and nine compresses well. Choosing a good trade-off between speed and compression ratio becomes important when it takes hours to handle gigabytes of data. You can get a sense of what your choices are from the graph shown in Figure 1. The fastest choices are on the left, and the highest compressing ones are on the top. The best all-around performers are presented in the graph's upper left-hand corner.

Figure 1. Increasing the compression level in gzip increases both compression ratio and time required to complete.

But many other data compression tools are available to choose from in Linux. See the comprehensive compression and decompression benchmarks in Figures 2 and 3. As with gzip, the best performers are in the upper left-hand corner, but these charts' time axes are scaled logarithmically to accommodate huge differences in how fast they work.

Figure 2. Performance of Many Utilities, Compression

Figure 3. Performance of Many Utilities, Decompression

Better Backups

The tools that tend to compress more and faster are singled out in the graphs shown in Figures 4 and 5. Use these for backups to disk drives. Remember, their time axes are scaled logarithmically. The red lines show the top-performing ones, and the green lines show the top performers that also can act as filters.

Figure 4. Best Utilities for Backups, Compression

Figure 5. Best Utilities for Backups, Decompression

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

How to improve backups?

Paolo Subiaco's picture

Hi. Congratulations for this very useful article... I use a script for backup made by myself which use tar +gzip... switching to tar - lsop backup time takes less than half time, increasing the backup size by about 25%.
An idea to improve speed is to replace tar with a more intelligent tool.
Infact, tar simply "cat all files to stdout" and then gzip or lsop compress this huge stream of data, but some data is already compressed (images, movies, open document files) and don't need to be recompressed!
The idea is to have an archiver (like tar) which compress each file by itself, storing the original file in case of images, movies, archives, already compressed files.
Is there any tool that can do this, and save all priviledges (owner, group, mode) associated to each file like tar does?
Thank you. Paolo

(1) Found a typo: "On the

Anonymous's picture

(1) Found a typo:
"On the other hand, if you have a 1GHz network, but only a 100MHz CPU"

1 GHz network? Should maybe be 1 Gbps.

(2) Suggestion:
Multi-Core CPUs are the big thing today, compression tools that could utilise multiple cores can run 2, 4 or soon even 8 times faster on "normal" desktop PCs...not even speaking of the servers...which compression tools can utilise this CPU power?

multi-core CPU support

zmi's picture

Multi-Core CPUs are the big thing today, compression tools that could utilise multiple cores can run 2, 4 or soon even 8 times faster on "normal" desktop PCs...not even speaking of the servers...which compression tools can utilise this CPU power?

http://compression.ca/pbzip2/
There's parallel bzip2, very good but not pipe support.

HTH,
mfg zmi

Very nice information

Bharat's picture

Very nice information provided.Thanks!!!

Excelent article.

Eduardo Diaz's picture

Thanks very much for this article. I really enjoyed it, and will be helpfull for my daily work.

1. how about another part

Anonymous's picture

how about another part with specific data - like 90+% text? for mysql dumps & dbmail scenarios etc.

and 45MB does not sound as sufficient test data size for rzip to test it's speed.

Compression on Windows

Werner Bergmans's picture

First of all excellent test!.

Believe it or not, but compression is one of those application types where all research takes place on Windows Pc's. The last couple of years there were some major breakthroughs in compression caused by the new PAQ context modeling algorithms. Have a look at this site for some results. Programs like gzip, rzip 7-zip and lzop are tested here too, so it should be easy to compare results.
http://www.maximumcompression.com/

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix