Compression Tools Compared
Data compression also can speed up network transfers. How much depends on how fast your CPU and network are. Slow networks with fast CPUs can be sped up the most by thoroughly compressing the data. Conversely, slow CPUs with fast connections do best with no compression.
Find the best compressor and compression level for your hardware in the graph shown in Figure 6. This graph's CPU and network speed axes are scaled logarithmically too. Look where your CPU and network speeds intersect in the graph, and try the data compression tool and compression level at that point. It also should give you a sense of how much your bandwidth may increase.
Network Transfer Estimates
To find the best compressors for various CPU and network speeds, I considered how long it takes to compress data, send it and decompress it. I projected how long compression and decompression should take on computers of various speeds by simply scaling actual test results from my 1.7GHz CPU. For example, a 3.4GHz CPU should compress data about twice as fast. Likewise, I estimated transfer times by dividing the size of the compressed data by the network's real speed.
The overall transfer time for non-filtering data compression tools, such as rzip, simply should be about the sum of the estimated times to compress, send and decompress the data.
However, compressors that can act as filters, such as gzip, have an advantage. They simultaneously can compress, transfer and decompress. I assumed their overall transfer times are dominated by the slowest of the three steps. I verified some estimates by timing real transfers.
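The estimating rules in this sidebar can be sketched as two small shell functions. This is only an illustration under stated assumptions: the names estimate_transfer and scale_time are hypothetical, times are in seconds, and the sample numbers are invented for the example rather than measured.

```shell
# estimate_transfer MODE COMPRESS SEND DECOMPRESS
# MODE is "filter" (the steps overlap, so the slowest one dominates) or
# "file" (the steps run one after another, so their times add up).
# All times are in seconds. Hypothetical helper, for illustration only.
estimate_transfer() {
    awk -v mode="$1" -v c="$2" -v s="$3" -v d="$4" 'BEGIN {
        c += 0; s += 0; d += 0          # force numeric comparison
        if (mode == "filter") {
            t = c
            if (s > t) t = s
            if (d > t) t = d
        } else {
            t = c + s + d
        }
        printf "%.1f\n", t
    }'
}

# scale_time SECONDS GHZ
# Scale a time measured on a 1.7GHz CPU to another clock speed,
# assuming time is inversely proportional to clock speed.
scale_time() {
    awk -v t="$1" -v ghz="$2" 'BEGIN { printf "%.1f\n", t * 1.7 / ghz }'
}

estimate_transfer filter 40 100 10   # slowest step: 100.0
estimate_transfer file   40 100 10   # sum of steps: 150.0
scale_time 40 3.4                    # twice the clock speed: 20.0
```

With the same three step times, the filter estimate is lower than the sum, which is the advantage described above for tools like gzip.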
For example, if you have a 56Kbps dial-up modem and a 3GHz CPU, their speeds intersect in the light-yellow region labeled lzma 26 at the top of the graph. This corresponds to using lzma with a dictionary size of 2^26. The graph predicts a 430% increase in effective bandwidth.
On the other hand, if you have a 1Gbps network, but only a 100MHz CPU, it should be faster simply to send the raw uncompressed data. This is depicted in the flat black region at the bottom of the graph.
Don't assume, however, that lzma always gives the biggest performance gain. The best compression tool for data transfers depends on the ratio of your particular CPU's speed to your particular network's speed.
If the sending and receiving computers have different CPU speeds, try looking up the sending computer's speed in the graph, because compression can be much more CPU-intensive than decompression. Check whether the data compression tool and scp are installed on both computers. In the commands that follow, remember to replace email@example.com and file with the real names.
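One way to check for the tools is with the shell's command -v built-in. The following is a minimal sketch; the function name missing_tools is hypothetical, and the list of tools should match whichever compressor you plan to use.

```shell
# missing_tools TOOL...
# Prints the name of each tool not found in PATH and returns nonzero
# if any are missing. Hypothetical helper, for illustration only.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Check the local machine. Any names printed are the tools to install.
missing_tools gzip ssh scp || echo "install the tools listed above first"
```

The same check needs to be run on the remote computer too, for example from a login shell there.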
For the fastest CPUs and/or slowest network connections that fall in the graph's light-yellow region, speed up your network transfers like this:
$ cat file \
  | lzma -x -s26 \
  | ssh email@example.com "lzma -d > file"
ssh stands for secure shell. It's a safe way to execute commands on remote computers. Sending the data through lzma this way may speed up your network transfer by more than 400%.
For fast CPUs and/or slow networks that fall into the graph's dark-yellow zone, use rzip with a compression level of one. Because rzip doesn't work as a filter, you need temporary space for the compressed file on the originating box:
$ rzip -1 -k file
$ scp file.rz email@example.com:
$ ssh email@example.com "rzip -d file.rz"
The -1 tells rzip to use compression level 1, and the -k tells it to keep its input file. Remember to use a : at the end of the scp command.
rzipped network transfers can run at about 375% of the uncompressed speed. A one-hour transfer might finish in only 16 minutes!
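To turn a speedup figure into a transfer time, divide the original time by the speedup factor. The sketch below reads the 375% figure as 3.75 times the uncompressed rate, which is the reading that matches the 16-minute estimate. The function name minutes_after_speedup is hypothetical.

```shell
# minutes_after_speedup MINUTES FACTOR
# Divide the uncompressed transfer time by the effective-bandwidth factor
# and round to the nearest minute.
minutes_after_speedup() {
    awk -v m="$1" -v f="$2" 'BEGIN { printf "%.0f\n", m / f }'
}

minutes_after_speedup 60 3.75   # prints 16
```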
For slightly slower CPUs and/or faster networks that fall in the graph's orange region, try using gzip with compression level 1. Here's how:
$ gzip -1c file | ssh email@example.com "gzip -d > file"
It might double your effective bandwidth. -1c tells gzip to use compression level 1 and write to standard output, and -d tells it to decompress.
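Before sending real data, it can be worth checking the compress-and-decompress pipeline locally, with no network in the way. This is a minimal sketch, assuming gzip, cmp and mktemp are installed; the function name roundtrip_ok is hypothetical, and the sample file is a throwaway.

```shell
# roundtrip_ok FILE
# Compress FILE at level 1, decompress the stream immediately, and
# compare the result byte-for-byte with the original.
# Returns 0 only if the two are identical.
roundtrip_ok() {
    gzip -1c "$1" | gzip -d | cmp -s - "$1"
}

# Try it on a throwaway file.
tmp=$(mktemp)
printf 'hello hello hello hello\n' > "$tmp"
roundtrip_ok "$tmp" && echo "round trip OK"
rm -f "$tmp"
```

Swapping ssh back into the middle of the pipe gives the real transfer command shown above.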
For fast network connections and slow CPUs falling in the graph's blue region, quickly compress a little with lzop at compression level 1:
$ lzop -1c file | ssh email@example.com "lzop -d > file"