Archiving and Compression
Expanding a Zip archive isn't hard at all. To create a zipped archive, use the zip command; to expand that archive, use the unzip command.
$ unzip moby.zip
Archive:  moby.zip
  inflating: job.txt
  inflating: moby-dick.txt
  inflating: paradise_lost.txt
The unzip command helpfully tells you what it's doing as it works. To get even more information, add the -v option (which stands, of course, for verbose).
$ unzip -v moby.zip
Archive:  moby.zip
 Length   Method    Size  Ratio   CRC-32    Name
 -------  ------  ------  -----   ------    ----
  102519  Defl:X   35747   65%   fabf86c9   job.txt
 1236574  Defl:X  487553   61%   34a8cc3a   moby-dick.txt
  508925  Defl:X  224004   56%   6abe1d0f   paradise_lost.t
 -------          ------   ---              -------
 1848018          747304   60%              3 files
There's quite a bit of useful data here, including the method used to compress each file, a ratio column showing how much each file shrank (job.txt, for instance, lost 65% of its original size), and the cyclic redundancy check (CRC) value used for error detection.
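The percentages in the Ratio column can be reproduced from the Length and Size columns. The sketch below assumes the figure is the share of space saved, rounded to a whole percent, which matches every row in the listing above; saved_pct is a hypothetical helper name, not part of unzip.

```shell
#!/bin/sh
# Hypothetical helper (not part of unzip): percentage of space saved,
# given the original size and the compressed size in bytes.
saved_pct() {
    awk -v orig="$1" -v comp="$2" \
        'BEGIN { printf "%.0f%%\n", (1 - comp / orig) * 100 }'
}

saved_pct 102519 35747      # job.txt            -> 65%
saved_pct 1236574 487553    # moby-dick.txt      -> 61%
saved_pct 1848018 747304    # archive as a whole -> 60%
```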
Sometimes you might find yourself looking at a Zip file and not remembering what's in it. Or perhaps you want to make sure that a file you need is contained within that Zip file. To list the contents of a Zip file without unzipping it, use the -l option (which stands for "list").
$ unzip -l moby.zip
Archive:  moby.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  01-26-06 18:40   bible/
   207254  01-26-06 18:40   bible/genesis.txt
   102519  01-26-06 18:19   bible/job.txt
  1236574  01-26-06 18:19   moby-dick.txt
   508925  01-26-06 18:19   paradise_lost.txt
 --------                   -------
  2055272                   5 files
From these results, you can see that moby.zip contains two files (moby-dick.txt and paradise_lost.txt) and a directory (bible), which itself contains two files, genesis.txt and job.txt. Now you know exactly what will happen when you expand moby.zip. Using the -l option helps you avoid unzipping an archive that spews 100 loose files into your current directory when you expected a single directory containing those 100 files. The first leaves you with files strewn pell-mell, while the second is far easier to handle.
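If you'd rather script these checks than read the listing by eye, something like the following sketch might do. It assumes the Info-ZIP version of unzip, whose -Z1 option prints only the stored names, one per line; zip_top_level and zip_contains are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers; both assume Info-ZIP's unzip, where -Z1 switches to
# zipinfo mode and prints just the stored names, one per line.

# Print the distinct top-level entries of an archive. More than one line of
# output means the archive will scatter items into the current directory.
zip_top_level() {
    unzip -Z1 "$1" | cut -d/ -f1 | sort -u
}

# Succeed (exit status 0) only if the archive holds exactly the given path.
zip_contains() {
    unzip -Z1 "$1" | grep -Fxq -- "$2"
}

# Usage:
#   zip_top_level moby.zip
#   zip_contains moby.zip bible/job.txt && echo "found it"
```

For the listing above, zip_top_level would report three entries (bible, moby-dick.txt, and paradise_lost.txt). When an archive has several top-level entries, unzip's -d option lets you name a directory to extract into, as in unzip moby.zip -d moby.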
Sometimes zipped archives become corrupted. The worst time to discover this is after you've unzipped the archive and deleted it, only to find that some or even all of the unzipped contents are damaged and won't open. It's better to test the archive before you unzip it, using the -t (for test) option.
$ unzip -t moby.zip
Archive:  moby.zip
    testing: bible/                   OK
    testing: bible/genesis.txt        OK
    testing: bible/job.txt            OK
    testing: moby-dick.txt            OK
    testing: paradise_lost.txt        OK
No errors detected in compressed data of moby.zip.
You really should use -t every time you work with a zipped file. It's the smart thing to do, and although it might take some extra time, it's worth it in the end.
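One way to make the habit automatic is to wrap the test and the extraction together. This is only a sketch: safe_unzip is a hypothetical name, and it relies on unzip returning a nonzero exit status when the test fails. The -q option quiets the per-file output, and -d names the directory to extract into.

```shell
#!/bin/sh
# Hypothetical helper: extract a Zip archive only if it passes unzip -t.
#   $1 = archive to extract
#   $2 = directory to extract into
safe_unzip() {
    # -t checks every member against its stored CRC; with -q only the
    # one-line summary is printed.
    if unzip -tq "$1"; then
        mkdir -p "$2" && unzip -q "$1" -d "$2"
    else
        echo "refusing to extract: $1 failed the integrity test" >&2
        return 1
    fi
}

# Usage:
#   safe_unzip moby.zip moby
```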
Using gzip is a bit easier than zip in some ways. With zip, you need to specify the name of the newly created Zip file or zip won't work; with gzip, though, you can just type the command and the name of the file you want to compress.
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
$ gzip paradise_lost.txt
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
You should be aware of a very big difference between zip and gzip: When you zip a file, zip leaves the original behind so you have both the original and the newly zipped file, but when you gzip a file, you're left with only the new gzipped file. The original is gone.
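If you want gzip to leave the original file behind, you can use the -c (or --stdout or --to-stdout) option, which writes the compressed data to standard output instead of replacing the file; you then redirect that output to a new file. (gzip 1.6 and later also accept -k, or --keep, which keeps the original without any redirection.) If you use -c and forget to redirect the output, you can end up with compressed binary data dumped into your terminal as a screenful of nonsense.
Not good. Instead, output to a file.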
$ ls -l
-rw-r--r-- 1 scott scott 508925 paradise_lost.txt
$ gzip -c paradise_lost.txt > paradise_lost.txt.gz
$ ls -l
-rw-r--r-- 1 scott scott 497K paradise_lost.txt
-rw-r--r-- 1 scott scott 220K paradise_lost.txt.gz
Much better! Now you have both your original file and the gzipped version.
Tip: If you accidentally use the -c option without redirecting the output to a file, press Ctrl+C until gzip stops.
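If you compress with -c often, you might wrap the redirect in a small function that also double-checks the result. This is a minimal sketch; gzip_keep is a hypothetical name. It uses gzip -t to test the new file and gzip -dc to decompress it to standard output for a byte-for-byte comparison with the original.

```shell
#!/bin/sh
# Hypothetical helper: gzip a file, keep the original, and verify the copy.
#   $1 = file to compress; the result is written to "$1.gz"
gzip_keep() {
    # -c sends the compressed data to standard output, so the original
    # file is left untouched.
    gzip -c "$1" > "$1.gz" || return 1
    # -t tests the integrity of the new file; -dc decompresses it to
    # standard output, which cmp (reading "-") compares with the original.
    gzip -t "$1.gz" && gzip -dc "$1.gz" | cmp -s - "$1"
}

# Usage:
#   gzip_keep paradise_lost.txt && echo "compressed copy verified"
```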