Archiving and Compression
-r
If you want to use gzip on several files in a directory, just use a wildcard. You might not end up gzipping everything you think you will, however, as this example shows.
$ ls -F
bible/ moby-dick.txt paradise_lost.txt
$ ls -l *
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott 508925 paradise_lost.txt
bible:
-rw-r--r-- scott scott 207254 genesis.txt
-rw-r--r-- scott scott 102519 job.txt
$ gzip *
gzip: bible is a directory -- ignored
$ ls -l *
-rw-r--r-- scott scott 489609 moby-dick.txt.gz
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
bible:
-rw-r--r-- scott scott 207254 genesis.txt
-rw-r--r-- scott scott 102519 job.txt
Notice that the wildcard didn't do anything for the files inside the bible directory because gzip by default doesn't walk down into subdirectories. To get that behavior, you need to use the -r (or --recursive) option along with your wildcard.
$ ls -F
bible/ moby-dick.txt paradise_lost.txt
$ ls -l *
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott 508925 paradise_lost.txt
bible:
-rw-r--r-- scott scott 207254 genesis.txt
-rw-r--r-- scott scott 102519 job.txt
$ gzip -r *
$ ls -l *
-rw-r--r-- scott scott 489609 moby-dick.txt.gz
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
bible:
-rw-r--r-- scott scott 62114 genesis.txt.gz
-rw-r--r-- scott scott 35984 job.txt.gz
This time, every file, even those in subdirectories, was gzipped. Note, however, that each file is gzipped individually. Unlike the zip command, gzip cannot combine all the files into one big file. To do that, you need to incorporate tar, as you'll see in "Archive and Compress Files with tar and gzip."
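To try the recursive behavior without touching real files, here is a minimal sketch that builds a throwaway directory tree and runs gzip -r on it. The scratch directory and the tiny sample files are stand-ins invented for this illustration; they are not the files from the listings above.

```shell
#!/bin/sh
# Build a scratch tree (hypothetical stand-ins for the example files).
workdir=$(mktemp -d)
mkdir "$workdir/bible"
printf 'Call me Ishmael.\n' > "$workdir/moby-dick.txt"
printf 'In the beginning\n' > "$workdir/bible/genesis.txt"

# With -r, gzip descends into bible/ and compresses each file individually.
gzip -r "$workdir"

# List what is left: every regular file should now end in .gz.
find "$workdir" -type f | sort
```

The listing should show one .gz file per original file, which confirms that -r compresses files in place rather than bundling them into a single archive.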
-[1-9]
Just as with zip, it's possible to adjust the level of compression that gzip uses when it does its job. The gzip command uses a scale from 1 to 9, in which 1 (or --fast) means "do the job quickly, but don't bother compressing very much," and 9 (or --best) means "compress the heck out of the files, and I don't mind waiting a bit longer to get the job done." Unlike zip, GNU gzip has no level 0 for "no compression at all." The default is 6, but modern computers are fast enough that it's probably just fine to use 9 all the time.
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
$ gzip -c -1 moby-dick.txt > moby-dick.txt.gz
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott 571005 moby-dick.txt.gz
$ gzip -c -9 moby-dick.txt > moby-dick.txt.gz
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott 487585 moby-dick.txt.gz
Remember to use the -c option and redirect the output into the actual .gz file; otherwise, gzip replaces the original file, as discussed in "Archive and Compress Files Using gzip."
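If you only want to compare levels, you don't need to create any .gz files at all. The sketch below sends the compressed stream to wc -c and prints the byte counts. The generated sample file is an assumption made for this illustration (a stand-in for moby-dick.txt), so the numbers will differ from the listing above.

```shell
#!/bin/sh
# Generate a compressible sample file (hypothetical stand-in for moby-dick.txt).
sample=$(mktemp)
i=0
while [ "$i" -lt 2000 ]; do
    echo "Call me Ishmael. Some years ago, never mind how long precisely, line $i"
    i=$((i + 1))
done > "$sample"

# -c writes to standard output, so the original stays untouched and
# wc -c can count the compressed bytes without creating a .gz file.
original=$(wc -c < "$sample")
fast=$(gzip -c -1 "$sample" | wc -c)
best=$(gzip -c -9 "$sample" | wc -c)

echo "original: $original bytes"
echo "gzip -1:  $fast bytes"
echo "gzip -9:  $best bytes"
```

On ordinary text, -9 typically produces a somewhat smaller result than -1, but the size of the gap depends on the data.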
Note - If you want to be clever, define an alias in your .bashrc file that looks like this:
alias gzip='gzip -9'
That way, you'll always use -9 and won't have to think about it.
gunzip
Getting files out of a gzipped archive is easy with the gunzip command.
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
$ gunzip paradise_lost.txt.gz
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
In the same way that gzip removes the original file, leaving you solely with the gzipped result, gunzip removes the .gz file, leaving you with the final gunzipped result. If you want to ensure that you have both, you need to use the -c option (or --stdout or --to-stdout) and redirect the results to the file you want to create.
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
$ gunzip -c paradise_lost.txt.gz > paradise_lost.txt
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
It's probably a good idea to use -c, especially if you plan to keep the .gz file or pass it along to someone else. Sure, you could run gzip again and re-create the archive yourself, but why go to the extra work?
Note - If you don't like the gunzip command, you can also use gzip -d (or --decompress or --uncompress).
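To check that nothing is lost along the way, you can compress and decompress a file with -c and compare the result against the original. This is a minimal sketch; the scratch directory and file names are invented for the illustration and stand in for paradise_lost.txt.

```shell
#!/bin/sh
# Scratch files; the names are hypothetical stand-ins for paradise_lost.txt.
workdir=$(mktemp -d)
printf 'Of Mans First Disobedience, and the Fruit\n' > "$workdir/poem.txt"

# Compress while keeping the original: -c sends the result to stdout.
gzip -c "$workdir/poem.txt" > "$workdir/poem.txt.gz"

# Decompress while keeping the .gz: gunzip -c and gzip -d -c are equivalent.
gunzip -c "$workdir/poem.txt.gz" > "$workdir/via-gunzip.txt"
gzip -d -c "$workdir/poem.txt.gz" > "$workdir/via-gzip-d.txt"

# cmp exits 0 when two files are byte-for-byte identical.
cmp "$workdir/poem.txt" "$workdir/via-gunzip.txt" && echo "gunzip -c: round trip OK"
cmp "$workdir/poem.txt" "$workdir/via-gzip-d.txt" && echo "gzip -d -c: round trip OK"
```

Because every step uses -c with a redirect, both the original text and the .gz file are still in the scratch directory afterward.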
Comments
Unzipping Password Protected Zips
You left out how to unzip ZIP files that are password protected in Linux. I'm searching for this elusive bit of information on the internet right now...
Password protectedly adding files by PHP code was not found
I couldn't find anything on the internet about adding files to a password-protected zip from PHP code. Then I came across your article, and it gave me an idea: why not have PHP issue a system command to add the files to the zip and even protect them with a password? ;)
RAR
RAR is good and free too. It supports passwords and can make SFX archives.
No mention of lzma?
How about rzip or lzma? I recall an article in the print edition within the last ten or eleven issues that compared the cpu overhead of each compression method against compression ratios (and possibly other parameters). Anyways, rzip is memory and cpu intensive, IIRC, but has the potential to make enormous savings. I think it's the same as burrows-wheeler over larger data sets, possibly. Worthwhile for stuff that won't be frequently decompressed, IMO.
rzip
actually rzip levels are in search buffer sizes:
-0 = 100MB
-1 = 100MB
-x = x00MB for x>0 and x<=9
CPU intensive? Well, it depends. I hacked the bzip2 compression hooks out of rzip, and it's one of the fastest pre-archiving filters, with the best compression ratio for a MySQL dump of a dbmail database.
Yup, I found a bug, but only in the decompression algorithm, not in the data itself. And yes, I got Andrew to fix it.
Correction to wording
Scott,
In the section "Archive Files with tar", paragraph 3, you state that tar is "designed to compress entire directory structures". I think this should read "designed to archive...", since this section deals only with tar's standalone use as an archival tool and since this article/chapter is intended to highlight the difference between archiving and compressing. Other than that, this is a very handy primer on archiving and compressing in *nix.
bzip2 -9
The article states that the default block size for bzip2 is -6. The man page for my system (Ubuntu 6.06) states that -9 is the default, and I am unaware of any system where -6 is the default.
TROGDOR STRIKES AGAIN!
TROGDOR STRIKES AGAIN!
http://news.bbc.co.uk/1/hi/england/cornwall/6088008.stm
Making -9 the default
An easier way to default to the best (-9) compression level would be to export GZIP='-9' and ZIPOPT='-9' into your environment.