unzip
As much as we all love Linux, it is nevertheless true that occasionally we must force ourselves to deal with the DOS/MS-Windows world, however indirectly. For some of us that involves having a dual-boot system (perhaps via LILO—the LInux LOader—or OS/2's Boot Manager), but even those of us who manage to avoid that fate will sooner or later come across files that originated on some flavor of DOS or Windows system. More than likely, a few of those files will end in .zip—and that's where the unzip command comes in.
unzip is a free utility to process zipfiles, as these things are generally called. Zipfiles are actually archives of one or more other files, almost always compressed to save disk space and/or transmission time. In this regard they are similar to compressed tar archives, which are those files usually ending in .tar.Z, .tar.gz or .tgz that one finds on most Linux FTP sites and many CD-ROM distributions. One major difference between zipfiles and tar archives: compressed tar archives bundle all of the files together and then compress the result as a single entity; zipfiles compress individual files, then store them in the archive. The zipfile method usually doesn't achieve quite as much overall compression, but it does allow you to list the archive's contents and to extract individual files without decompressing the whole mess.
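The size trade-off between the two approaches can be sketched with standard tools. The example below is only an illustration under some assumptions: it uses gzip as a stand-in for the per-file compression a zipfile would apply, and the ten file names and their contents are made up for the demonstration.

```shell
# Scratch directory for the made-up sample files.
workdir=$(mktemp -d)

# Ten hypothetical files with identical, moderately compressible contents.
for n in 1 2 3 4 5 6 7 8 9 10; do
    awk 'BEGIN { for (i = 1; i <= 500; i++) print i }' > "$workdir/file$n.txt"
done

# Zipfile style: compress each file on its own, then add up the sizes.
separately=0
for f in "$workdir"/file*.txt; do
    size=$(gzip -c "$f" | wc -c)
    separately=$((separately + size))
done

# Compressed-tar style: bundle everything first, then compress once.
together=$( (cd "$workdir" && tar cf - file*.txt) | gzip -c | wc -c )
together=$((together))

echo "compressed one by one:    $separately bytes"
echo "bundled, then compressed: $together bytes"

rm -rf "$workdir"
```

Because the bundled stream lets the compressor reuse what it has already seen in earlier files, the second figure should come out smaller here. The price is that nothing can be listed or extracted without decompressing the stream first.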
How does one actually use unzip to list an archive's contents? The simplest way is with the -l option (for “list”):
$ unzip -l quake92p.zip
Archive:  quake92p.zip
 Length    Date    Time    Name
 ------    ----    ----    ----
  36064  06-25-96  13:18   DEICE.EXE
 369135  06-27-96  03:51   QUAKE92P.1
   2618  06-27-96  03:34   README.TXT
    177  06-25-96  20:07   INSTALL.BAT
    206  06-27-96  03:54   QUAKE92P.DAT
 ------                    -------
 408200                    5 files
You have each file's name (on the right), its uncompressed size, and the date and time of its last modification. For many of us, however, especially those long steeped in the terse intricacies of ls, this is a little too short and sweet. For fans of ls, or for anyone wishing to know more about the details of the archive, unzip has an entire mode devoted to listing both useful and obscure zipfile information: zipinfo mode, triggered via the -Z option. (On some systems the zipinfo command exists as a link to unzip and is synonymous with unzip -Z, but this is not true of Slackware distributions as of this writing.) We'll limit ourselves to a description of the default zipinfo listing format:
$ unzip -Z quake92p.zip
Archive:  quake92p.zip   406075 bytes   5 files
-rwxa--  2.0 fat    36064 b- defN 25-Jun-96 13:18 DEICE.EXE
-rw-a--  2.0 fat   369135 b- stor 27-Jun-96 03:51 QUAKE92P.1
-rw-a--  2.0 fat     2618 t- defN 27-Jun-96 03:34 README.TXT
-rwxa--  2.0 fat      177 t- defN 25-Jun-96 20:07 INSTALL.BAT
-rw-a--  2.0 fat      206 t- defN 27-Jun-96 03:54 QUAKE92P.DAT
5 files, 408200 bytes uncompressed, 405569 bytes compressed:  0.6%
You will immediately recognize a certain resemblance to the output of ls -l. The header line gives the archive name, its total size, and the total number of files in it; the trailer gives the number of files listed (in this case all of them), the total uncompressed and compressed data size of the listed files (not counting internal zipfile headers), and the compression ratio. Here the ratio is quite poor, mostly due to the fact that the largest file (QUAKE92P.1) is stored without any compression. In the leftmost column are the file permissions. The next column indicates the version of the archiver, and the one after that is what tells us the files came from the FAT (DOS) file system. Next are the uncompressed file size and a column indicating which files are most likely to be binary and which are probably text. The next three columns note the compression method used on each file; the time stamps; and the full file names.
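The figures in the trailer can be checked by hand. The sketch below pastes the five entry lines from the listing into a shell variable and uses awk to add up the size column, pick out the entry whose method is "stor", and recompute the ratio. It assumes the ratio is simply the space saved as a percentage of the uncompressed total, and that no file name contains a space; the variable names are made up for the illustration.

```shell
# Entry lines copied from the zipinfo listing of quake92p.zip.
listing='-rwxa-- 2.0 fat 36064 b- defN 25-Jun-96 13:18 DEICE.EXE
-rw-a-- 2.0 fat 369135 b- stor 27-Jun-96 03:51 QUAKE92P.1
-rw-a-- 2.0 fat 2618 t- defN 27-Jun-96 03:34 README.TXT
-rwxa-- 2.0 fat 177 t- defN 25-Jun-96 20:07 INSTALL.BAT
-rw-a-- 2.0 fat 206 t- defN 27-Jun-96 03:54 QUAKE92P.DAT'

# Field 4 is the uncompressed size; add it up across all entries.
total=$(printf '%s\n' "$listing" | awk '{ sum += $4 } END { print sum }')

# Field 6 is the compression method; "stor" means stored, not compressed.
stored=$(printf '%s\n' "$listing" | awk '$6 == "stor" { print $9 }')

# Compressed total as reported in the trailer line.
compressed=405569

# Space saved, as a percentage of the uncompressed total.
ratio=$(awk -v u="$total" -v c="$compressed" \
    'BEGIN { printf "%.1f", (u - c) * 100 / u }')

echo "uncompressed total: $total bytes"
echo "stored as-is:       $stored"
echo "compression ratio:  $ratio%"
```

The sum matches the 408200 bytes in the trailer, and the ratio rounds to the same 0.6%.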
Now that we know what files we have, how do we actually get the files out? File extraction is as simple as typing unzip and the file name:
$ unzip quake92p
Archive:  quake92p.zip
  inflating: DEICE.EXE
 extracting: QUAKE92P.1
  inflating: README.TXT
  inflating: INSTALL.BAT
  inflating: QUAKE92P.DAT
Here we've omitted the .zip suffix; unzip first looks for the file quake92p and, not finding it, checks for quake92p.zip instead. What if we wanted only the README.TXT file? No problem. Anything (well, almost anything) after the zipfile name is taken to be the name of one of the enclosed files:
$ unzip quake92p README.TXT
Archive:  quake92p.zip
  inflating: README.TXT
Here you may notice a little snag. If you now edit this file in Linux with an editor like vi, you'll see what looks like ^M at the end of each and every line. Or, if you view the file with a pager like more, you'll discover that any line uncovered by the --More-- prompt gets erased immediately. These problems are due to the fact that DOS and its successors store text files with two end-of-line characters, CR and LF (a.k.a. carriage return and linefeed, respectively, or ^M and ^J, or CTRL-M and CTRL-J), rather than the more efficient single character (LF) used on all Unix systems. So when a Unix utility—like an editor or a pager or a compiler—looks at a DOS text file, it may behave a little oddly or die altogether.
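To see the extra character for yourself, the following sketch writes the same two lines with DOS and with Unix line endings and counts the carriage returns in each. The file names and contents are invented for the demonstration; nothing here touches the real README.TXT.

```shell
# Hypothetical sample files: the same two lines, DOS and Unix style.
sample_dir=$(mktemp -d)
printf 'first line\r\nsecond line\r\n' > "$sample_dir/dos.txt"
printf 'first line\nsecond line\n'     > "$sample_dir/unix.txt"

# Count carriage returns by deleting every byte that is not a CR.
dos_cr=$(tr -cd '\r' < "$sample_dir/dos.txt" | wc -c)
unix_cr=$(tr -cd '\r' < "$sample_dir/unix.txt" | wc -c)
dos_cr=$((dos_cr))      # drop any padding some versions of wc add
unix_cr=$((unix_cr))

echo "CR characters in dos.txt:  $dos_cr"
echo "CR characters in unix.txt: $unix_cr"

# od -c dumps every byte, so the \r before each \n in dos.txt is visible.
od -c "$sample_dir/dos.txt"

rm -rf "$sample_dir"
```

Each DOS line carries one CR that the Unix version lacks, and that byte is what vi displays as ^M.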
Fortunately there's a simple solution: unzip's -a option. Originally a mnemonic for ASCII conversion, the option these days is used for all sorts of text-file conversions. As a single-letter option it does its best to automatically convert files that are supposedly text, while leaving alone those that are marked binary. Be careful! zip and PKZIP don't always guess correctly when creating the archive, particularly for certain classes of MS-Windows files, and unzip's “text” conversions are almost always irreversible. In other words, don't extract with auto-conversion and then delete the original zipfile without first making sure everything is okay. unzip does indicate which files it thinks are text when auto-converting, however:
$ unzip -a quake92p
Archive:  quake92p.zip
  inflating: DEICE.EXE      [binary]
 extracting: QUAKE92P.1     [binary]
  inflating: README.TXT     [text]
  inflating: INSTALL.BAT    [text]
  inflating: QUAKE92P.DAT   [text]
In this case everything worked as intended. If, for some reason, zip marked a text file as binary and you want to force text conversion, simply double the option: -aa.
But wait, there's more! The discriminating Linux user, happily accustomed to a file system that not only preserves the case of file names but also distinguishes between names differing only in case, is not going to settle for a bunch of all-uppercase DOS file names in his or her directories. Enter the -L option. If (and only if) the file came from a single-case file system like DOS FAT or VMS, unzip -L will convert its name to lowercase upon extraction, thusly:
$ unzip -aL quake92p
Archive:  quake92p.zip
  inflating: deice.exe      [binary]
 extracting: quake92p.1     [binary]
  inflating: readme.txt     [text]
  inflating: install.bat    [text]
  inflating: quake92p.dat   [text]
Isn't that nice?