cpio

In this month's column, Eric moves beyond find to cover duplicating files and directory trees using the versatile cpio command. cpio uses space on tape more efficiently than tar and is an excellent alternative for creating archives on platforms that do not have the GNU utilities available. Read on for a thorough discussion of cpio and its three modes of operation: Pass-through, Create and Extract.
Summary

The cpio command may seem cryptic at first glance, but after you use it a few times, it will become an indispensable addition to your Linux toolkit. Especially if you are one of the many users with no tape drive and no commercial backup utility, learning cpio and swapping floppies sure beats the (non-existent) alternative after a disk crashes or you make a mistake with the rm command...

Eric Goebelbecker (eric@interramp.com) is a systems analyst for Reuters America, Inc. He supports clients (mostly financial institutions) who use market data retrieval and manipulation APIs in trading rooms and back office operations. In his spare time (about 15 minutes a week...), he reads about philosophy and hacks around with Linux.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Useful article

Anonymous's picture

Just discovered this article while trying to diagnose some cpio problems... no surprises, it turned out to be the non-writable directories issue discussed...

Meaning of -depth option backwards, use -print0 and -0 options!

John Keith Hohm's picture

The -depth option to find ensures that directory names are output after the names of the files in them, not before. In combination with the --make-directories (or just -d) and --preserve-modification-times (or just -p) options to cpio, this results in cpio preserving the original modification time of both files and directories.

This works because cpio will create a directory automatically while writing the files inside it; only after it is done writing all the directory contents does it visit the directory itself to set its attributes, which includes resetting the modification time.

You are missing a couple other important options, though: the -print0 option to find and the --null (or just -0) option to cpio cause find and cpio to write and read the list of filenames terminated by a null character instead of a newline. Since most Linux filesystems allow names to contain nulls, this is important to properly archive such files and avoids doing something very bad with a file named like:

byebye<newline>/etc/passwd

So the full recommended command is:

$ find . -depth -print0 | cpio -pdm0v dest_dir

Webcast
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers

Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.

Learn More

Sponsored by AMD

White Paper
Red Hat White Paper: Using an Open Source Framework to Catch the Bad Guy

Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6

Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.

Learn more about catching the bad guy in this free white paper.

Learn More

Sponsored by DLT Solutions