In this month's column, Eric moves beyond find to cover duplicating files and directory trees using the versatile cpio command. cpio uses space on tape more efficiently than tar and is an excellent alternative for creating archives on platforms that do not have the GNU utilities available. Read on for a thorough discussion of cpio and its three modes of operation: Pass-through, Create and Extract.

The cpio command may seem cryptic at first glance, but after you use it a few times, it will become an indispensable addition to your Linux toolkit. Especially if you are one of the many users with no tape drive and no commercial backup utility, learning cpio and swapping floppies sure beats the (non-existent) alternative after a disk crashes or you make a mistake with the rm command...

Eric Goebelbecker (eric@interramp.com) is a systems analyst for Reuters America, Inc. He supports clients (mostly financial institutions) who use market data retrieval and manipulation APIs in trading rooms and back office operations. In his spare time (about 15 minutes a week...), he reads about philosophy and hacks around with Linux.




Useful article

Anonymous

Just discovered this article while trying to diagnose some cpio problems... no surprises, it turned out to be the non-writable directories issue discussed...

Meaning of -depth option backwards, use -print0 and -0 options!

John Keith Hohm

The -depth option to find ensures that directory names are output after the names of the files in them, not before. In combination with the --make-directories (or just -d) and --preserve-modification-time (or just -m) options to cpio, this results in cpio preserving the original modification time of both files and directories.

This works because cpio will create a directory automatically while writing the files inside it; only after it is done writing all the directory contents does it visit the directory itself to set its attributes, which includes resetting the modification time.
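The ordering that -depth produces can be checked without running cpio at all. The following is a minimal sketch, assuming a scratch directory created with mktemp; the names `demo`, `sub` and `file` are made up for illustration and do not come from the article.

```shell
#!/bin/sh
# Hypothetical scratch tree: one directory containing one file.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/file"

# Default traversal: the directory is listed before its contents.
plain=$(cd "$demo" && find sub -print | tr '\n' ' ')

# With -depth: the contents are listed first, the directory last.
depth=$(cd "$demo" && find sub -depth -print | tr '\n' ' ')

echo "default: $plain"
echo "-depth:  $depth"

rm -rf "$demo"
```

With -depth, the directory entry reaches cpio only after everything inside it, which is the point at which its attributes can be set without being disturbed by later writes.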

You are missing a couple of other important options, though: the -print0 option to find and the --null (or just -0) option to cpio cause find and cpio to write and read the list of filenames terminated by a null character instead of a newline. Since most Linux filesystems allow names to contain newlines (but never nulls), this is important to properly archive such files, and it avoids doing something very bad with a file whose name contains a newline.


So the full recommended command is:

$ find . -depth -print0 | cpio -pdm0v dest_dir
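To see why the null-terminated list matters, here is a small sketch that uses only find, tr and wc. It assumes a scratch directory from mktemp and a made-up file name containing an embedded newline; neither appears in the original comment. It counts how many entries a reader would see under each delimiter.

```shell
#!/bin/sh
# Hypothetical scratch directory holding a single file whose
# name contains an embedded newline.
demo=$(mktemp -d)
touch "$demo/part1
part2"

# Newline-delimited list: a reader that splits on newlines
# sees two entries for the one file.
newline_count=$(cd "$demo" && find . -type f -print | wc -l | tr -d ' ')

# NUL-delimited list: exactly one terminator per file name.
nul_count=$(cd "$demo" && find . -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')

echo "newline-delimited entries: $newline_count"
echo "NUL-delimited entries:     $nul_count"

rm -rf "$demo"
```

A reader splitting on newlines would treat the two halves of the name as two separate paths, neither of which exists. A reader splitting on nulls sees the single real name.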
