Hack and / - Lightning Hacks

Instead of one large hack this month, I cover a few of my favorite smaller hacks: managing windows, switching my display to a projector and performing binary diffs on large files.
What's the Difference?

Recently, I was working on a remastered Knoppix DVD that I had sent out to a few people. After I had sent out the full remastered DVD, I found out that I needed to change a few small files on the DVD. Even though my home DSL speeds are pretty fast, the upload is still slow enough that it took overnight to transfer the 3GB+ DVD image. I didn't want to go through that again, especially as I had made only minor changes to the DVD.

I knew that binary diff tools existed, but I discovered that not all of them are equal. Some binary diff tools require enough RAM to store multiple copies of the file, which certainly wouldn't work with a 3GB image. Lucky for me, I found rdiff, a tool that works well with large files and doesn't require a lot of RAM. What's better is that rdiff works with any binary—you can use it for any large binary files from DVD images to virtual disks to multimedia files.

rdiff works via a three-stage process. In this example, I have two files, old.iso and new.iso, that have minor differences from each other. For the first stage, you create a signature file that rdiff uses to represent your original file:

$ rdiff signature old.iso old.signature

Now that you have a signature file, use it with rdiff to create a delta file that represents the differences between the old and new files:

$ rdiff delta old.signature new.iso new.delta

This new.delta file is now all that anyone who already has old.iso needs to convert it to new.iso. For me, this file ended up being around 150KB, because I had made only a few changes. The delta file was much simpler to send around than the full image. If you want to test that the delta file will work, first create an md5sum of new.iso:

$ md5sum new.iso

Then, use rdiff to patch the old file with the delta to create the new file. This is the same command that everyone else with the original file will use:

$ rdiff patch old.iso new.delta newtest.iso

Now that you have newtest.iso, create an md5sum of that file and compare it with the one you made for new.iso:

$ md5sum newtest.iso
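
If the two checksums match, the patch worked. The comparison can also be scripted instead of done by eye. The following is only a minimal sketch: same_md5 is a hypothetical helper name chosen for illustration, not something that ships with rdiff or md5sum.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when two files have the same MD5 checksum.
same_md5() {
    # Feed each file on stdin so md5sum prints "-" instead of the file name,
    # which leaves two output lines that can be compared directly.
    sum_a=$(md5sum < "$1") || return 2
    sum_b=$(md5sum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}
```

With the function loaded into your shell, the check from this example becomes:

$ same_md5 new.iso newtest.iso && echo "checksums match"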

As I said before, this method works not only with ISOs, but also with any binary file, large or small. It's worth noting that rdiff uses the same binary diff algorithm as rsync. rdiff just lets you run that algorithm step by step on the command line.
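
If you go through this process often, the three stages can be wrapped in a pair of shell functions. This is only a sketch under a few assumptions: rdiff is installed, the output files do not already exist, and make_delta and apply_delta are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrappers around the rdiff stages described above.

# make_delta OLD NEW DELTA
# Stage 1 and 2: build a signature of OLD, then a delta from OLD to NEW.
make_delta() {
    # Keep the intermediate signature in a private scratch directory.
    workdir=$(mktemp -d) || return 1
    rdiff signature "$1" "$workdir/basis.sig" &&
        rdiff delta "$workdir/basis.sig" "$2" "$3"
    status=$?
    rm -rf "$workdir"
    return "$status"
}

# apply_delta OLD DELTA OUT
# Stage 3: rebuild the new file from the old file plus the delta.
apply_delta() {
    rdiff patch "$1" "$2" "$3"
}
```

The sender would run the first function and the recipients the second:

$ make_delta old.iso new.iso new.delta
$ apply_delta old.iso new.delta newtest.iso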

Kyle Rankin is a Senior Systems Administrator in the San Francisco Bay Area and the author of a number of books, including Knoppix Hacks and Ubuntu Hacks for O'Reilly Media. He is currently the president of the North Bay Linux Users' Group.


Kyle Rankin is a director of engineering operations in the San Francisco Bay Area, the author of a number of books including DevOps Troubleshooting and The Official Ubuntu Server Book, and is a columnist for Linux Journal.

