Archiving CDs to ISO from the Command Line

A few weeks ago I was working on a PC when I needed to grab the motherboard driver CD.  In a perfect world, the CD would be located in a nice protective sleeve, safely kept away from the nasty elements that encompass the IT tech area (read: coffee, scratches, and the occasional jelly doughnut).  But in this case, it appeared someone had taken this CD and wiped it across a Brillo pad.  I'm sure you have all had this problem from time to time.  Heck, my toddlers tend to use CDs as Frisbees around the house when they find my stash.

But I wasn't worried about this minor setback.  Why?  Because when I get new hardware that comes with a CD, one of the first things I do is create an ISO image of the CD/DVD and put a copy of that image on one of my servers.  That way, a destroyed CD never costs me drivers that I might need later on.

Some people may argue that you can always go online and get the drivers, and that's true.  But if you need older drivers that are no longer available, or drivers for hardware from a vendor that's been bought out by another company, it can become a royal pain in the butt trying to track down the software.  If you have copies of the CD/DVD in an ISO folder, getting a replacement is as easy as burning a new CD or DVD.

Now, before I get started, here is the customary disclaimer to get all that nasty but necessary legal mumbo jumbo out of the way.  This might not be 100% necessary, but hey, it covers my butt and LinuxJournal's in the long run, so I might as well get it over with.

Depending upon your State/Country/Planet/Solar System, copying a CD or DVD may be against state/federal/planetary regulations.  Not only that, but it might be in violation of the software's End User License Agreement (EULA).  LinuxJournal and I are not responsible if you decide to rip CD/DVDs into ISO images to take over the world from your mother's basement.  Please abide by state/country/planetary/EULA regulations before making ISO images.

With that said, I personally don't see any harm in creating an ISO image of driver/software CDs for archiving purposes as long as said ISO image is not given away or sold to anyone else.  I don't share my personal ISO images and never will.

In this blog post, I'm going to show you how to not make coasters out of CDs.  What a lot of people don't realize is that it's not as simple as running dd if= of= and going about your merry way.  In order to make a proper burnable ISO image you need to take blocksize and blockcount into account.  Not only that, once the ISO image is complete, you really need to compare the MD5 hash of the CD against that of the ISO image itself.  I'll go into detail about each one and provide a nice script that I whipped up for this article.

Why dd if= of= is a bad idea

A plain dd if= of= copy is fine in certain situations, but not when it comes to writing CDs and DVDs to ISO images.  Do a google.com search sometime for "linux make an ISO image" and a dozen results come up where people recommend just using dd if=/dev/sd0 of=/pathto/file.iso.

Now I'm not saying the internet is full of bad information.  Maybe this worked out for someone at one point and they passed it off to someone else.  Then someone blogged it, and the circle repeats itself.  In any case, if you want a proper ISO image of that CD you need to get the blocksize and blockcount correct before you create your image.  When the CD was originally created it had a logical block size associated with it.  For the most part I have seen 1024 and 2048.  The other thing to look at is the block count, which isoinfo reports as the volume size.  This is the amount of data stored on the CD, measured in logical blocks.  We pass both pieces of information to dd when creating the ISO image so that dd reads exactly that many blocks of exactly that size.
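To put numbers on that: block size times block count is the number of bytes a finished image should contain.  Here is a minimal sketch of that arithmetic as a sanity check.  The helper name expected_image_bytes is my own invention, and a small temporary file filled from /dev/zero stands in for a real rip so the example runs without a disc in the tray.

```shell
#!/bin/bash
# expected_image_bytes is a hypothetical helper, not a standard tool:
# it multiplies block size ($1) by block count ($2).
expected_image_bytes() {
    echo $(( $1 * $2 ))
}

# Stand-in for a real rip: write 5 blocks of 2048 bytes to a temporary file.
image=$(mktemp)
dd if=/dev/zero of="$image" bs=2048 count=5 2>/dev/null

expected=$(expected_image_bytes 2048 5)
# wc -c pads its output on some systems, so strip any spaces.
actual=$(wc -c < "$image" | tr -d ' ')

if [ "$actual" -eq "$expected" ]; then
    echo "Size OK: $actual bytes"
else
    echo "Size mismatch: expected $expected bytes, got $actual" >&2
fi
rm -f "$image"
```

With a real image you would pass in the two numbers reported for your own disc and point wc -c at your ISO file instead of the temporary file.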

If you want to follow along and see where I'm getting this information from, find the location of your CD-ROM drive (check /etc/fstab, but it is usually linked to /dev/cdrom) and run the following command from the command line:

isoinfo -d -i /dev/cdrom

This command will scan your CD and output the necessary information.  Among other things, it reports the logical block size and the volume size (the block count) needed for imaging the CD.  If you feel like skipping the rest of this post and burning coasterless CDs, then you can stop now and use the following command:

dd if=/dev/cdrom bs=blocksize count=count of=/path/to/isoimage.iso

Obviously replace blocksize and count with your blocksize and count collected from isoinfo.
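If you would rather not copy the numbers by hand, here is a rough sketch of pulling them out of the isoinfo output.  The helper name parse_isoinfo_field is made up for this example, and the sample text only imitates the two lines that the script at the end of this post greps for; the volume id and the numbers in it are invented.

```shell
#!/bin/bash
# parse_isoinfo_field is a hypothetical helper, not part of isoinfo.
# $1 = label text before the colon, e.g. "Logical block size is"
# Reads `isoinfo -d` style output on stdin and prints the value after the colon.
parse_isoinfo_field() {
    sed -n "s/^$1: *//p" | head -n 1
}

# Sample text in the style of isoinfo -d output (the values are made up).
sample_output='Volume id: DRIVERS
Logical block size is: 2048
Volume size is: 327867'

blocksize=$(printf '%s\n' "$sample_output" | parse_isoinfo_field "Logical block size is")
blockcount=$(printf '%s\n' "$sample_output" | parse_isoinfo_field "Volume size is")

# Print the dd command line instead of running it, so no drive is needed here.
echo "dd if=/dev/cdrom bs=$blocksize count=$blockcount of=/path/to/isoimage.iso"
```

Against a real drive you would feed the helper from isoinfo -d -i /dev/cdrom instead of the sample text.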

MD5sum

You know that phrase?  The one your teacher probably drilled into you as a school child?  The proverbial "Don't count your chickens before they hatch"?  Well in this case, we do want to count our chickens before they hatch.  Before you make an ISO image of a CD and file the CD away for good, you want to make sure the ISO image you created is a good, clean copy.  This is where MD5 hashes come into play.  I'm not going to go into great detail about MD5, but if you haven't looked into checking MD5 hashes against downloaded files, now's the time to open up a linuxjournal.com search and check out some articles.  In this case, we are going to check the MD5 hash of the CD against that of the ISO image that you may have created in the previous step (if you're following along; if not, the script provided at the end of this post will do this for you).

So, if you have already created an ISO image with the above command, let's check that MD5 hash.  With the CD loaded run the following command:

dd if=/dev/cdrom bs=blocksize count=count | md5sum

This will spit out a 128-bit cryptographic hash based on the contents of the CD.  Now let's check it against the ISO image you generated with the following command:

cat imagename.iso | md5sum

The output should match the MD5 sum generated above.  If they match, then you can rest assured that the ISO image that you generated will be good enough to burn CDs from.  If the MD5 sums don't match, then make sure that you passed the correct blocksize and count from isoinfo to dd and try again.
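Eyeballing two 32-character strings gets old fast, so here is a minimal sketch of letting the shell do the comparison.  The helper name md5_of is my own, and two small temporary files stand in for the disc and the image so the example runs anywhere md5sum is installed.

```shell
#!/bin/bash
# md5_of is a hypothetical helper: prints only the hash of whatever is on stdin,
# dropping the trailing "-" filename field that md5sum adds.
md5_of() {
    md5sum | cut -d ' ' -f 1
}

# Stand-ins for the disc and the image (identical contents).
disc=$(mktemp)
image=$(mktemp)
printf 'driver data\n' > "$disc"
cp "$disc" "$image"

disc_md5=$(md5_of < "$disc")
image_md5=$(md5_of < "$image")

if [ "$disc_md5" = "$image_md5" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "MD5 check: $result"
rm -f "$disc" "$image"
```

For the real thing, pipe the dd command shown above into md5_of for the disc side, and redirect your ISO file into md5_of for the image side.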

Script

The script itself is pasted at the bottom of this post, but I will quickly touch on what it does.  I typed this script up specifically for this blog post.  Usually I just run isoinfo, grab my blocksize and count, make the ISO image, then check the md5sum and go from there.  Why hadn't I created a script before?  We will call it professional laziness.  I'm sure I will be using this script from now on though.  The script lets you find out the device path of your CD-ROM drive, specify a path for your ISO image, rip the CD to that image, and check the MD5 sum of the image against the CD.  I did take a portion of the script from Troubleshooters.com's 'Coasterless CD Burning' while working on it.  The URL is in the comments section of the script if you wish to look into the website further.

Note: This is a 'demonstration script'.  Bugs might come crawling out of your screen and up your pants leg.  It works IF you have a CD in the tray and you have the proper HAL package.  Feel free to modify the script to your needs, or pick it apart and use it as you see fit.  I have described all of the commands above in case you wish to not use the script.
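If you are not sure whether your machine has everything the script calls, something along these lines can tell you before you run it.  This is only a sketch: missing_tools is a made-up helper name, and the list passed to it is simply the set of commands that appear in the script.

```shell
#!/bin/bash
# missing_tools is a hypothetical helper: prints each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools isoinfo dd md5sum hal-find-by-capability hal-get-property)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing >&2
else
    echo "All tools found"
fi
```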

Conclusion

Well there you have it, how to archive CDs into an ISO image.  As I said earlier, I tend to take any new CD/DVD out there and create ISO images for archiving purposes.  You never know when someone will scratch that one-of-a-kind HP Utilities CD containing the Array Diagnostic Utilities that you need to run at 2am against an old server.  But if you have an ISO image of that CD all you have to do is create a CD and away you go.  Once you have an ISO image, it's as simple as burning a CD; of course that's a blog post for another time. :-)

#!/bin/bash

## ArchiveCD.sh Script whipped up for LinuxJournal.com Blog
## Post on Archiving CD's to ISO Images.  Written by Jayson
## Broughton.  Script updates may be found at the following
## website: www.jaysonbroughton.com
##
## blocksize and blockcount variables taken from Steve Litt's
## script in the Troubleshooters.com article 'Coasterless CD
## Burning'.
## URL: http://www.troubleshooters.com/linux/coasterless.htm
##
## Last Updated: 05/15/2011

## Check HAL for CDrom and grab UDI
UDI=$(hal-find-by-capability --capability storage.cdrom)

## Run UDI against block device
device=$(hal-get-property --udi "$UDI" --key block.device)

## Get Block size of CD
blocksize=$(isoinfo -d -i "$device" | grep "^Logical block size is:" | cut -d " " -f 5)
if test "$blocksize" = ""; then
        echo "$0: FATAL ERROR: Blank blocksize" >&2
        ## Exit non-zero so callers can tell the script failed
        exit 1
fi

## Get Block count of CD
blockcount=$(isoinfo -d -i "$device" | grep "^Volume size is:" | cut -d " " -f 4)
if test "$blockcount" = ""; then
        echo "$0: FATAL ERROR: Blank blockcount" >&2
        exit 1
fi

usage()
{
cat <<EOF

usage: $0 options
-h      Show this message
-d      Report the Location of your Device
-m      Check your MD5Hash of CD against Image (Run AFTER making Image)
-l      Location and name of ISO Image (/path/to/image.iso)
-r      Rip CD to ISO image
I'm lazy and didn't build much error checking into this script, so here's
how to run it.  Always give -l before -r or -m.  Anything else might break
the script.

Example 1: Report location of drive
archiveCD.sh -d

Example 2: Rip a CD to ISO
archiveCD.sh -l /path/to/isoimage.iso -r

Example 3: Check MD5Hash (Run AFTER ripping CD to ISO)
archiveCD.sh -l /path/to/isoimage.iso -m


EOF
}



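while getopts "hdml:r" OPTION; do
  case $OPTION in
    h)
      usage
      exit 0
      ;;
    d)
      echo "Your CDrom is located on: $device" >&2
      ;;
    m)
      ## -l must come first so LFLAG is set by the time we get here
      if test -z "$LFLAG"; then
        echo "$0: give -l /path/to/image.iso before -m" >&2
        exit 1
      fi
      echo "Checking MD5Sum of CD and New ISO Image"
      md5cd=$(dd if="$device" bs="$blocksize" count="$blockcount" | md5sum)
      md5iso=$(md5sum < "$LFLAG")
      echo "CD MD5 is: $md5cd"
      echo "ISO MD5 is: $md5iso"
      ;;
    l)
      LFLAG="$OPTARG"
      ;;
    r)
      ## Refuse to run dd with an empty of= target
      if test -z "$LFLAG"; then
        echo "$0: give -l /path/to/image.iso before -r" >&2
        exit 1
      fi
      dd if="$device" bs="$blocksize" count="$blockcount" of="$LFLAG"
      echo "Archiving Complete.  ISO Image located at: $LFLAG"
      ;;
  esac
done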
 