Hack and / - When Disaster Strikes: Hard Drive Crashes
To make things a little confusing, there are two similar tools with almost identical names. dd_rescue (with an underscore) is an older rescue tool that still does the job, but it works in a fairly basic manner. It starts at the beginning of the drive, and when it encounters errors, it retries a number of times and then moves to the next block. Eventually (usually after a few days), it reaches the end of the drive. Bad blocks often are clustered together, and if all of them are near the beginning of the drive, you could waste a lot of time trying to read them before you recover any of the good blocks.
The ddrescue tool (no underscore) is part of the GNU Project and takes the basic algorithm of dd_rescue further. ddrescue tries to recover all of the good data from the device first and then divides and conquers the remaining bad blocks until it has tried to recover the entire drive. Another added feature of ddrescue is that it optionally can maintain a log file of what it already has recovered, so you can stop the program and then resume later right where you left off. This is useful when you believe ddrescue has recovered the bulk of the good data. You can stop the program and make a copy of the mostly complete image, so you can attempt to repair it, and then start ddrescue again to complete the image.
The first thing you will need when creating an image of your failed drive is another drive of equal or greater size to store the image. If you plan to use the second drive as a replacement, you probably will want to image directly from one device to the other. However, if you just want to mount the image and recover particular files, want to store the image on an already-formatted partition or want to recover from another computer, you likely will create the image as a file. If you do want to image to a file, your job will be simpler if you image one partition from the drive at a time. That way, it will be easier to mount and fsck the image later.
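Before starting, it can help to confirm that the destination really has room for the source. The following is a minimal sketch, assuming GNU coreutils (`stat -c`, `df --output`) and util-linux (`blockdev`); the function names are hypothetical helpers, not part of ddrescue.

```shell
# Size in bytes of a block device or a regular file.
source_size() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"    # usually needs root privileges
    else
        stat -c %s -- "$1"
    fi
}

# Free bytes on the filesystem that holds the given directory.
free_bytes() {
    df -B1 --output=avail -- "$1" | tail -n 1 | tr -d ' '
}

# Succeeds only if the destination directory can hold the whole source.
fits() {
    [ "$(source_size "$1")" -le "$(free_bytes "$2")" ]
}
```

With the layout used later in this article, you could run something like `fits /dev/sda1 /mnt/recovery && echo "enough room"` from a root shell.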
The ddrescue program is available as a package (on Debian and Ubuntu, the GNU tool is packaged as gddrescue), or you can download and install it from the project page. Note that if you are trying to recover the main disk of a system, you will need either to recover from a second system or to boot a rescue disk that includes ddrescue or can install it live (Knoppix fits the bill, for instance).
Once ddrescue is installed, it is relatively simple to run. The first argument is the device you want to image. The second argument is the device or file to which you want to image. The optional third argument is the path to a log file ddrescue can maintain so that it can resume. For our example, let's say I have a failing hard drive at /dev/sda and have mounted a large partition to store the image at /mnt/recovery/. I would run the following command to rescue the first partition on /dev/sda:
$ sudo ddrescue /dev/sda1 /mnt/recovery/sda1_image.img /mnt/recovery/logfile
Press Ctrl-C to interrupt
Initial status (read from logfile)
rescued:         0 B,  errsize:       0 B,  errors:       0
Current status
rescued:   349372 kB,  errsize:       0 B,  current rate:    19398 kB/s
   ipos:   349372 kB,   errors:       0,    average rate:    16162 kB/s
   opos:   349372 kB
Note that you need to run ddrescue with root privileges. Also notice that I specified /dev/sda1 as the source device, as I wanted to image to a file. If I were going to output to another hard drive device (like /dev/sdb), I would have specified /dev/sda instead. If there were more than one partition on this drive that I wanted to recover, I would repeat this command for each partition and save each as its own image.
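When several partitions need rescuing, each run should get its own image and its own log file, because a log file records the state of one particular source and destination pair. The sketch below only prints the commands so you can review them first; the function name and the file naming scheme are assumptions for illustration.

```shell
# Print (do not run) one ddrescue command per partition, each with a
# separate image and log file. Naming scheme is illustrative only.
rescue_commands() {
    disk=$1      # for example: sda
    dest=$2      # for example: /mnt/recovery
    shift 2
    for part in "$@"; do
        echo "sudo ddrescue /dev/${disk}${part} ${dest}/${disk}${part}_image.img ${dest}/${disk}${part}_logfile"
    done
}

rescue_commands sda /mnt/recovery 1 2
```

Rerunning any of the printed commands unchanged after an interruption lets ddrescue pick up from its log file.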
As you can see, a great thing about ddrescue is that it gives you constantly updating output, so you can gauge your progress as you rescue the partition. In fact, in some circumstances, I prefer using ddrescue over dd for regular imaging as well, just for the progress output. Having constant progress output additionally is useful when considering how long it can take to rescue a failing drive. In some circumstances, it even can take a few days, depending on the size of the drive, so it's good to know how far along you are.
Once you have a complete image of your drive or partition, the next step is to repair the filesystem. Presumably, there were bad blocks and areas that ddrescue could not recover, so the goal here is to repair enough of the filesystem that you at least can mount it. If you had imaged to another hard drive, you would run fsck against the individual partitions on that drive. In my case, I created an image file, so I can run fsck directly against the file:
$ sudo fsck -y /mnt/recovery/sda1_image.img
I'm assuming I will encounter errors on the filesystem, so I added the -y option, which will make fsck go ahead and attempt to repair all of the errors without prompting me.
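Because fsck -y rewrites the image without asking, one cautious option is to repair a copy and keep the rescued image untouched, as suggested earlier for a partially complete image. This is a minimal sketch assuming GNU cp and enough free space for a second copy; the function name and the working-copy path are hypothetical.

```shell
# Copy an image (keeping holes sparse) and confirm the copy is identical
# before any repair tool is pointed at it.
make_working_copy() {
    image=$1
    copy=$2
    cp --sparse=always -- "$image" "$copy" && cmp -s -- "$image" "$copy"
}
```

For the image above, that might look like `make_working_copy /mnt/recovery/sda1_image.img /mnt/recovery/sda1_work.img && sudo fsck -y /mnt/recovery/sda1_work.img`.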
Kyle Rankin is a systems architect and the author of DevOps Troubleshooting, The Official Ubuntu Server Book, Knoppix Hacks, Knoppix Pocket Reference, Linux Multimedia Hacks, and Ubuntu Hacks.
Comments
Partially unrecoverable allocation table
Tried it all, but my rescued image does not show any files under /home/lopo, although they are there. On a 92GB partition, less than 700KB is damaged, but it was enough to lose everything ;(
Foremost does not recognize a lot of my files: bz2, gz, svg, odf files, etc., so -t all is not really ALL.
dd progress report with USR1 signal
While I have no quibble with using ddrescue instead of dd (the wonderful thing about *nix is that there are 72 different ways to do anything) I do have to comment on your statement: "In fact, in some circumstances, I prefer using ddrescue over dd for regular imaging as well, just for the progress output."
dd provides progress info if you send a USR1 to the process. From the dd(1) man page:
Sending a USR1 signal to a running `dd' process makes it
print I/O statistics to standard error and then resume copying.
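A quick way to see this behavior without touching a real disk is sketched below. It assumes GNU dd on Linux (on some other systems USR1 terminates dd instead), and the scratch file and sleep intervals are arbitrary choices for the demonstration.

```shell
# Start a harmless copy in the background; its stderr goes to a scratch file.
stats_file=$(mktemp)
dd if=/dev/zero of=/dev/null bs=1M count=200000 2>"$stats_file" &
dd_pid=$!

sleep 1                   # give dd time to install its signal handler
kill -USR1 "$dd_pid"      # ask dd to print its I/O statistics
sleep 1                   # give dd time to write them

kill "$dd_pid" 2>/dev/null        # stop the demonstration copy
wait "$dd_pid" 2>/dev/null || true
cat "$stats_file"                 # the statistics dd printed on request
```

For a copy that is already running, the equivalent is `kill -USR1` followed by the process ID of that dd.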
I guess I'm lucky
I guess I'm lucky then.
I've never had a hard drive fail on me. Ever. Not at work, not at home.
I still keep my first PC around, with its fully functional hard drive in it (a 80486 80MHz with a 528 MB hard drive, from 1994 I think).
A Better 'Method of Last Resort'
If you can't mount a ddrescue image, but need to retrieve documents, photographs, pdf files, etc., you can use a nifty program called "foremost." It is available for most *nix platforms, and on Windoze via cygwin. Foremost scans through a hard drive image, mountable or not, and looks for recognizable file headers. It understands over twenty popular file headers including jpg, pdf, doc, xml, etc. When it finds these files it dumps them out as usable files. It is truly a thing of beauty. The first time you use it, you will just sit back in amazement. For example, if you have a hard drive image named my_hd_image.dd that was made with one of the dd utilities, you could execute the following command:
~$ foremost -t all -i my_hd_image.dd
After the command executes, a subdirectory will be created that has all of the recovered files, organized neatly by file type. On Ubuntu, you can get foremost by typing:
~$ sudo apt-get install foremost
This tool is also excellent for recovering deleted files from USB drives, etc. Enjoy!
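The idea of recognizing a file by its header can be illustrated with a few lines of shell. This is only a sketch of the concept, not how foremost itself is implemented; the function names are hypothetical, and only the well-known leading bytes of JPEG, PDF and PNG files are checked.

```shell
# Print the first N bytes of a file as a lowercase hex string.
magic_hex() {
    head -c "$2" -- "$1" | od -An -tx1 | tr -d ' \n'
}

# Guess a file type from its leading "magic" bytes.
guess_type() {
    case "$(magic_hex "$1" 4)" in
        ffd8ff*)  echo jpeg ;;      # JPEG files start with FF D8 FF
        25504446) echo pdf ;;       # ASCII "%PDF"
        89504e47) echo png ;;       # 0x89 followed by "PNG"
        *)        echo unknown ;;
    esac
}
```

A carving tool applies the same kind of test at many offsets inside the image, rather than only at the start of a file.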