Disk Maintenance under Linux (Disk Recovery)
The final utility we will discuss is probably the most powerful and dangerous. With debugfs, you can modify the disk with direct disk writes. Since this utility is so powerful, you will normally want to invoke it as read-only until you are ready to actually make changes and write them to the disk. To invoke debugfs in read-only mode, do not use any switches. To open in read-write mode, add the -w switch. You may also want to include in the command line the device you want to work on, as in /dev/hda1 or /dev/sda1, etc. Once it is invoked, you should see a debugfs prompt.
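The two ways of starting debugfs described above can be sketched as a small helper that only builds the command line and never runs it. The function name debugfs_argv is hypothetical, and the -R switch (run a single request, then exit) is not discussed in this article; it is included as an assumption about how you might script the utility.

```python
import shlex


def debugfs_argv(device, writable=False, request=None):
    """Build (but do not run) a debugfs command line.

    No switch means read-only; -w opens the file system read-write.
    -R is an assumption beyond the article: one request, then exit.
    """
    argv = ["debugfs"]
    if writable:
        argv.append("-w")
    if request is not None:
        argv += ["-R", request]
    argv.append(device)
    return argv


# Read-only is the default, so nothing extra is needed.
print(shlex.join(debugfs_argv("/dev/sda1")))
# Read-write mode must be requested explicitly.
print(shlex.join(debugfs_argv("/dev/sda1", writable=True)))
```

Keeping writable off by default mirrors the advice above: you have to ask for read-write mode on purpose.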
We'll be looking at only a limited set of commands for the purposes of this article. I would refer you to the man pages, but the page for debugfs located on my system is out of date and does not accurately reflect debugfs' commands. To get a list, if not an explanation, at the debugfs prompt type ?, lr or list_requests.
The first command you can try is params to show the mode (read-only or read-write), and the current file system. If you run this command without opening a file system, it will almost certainly dump core and exit.
Two other commands, open and close, may be of interest if you are checking more than one file system. Close takes no argument, and appropriately enough, it closes the file system that is currently open. Open takes the device name as an argument.
If you wish to see disk statistics from the superblock, the command stats will display the information by group.
Now that you've had a chance to look at a few of debugfs' functions, let's get to work fixing our hard disk. From the printed list of bad blocks, we need to see which blocks are in use and which files are using them. For this we'll use testb with each block number as an argument. If the test says the block is not in use, we know we haven't lost any data here yet.
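With a long list of bad blocks, sorting the testb results by hand gets tedious. The sketch below works on testb output you have already captured as text. The line format it expects ("Block N marked in use" or "Block N not in use") is an assumption about the debugfs version in use, and the function names and block numbers are hypothetical.

```python
import re

# Assumed testb output format; check it against your debugfs version.
_TESTB_LINE = re.compile(r"^Block (\d+) (marked in use|not in use)$")


def parse_testb(output):
    """Map each block number to True (in use) or False (free)."""
    status = {}
    for line in output.splitlines():
        match = _TESTB_LINE.match(line.strip())
        if match:
            status[int(match.group(1))] = match.group(2) == "marked in use"
    return status


def blocks_needing_attention(bad_blocks, status):
    """Return the bad blocks that testb reported as in use, in order."""
    return [block for block in bad_blocks if status.get(block, False)]


sample = "Block 4711 marked in use\nBlock 4712 not in use\n"
status = parse_testb(sample)
print(blocks_needing_attention([4711, 4712], status))
```

Only the blocks returned here belong to a file and need to be traced further; the rest have not cost you any data yet.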
If the block is marked as in use, you'll want to find out which file is using it. We can find the inode by using icheck with the block number as its argument, which will return the inode that points to the block. From here, we can use ncheck with the inode number as its argument to get the name of the file corresponding to the inode. Now we finally have something we can work with. You may want to try to save the file, but if the block really is bad, you're probably better off reinstalling this file from a backup disk.
To free the block, you can use one of several commands; the one I recommend is kill_file, with the file name as its argument. This will deallocate the inode and its corresponding blocks. Remember, you'll have to be in read-write mode to do this. Note that these commands are irrevocable in read-write mode.
Once the bad block has been deallocated, you can use setb with the block number as its argument to permanently allocate the block, removing it from the pool of free blocks so that it is never assigned to a file again.
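The block-to-inode-to-file chain above can be turned into a written plan before anything is changed on disk. This is a minimal sketch, assuming the debugfs requests icheck, ncheck, kill_file and setb, and assuming that icheck and ncheck print tab-separated two-column tables under a header row. The function names, sample inode, blocks and path are hypothetical.

```python
def parse_table(output):
    """Parse two-column, tab-separated output; rows whose first
    column is not a number (such as the header) are skipped."""
    rows = []
    for line in output.splitlines():
        left, sep, right = line.partition("\t")
        if sep and left.strip().isdigit():
            rows.append((int(left), right.strip()))
    return rows


def plan_repair(icheck_output, ncheck_output):
    """Return (debugfs requests, files to restore from backup)."""
    names = dict(parse_table(ncheck_output))  # inode -> path name
    requests, restore, seen = [], [], set()
    for block, owner in parse_table(icheck_output):
        if owner.isdigit() and int(owner) not in seen:
            inode = int(owner)
            seen.add(inode)
            # <N> names a file by inode number, avoiding quoting issues.
            requests.append(f"kill_file <{inode}>")
            restore.append(names.get(inode, f"<{inode}>"))
        # Every bad block is marked allocated, whether it was in use or not.
        requests.append(f"setb {block}")
    return requests, restore


icheck = "Block\tInode number\n4711\t56\n4712\t<block not found>\n"
ncheck = "Inode\tPathname\n56\t/home/user/data.db\n"
requests, restore = plan_repair(icheck, ncheck)
print("\n".join(requests))
print("restore from backup:", restore)
```

The sketch only prints the requests. Typing them at a read-write debugfs prompt yourself keeps a human in the loop for the irrevocable step.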
That's it. Once the appropriate changes have been made to set the blocks, you can quit debugfs and reboot. You should not see more problems unless you missed a block (or have grown more bad blocks).
Good disk maintenance requires periodic disk checks. Your best tool is fsck, which should be run at least monthly. Default checks will normally be run after 20 system reboots, but if your system stays up for weeks at a time as mine often does, you'll want to force a check from time to time. Beyond that, your best bet is to perform routine system backups and to look through your lost+found directories from time to time. The dumpe2fs utility will provide important information regarding hard disk operating parameters found in the superblock, and badblocks will perform surface checking. Finally, surgical procedures to remove areas grown bad on the disk can be accomplished using debugfs.
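To see how close a file system is to its next automatic check, you can compare the two mount counters stored in the superblock. The sketch below reads them from text you have already captured from dumpe2fs. The field labels "Mount count" and "Maximum mount count" are assumptions about the output of your version, and the function names and sample values are hypothetical.

```python
def mount_counts(superblock_text):
    """Return (mount count, maximum mount count) from superblock text."""
    fields = {}
    for line in superblock_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return int(fields["Mount count"]), int(fields["Maximum mount count"])


def check_due(superblock_text):
    """True when the mount count has reached the configured maximum.

    A maximum of zero or less is treated here as 'count-based checks
    disabled', which is an assumption of this sketch.
    """
    count, maximum = mount_counts(superblock_text)
    return maximum > 0 and count >= maximum


sample = (
    "Mount count:              21\n"
    "Maximum mount count:      20\n"
)
print(check_due(sample))
```

On a machine that is rarely rebooted the counter barely moves, which is exactly why a forced check from time to time is still worth scheduling.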
David Bandel is a Computer Network Consultant specializing in Linux, but he begrudgingly works with Windows and those “real” Unix boxes like DEC 5000s and Suns. When he's not working, he can be found hacking his own system or enjoying the view of Seattle from 2,500 feet up in an airplane. He welcomes your comments, criticisms, witticisms, and will be happy to further obfuscate the issue. You may reach him via e-mail at email@example.com or snail mail c/o Linux Journal.