Disk Maintenance under Linux (Disk Recovery)

The ins and outs of disk maintenance—what we all should know and DO.
Enter debugfs

The final utility we will discuss is probably the most powerful and dangerous. With debugfs, you can modify the disk with direct disk writes. Since this utility is so powerful, you will normally want to invoke it as read-only until you are ready to actually make changes and write them to the disk. To invoke debugfs in read-only mode, do not use any switches. To open in read-write mode, add the -w switch. You may also want to include in the command line the device you want to work on, as in /dev/hda1 or /dev/sda1, etc. Once it is invoked, you should see a debugfs prompt.
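The difference between the two modes comes down to a single switch. As a minimal sketch, the helper below only prints the command line for each mode rather than running it; the function name debugfs_cmdline is made up for illustration, and /dev/sda1 is a placeholder for whatever partition you are working on.

```shell
#!/bin/sh
# Print (but do not run) the debugfs command line for a given mode.
# debugfs_cmdline is a hypothetical helper, not part of debugfs itself.
debugfs_cmdline() {
    mode=$1      # "ro" for read-only, "rw" for read-write
    device=$2    # e.g. /dev/sda1 (placeholder)
    case "$mode" in
        ro) printf 'debugfs %s\n' "$device" ;;      # no switches: read-only
        rw) printf 'debugfs -w %s\n' "$device" ;;   # -w: read-write
        *)  echo "usage: debugfs_cmdline ro|rw DEVICE" >&2; return 1 ;;
    esac
}

debugfs_cmdline ro /dev/sda1
debugfs_cmdline rw /dev/sda1
```

Starting with the read-only form and switching to -w only for the final repair step keeps accidental writes off the disk.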

We'll be looking at only a limited set of commands for the purposes of this article. I would refer you to the man pages, but the page for debugfs located on my system is out of date and does not accurately reflect debugfs' commands. To get a list, if not an explanation, at the debugfs prompt type ?, lr or list_requests.

The first command you can try is params to show the mode (read-only or read-write), and the current file system. If you run this command without opening a file system, it will almost certainly dump core and exit.

Two other commands, open and close, may be of interest if you are checking more than one file system. Close takes no argument, and appropriately enough, it closes the file system that is currently open. Open takes the device name as an argument.

If you wish to see disk statistics from the superblock, the command stats will display the information by group.
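The commands covered so far can be strung together into one short inspection session. The sketch below just prints the requests in a safe order, with open first so that params never runs without a file system; session_requests is a hypothetical helper name and /dev/sda1 is a placeholder device.

```shell
#!/bin/sh
# Print a short, read-only debugfs session: open the file system,
# show the mode, dump the superblock statistics, then close it.
# session_requests is a made-up name used only for this illustration.
session_requests() {
    device=$1
    printf 'open %s\n' "$device"   # open must come before params
    printf 'params\n'
    printf 'stats\n'
    printf 'close\n'
}

session_requests /dev/sda1
```

You can type these lines at the debugfs prompt one at a time. The debugfs man page also describes an -f switch for reading requests from a file, which may be worth checking on your system if you repeat the same session often.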

Now that you've had a chance to look at a few of debugfs' functions, let's get to work fixing our hard disk. From the printed list of bad blocks, we need to see which blocks are in use and which files are using them. For this we'll use testb with each block number as an argument. If the test says the block is not in use, we know we haven't lost any data here yet.
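If the list of bad blocks is long, typing a testb line for each one gets tedious. As a rough sketch, the filter below turns a list of block numbers (one per line) into testb requests; the function name make_testb_requests and the sample block numbers are invented for illustration.

```shell
#!/bin/sh
# Turn a list of block numbers (one per line on standard input) into
# debugfs "testb" requests. Blank lines and anything that is not a
# plain decimal number are skipped.
make_testb_requests() {
    while IFS= read -r block; do
        case "$block" in
            ''|*[!0-9]*) continue ;;   # ignore empty or non-numeric lines
        esac
        printf 'testb %s\n' "$block"
    done
}

# Hypothetical block numbers, purely for illustration.
printf '%s\n' 8193 8194 20481 | make_testb_requests
```

The resulting lines can be pasted at the debugfs prompt while the file system is open read-only, since testb only inspects the block bitmap.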

If the block is marked as in use, you'll want to find out which file is using this block. We can find the inode by using:

icheck

which will return the inode that points to the block. From here, we can use

ncheck

to get the name of the file corresponding to the inode. Now we finally have something we can work with. You may want to try to save the file, but if the block really is bad, you're probably better off reinstalling this file from a backup disk. To free the block, you can use one of several commands; the one I recommend is:

cleari

This will deallocate the inode and its corresponding blocks. Remember, you'll have to be in read-write mode to do this. Note that these commands are irrevocable in read-write mode.

Once the bad block has been deallocated, you can use:

setb

to permanently allocate the block, removing it from the pool of free blocks so that no file will be assigned to it again.
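Putting the steps together for a single bad block might look like the sketch below, which only prints the requests in order. Several things here are assumptions: the helper names are made up, the block and inode numbers are invented, and the request used to release the inode is kill_file, which the current debugfs man page describes as deallocating an inode and its blocks. That is swapped in for the cleari named above, so check list_requests on your own system and set FREE_REQUEST to whatever it offers.

```shell
#!/bin/sh
# Print the debugfs requests for one bad block, in the order described
# in the text. Nothing is executed; all helper names are hypothetical.

# Stage 1 (safe in read-only mode): is the block in use, and by which inode?
lookup_requests() {
    printf 'testb %s\n' "$1"
    printf 'icheck %s\n' "$1"
}

# Stage 2 (read-only): which file name belongs to the inode icheck reported?
name_request() {
    printf 'ncheck %s\n' "$1"
}

# Stage 3 (needs read-write mode): release the inode, then mark the bad
# block as allocated so it is never handed out again.
# ASSUMPTION: kill_file stands in for the article's "cleari".
repair_requests() {
    block=$1
    inode=$2
    printf '%s <%s>\n' "${FREE_REQUEST:-kill_file}" "$inode"
    printf 'setb %s\n' "$block"
}

# Invented numbers: block 8193, found by icheck to belong to inode 2011.
lookup_requests 8193
name_request 2011
repair_requests 8193 2011
```

Keeping the lookup and repair stages separate mirrors the advice to stay in read-only mode until you know exactly which inode and block you intend to change.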

That's it. Once the appropriate changes have been made to set the blocks, you can quit debugfs and reboot. You should not see more problems unless you missed a block (or have grown more bad blocks).

Summary

Good disk maintenance requires periodic disk checks. Your best tool is fsck, which should be run at least monthly. Default checks will normally be run after 20 system reboots, but if your system stays up for weeks at a time, as mine often does, you'll want to force a check from time to time. Beyond that, your best bet is to perform routine system backups and check your lost+found directories regularly. The dumpe2fs utility will provide important information regarding hard disk operating parameters found in the superblock, and badblocks will perform surface checking. Finally, surgical procedures to remove areas grown bad on the disk can be accomplished using debugfs.

David Bandel is a Computer Network Consultant specializing in Linux, but he begrudgingly works with Windows and those “real” Unix boxes like DEC 5000s and Suns. When he's not working, he can be found hacking his own system or enjoying the view of Seattle from 2,500 feet up in an airplane. He welcomes your comments, criticisms, witticisms, and will be happy to further obfuscate the issue. You may reach him via e-mail at dbandel@ix.netcom.com or snail mail c/o Linux Journal.

______________________

Comments


Erase Hard Drive wrote:

As long as the hard disk is not toasted, completely broken, or data is overwritten, you can retrieve your data. When a file is removed from the hard drive, it is not actually deleted.

Recover linux hard drive

Linux hard drive recovery wrote:

To recover the data deleted from linux hard drive I have used the Stellar Phoenix software for Linux data recovery

Linux data recovery software From Stellar Data recovery

Maria wrote:

I too have used the software and found it Awesome

A file which iNode is 0

who can help me? wrote:

My system is Linux 5. There is a file named .bash_profile, and when I run the command ls -ai, the shell returns:

> ----------------------------------
> cd dmsystem
> ls -ai
> [root@TCJ dmsystem]# ls -ai
> 21200897 . 21200909 data 21200906 .mozilla
> 2 .. 21200898 .emacs 21200994 src
> 21201266 .bash_history 21200944 exe 21200984 .viminfo
> 21200901 .bash_logout 21200903 .kde 21200900 .zshrc
> 0 .bash_profile 21201265 .lesshst
> 21200902 .bashrc 21200986 log
> ----------------------------------

Can anyone suggest a command that would let me delete the file .bash_profile?

directhex wrote:

Wow, over 10 years old. Good info. Thanks.

Magic Numbers.

Ralph Corderoy wrote:

> The EF53 presumably means Extended Filesystem (EF) version and mod number 53. However, I am unclear about the background of the 53.

I've always assumed the `5' is meant to be read as an `S' since hexadecimal doesn't have an S and the intention is Extended File System 3.

Cheers,

Ralph.

LucMove wrote:

The author says:

"The files in these directories will have the form ./#nnnn, where nnnn is the inode number used as the file name. You may be able to determine what the file is by inspecting it using cat. If cat returns what appears to be garbage, you probably have a binary file. In this case, you can do a chmod +x #nnnn, and then run the file. These procedures should give you enough information to learn what the file is."

But nowadays there is a program called 'file' that is a much better idea. Just run:

$ file ./#nnnn

... and it will make a very good guess at the file's format. (The leading ./ keeps the shell from treating #nnnn as a comment.)

tutorial: Repair hard disk (linux)

daniel wrote:

here the link
