Disk Maintenance under Linux (Disk Recovery)

The ins and outs of disk maintenance—what we all should know and DO.

Here's a hypothetical situation for you to think about. You're working on your Linux box, calling up an application or data file, and Linux hesitates while reading the hard disk. Then, scrolling up the screen (or console box), you see something like this:

Seek error accessing /dev/hdb2 at block 52146,
    IDE reset (successful).

After some time spent chugging away accessing the drive, Linux continues. If you're lucky, everything is still running along fine. If you're not, your program is refusing to start, or your data file contains garbage.

Chances are, if you're using a hard disk drive that's a few years old, you will begin to see errors when accessing the disk from time to time. At this point, the best prognosis for your disk is that, given time, it'll get worse. So you need to begin resuscitation efforts as soon as possible. Several disk manufacturers have utilities that find and allocate these bad sectors on your hard disk. Unfortunately, these utilities also destroy the information on your disk, and are normally run from DOS, not Linux.

Fortunately, Linux has some system utilities to help you when you are dealing with its (now) native ext2 format. (Utilities are also available for minix. If you need to repair other non-Linux file systems you should use their own native sets of file system utilities.) While not as user-friendly as Norton Disk Doctor or Microsoft ScanDisk, the Linux disk and file system utilities get the job done. In this article, we'll look at a few of the tools to help us overcome the kind of problem I described in the opening paragraph. Other hard disk manipulation utilities can be found in /sbin and /usr/sbin, but they'll have to wait. For now, let's get the hard disk working properly.

Before you dig in, if you're using one of the newer 2.0.x kernels with an IDE drive, check that you have the proper bug fixes compiled into the kernel. If you aren't sure which chipset your computer has and can't find out, it is safe to compile in the CMD640, RZ1000 and Intel 82371 options. These options are found under "Floppy, IDE, and other block devices" in your make config. This could save your data in the future. These bug fixes may be all you need, but further checks on your hard drive won't hurt.
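One rough way to see whether an already-configured kernel tree has these fixes turned on is to search its .config file. The sketch below does that. The option names CONFIG_BLK_DEV_CMD640, CONFIG_BLK_DEV_RZ1000 and CONFIG_BLK_DEV_TRITON are assumptions about how a 2.0.x tree spells the three options, and the path in the usage comment is also an assumption, so confirm both against your own source tree.

```shell
#!/bin/sh
# Report which IDE chipset bug-fix options are enabled in a kernel .config.
# ASSUMPTION: these option names match your 2.0.x source tree; verify locally.
IDE_FIX_OPTIONS="CONFIG_BLK_DEV_CMD640 CONFIG_BLK_DEV_RZ1000 CONFIG_BLK_DEV_TRITON"

# check_ide_fixes CONFIG_FILE
# Prints one line per option; returns non-zero if any option is not set to y.
check_ide_fixes() {
    config_file=$1
    missing=0
    for opt in $IDE_FIX_OPTIONS; do
        if grep -q "^${opt}=y" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: NOT enabled"
            missing=1
        fi
    done
    return $missing
}

# Example (assumed path, adjust to your tree):
#   check_ide_fixes /usr/src/linux/.config
```

The function only reads the file, so it is safe to run as an ordinary user before deciding whether a kernel rebuild is needed.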

I hate cliches, although I'm frequently accused of (ab)using them. If it really went without saying that we always do system backups, my income might be somewhat lower than it is. For most people, it's just not true. So, if you've neglected the chore for a while, just let me say that now would be a good time to do that backup. Some of the work I'll be telling you how to do could inadvertently damage or destroy your file system or some of your important files—so be careful and don't say I didn't warn you.

Preparations

Now that I've gotten the requisite legal protection warnings out up front, let's begin. The safest way to start is with a fairly mundane check of the file system. On my system—a combination Red Hat (I like the SYSVinit style bootup), Slackware, Internet tarball concoction—I have fsck, a front-end program that reads the type of file system on a device (from /etc/fstab), then invokes the appropriate fsck.filesystemtype checker—in my case, fsck.ext2. You may have e2fsck on your system instead of, or in addition to, fsck.ext2. Don't worry, they're the same file. One may be a soft link to the other, but it's better to make that a hard link.
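To make the front-end idea concrete, here is a minimal sketch of the lookup step: find the file system type recorded for a device in an fstab-style file and derive the checker name from it. It is not how fsck itself is implemented, only an illustration of the dispatch described above. The function names fstype_for, checker_for and same_file are hypothetical helpers, and the paths in the usage comments are assumptions about where your distribution installs the checkers.

```shell
#!/bin/sh
# fstype_for DEVICE [FSTAB]
# Print the file system type (third fstab field) recorded for DEVICE.
fstype_for() {
    awk -v dev="$1" '$1 !~ /^#/ && $1 == dev { print $3; found = 1; exit }
                     END { exit (found ? 0 : 1) }' "${2:-/etc/fstab}"
}

# checker_for DEVICE [FSTAB]
# Print the name of the type-specific checker for DEVICE, e.g. fsck.ext2.
checker_for() {
    fstype=$(fstype_for "$1" "${2:-/etc/fstab}") || return 1
    echo "fsck.$fstype"
}

# same_file A B
# Succeed if A and B are the same file: hard links to each other,
# or one a symbolic link to the other.
same_file() {
    [ -e "$1" ] && [ -e "$2" ] && [ "$1" -ef "$2" ]
}

# Examples (assumed device and paths):
#   checker_for /dev/hdb2
#   same_file /sbin/e2fsck /sbin/fsck.ext2 && echo "same program"
```

The -ef test compares device and inode numbers, so it answers the "are they the same file" question whether the two names are hard links or a soft link.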

Before starting, let's prepare our systems for the kind of work we're going to be doing. Whenever I perform low-level maintenance on a system, I find it prudent to ensure I am disconnected from the network. Normally this means dropping to single-user mode. You may opt to do some of these tests from init level 2 (with no network connections), but you'll want to ensure that you don't have too many processes running that want to write to the disk, and none that run from the partition you need to work on. Single-user mode was made for this. A simple telinit 1 will get us to single-user mode.
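If you want to confirm where you are before and after the switch, the runlevel command prints the previous and current run levels (for example "N 3"). The sketch below wraps that check. The helper name in_single_user is hypothetical, and treating both 1 and S as single-user is an assumption about how your init reports it.

```shell
#!/bin/sh
# in_single_user "RUNLEVEL_OUTPUT"
# Succeed if the text printed by runlevel, such as "N 3" or "1 S",
# shows a current (second) field of 1 or S.
in_single_user() {
    set -- $1          # split "previous current" into $1 and $2
    case "$2" in
        1|S|s) return 0 ;;
        *)     return 1 ;;
    esac
}

# Typical use as root (left as a comment so that sourcing this file
# does not change the run level):
#   in_single_user "$(runlevel)" || telinit 1
```

Keeping the telinit call out of the function means the check itself has no side effects and can be run at any time.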

If you're not checking the root file system, unmount the file system you're going to work on before you begin. If you forget, you'll get a prompt from fsck telling you the file system is mounted and asking if you want to continue anyway. Say "No": running low-level system diagnostics on a mounted disk is a very bad idea, particularly when they alter the file system by writing directly to the disk, as fsck does. Obviously, we can't unmount the root file system. We should be able to remount it as read-only, but a bug in mount doesn't always allow this option. If you need to check the root file system, you can reboot into single-user mode with the root partition mounted read-only by issuing the -b switch at the LILO prompt. The -b switch will be passed through LILO to init and will cause an emergency boot that does not run any of the startup scripts. If you have always wondered why you would want to create several partitions (for example, for /usr and /home) and restrict the size and scope of the root partition, now you know.
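Rather than relying on the fsck prompt, you can check for yourself whether a device is mounted before touching it. The following is a minimal sketch of that guard. The helper names is_mounted and safe_fsck are hypothetical, and reading /proc/mounts by default is an assumption (on some systems /etc/mtab is the table to consult).

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNT_TABLE]
# Succeed if DEVICE appears in the first column of MOUNT_TABLE.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1; exit }
                     END { exit (found ? 0 : 1) }' "${2:-/proc/mounts}"
}

# safe_fsck DEVICE [MOUNT_TABLE]
# Refuse to run the checker on a file system that is still mounted.
safe_fsck() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "$1 is mounted; run umount $1 before checking it" >&2
        return 1
    fi
    fsck "$1"
}

# Example (assumed device):
#   safe_fsck /dev/hdb2
#
# The root file system cannot be unmounted. Where mount permits it,
# the alternative is a read-only remount before checking (as root):
#   mount -n -o remount,ro /
```

The guard only ever calls fsck when the device is absent from the mount table, so a forgotten umount produces an error message instead of a check on a live file system.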

______________________

Comments


Erase Hard Drive's picture

As long as the hard disk is not toasted, completely broken, or data is overwritten, you can retrieve your data. When a file is removed from the hard drive, it is not actually deleted.

Recover linux hard drive

Linux hard drive recovery's picture

To recover data deleted from a Linux hard drive, I have used the Stellar Phoenix software for Linux data recovery.

Linux data recovery software From Stellar Data recovery

Maria's picture

I too have used the software and found it awesome.

A file whose inode is 0

who can help me?'s picture

My system is Linux 5.

There is a file named .bash_profile in this directory. When I run the command ls -ai, the shell returns:

> ----------------------------------
> cd dmsystem
> ls -ai
> [root@TCJ dmsystem]# ls -ai
> 21200897 . 21200909 data 21200906 .mozilla
> 2 .. 21200898 .emacs 21200994 src
> 21201266 .bash_history 21200944 exe 21200984 .viminfo
> 21200901 .bash_logout 21200903 .kde 21200900 .zshrc
> 0 .bash_profile 21201265 .lesshst
> 21200902 .bashrc 21200986 log
> ----------------------------------

Can anyone suggest a command I can use to delete the file .bash_profile?

directhex's picture

Wow, over 10 years old. Good info. Thanks.

Magic Numbers.

Ralph Corderoy's picture

> The EF53 presumably means Extended Filesystem (EF) version and mod number 53. However, I am unclear about the background of the 53.

I've always assumed the `5' is meant to be read as an `S' since hexadecimal doesn't have an S and the intention is Extended File System 3.

Cheers,

Ralph.

LucMove's picture

The author says:

"The files in these directories will have the form ./#nnnn, where nnnn is the inode number used as the file name. You may be able to determine what the file is by inspecting it using cat. If cat returns what appears to be garbage, you probably have a binary file. In this case, you can do a chmod +x #nnnn, and then run the file. These procedures should give you enough information to learn what the file is."

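But nowadays there is a program called 'file' that is a much better idea. Just run:

$ file ./#nnnn

... and it will make a very good guess at the file's format. (The leading ./ matters: in most shells a word that begins with # starts a comment, so with a bare #nnnn the file command would never see the name.)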

tutorial: Repair hard disk (linux)

daniel's picture

here the link
