Disaster Recovery
Something has gone wrong. That's all you know. Staring at your blank or garbage-ridden screen, the only thing you can think is “Now what do I do?” Even if this has not happened to you yet, there is a good chance it will. For all of Linux's power, it is still rather easy for a new (or even experienced) user to make a mistake and mess something up.
With some advance preparation, this kind of situation won't leave you stranded. Make sure you know how to track down a problem, have a bootable disk, and have a set of rescue disks, configured for your particular setup.
Your first step is tracking down the problem. Do you get to the `Uncompressing Linux...' message? If not, your problem is with the boot disk or LILO. Having a spare boot disk should allow you to boot your system, and then you can reconfigure LILO or make a new boot disk.
While Linux is booting, do you get past the partition check? If so, your hard drives are probably fine with Linux. I had a hard drive once that made Linux hang when it tried to find the partitions. The drive didn't work in any other system I tested, so the drive was bad.
Also, if you get past the partition check, then the kernel is not your problem. After the partition checks are done, root is mounted and then /etc/inittab is read. As you may recall, the init program uses /etc/inittab to start login processes and to begin running your /etc/rc files, which mount your partitions and start your network, among other things.
Once the inittab is read, init goes to the corresponding startup file (“rc file”) for mounting additional filesystems, starting network services, and other startup services. If you see your filesystems being mounted, that means that some of your rc files are being started.
Finally, if you want network services on your system, make sure that they are starting. This is one of the final parts of the startup sequence.
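The checks above amount to a small lookup: the last stage you saw on screen tells you where to look first. Here is a rough sketch of that lookup as a shell function. The function name boot_suspect and the stage names are made up for this illustration, and the "kernel or hard drive" answer is my reading of the partition-check discussion rather than a rule stated in the text.

```shell
# Hypothetical helper: given the last boot stage seen on screen,
# print where to start looking. Stage names are invented for this sketch.
boot_suspect() {
    case $1 in
        nothing)      echo "boot disk or LILO" ;;              # never saw "Uncompressing Linux..."
        uncompressed) echo "kernel or hard drive" ;;           # stuck at or before the partition check
        partitions)   echo "/etc/inittab or the rc files" ;;   # partition check passed
        mounts)       echo "later rc files, such as network startup" ;;
        *)            echo "unknown stage: $1" >&2; return 1 ;;
    esac
}
```

For example, boot_suspect nothing points you at the boot disk or LILO, which is the case where a spare boot disk helps.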
Now, what do you do if you know you have a problem? Before you get into a jam, make sure you have backups. If things get too bad you can always re-initialize your partition and restore from an old backup. Also make sure to have backups handy of your /etc directory.
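One way to keep a copy of /etc handy is a compressed tar archive. The following is a minimal sketch, assuming a tar that supports gzip compression through the z option (GNU tar does). The function name backup_dir and the archive path in the usage line are illustrative only.

```shell
# Hypothetical helper: archive one directory into a gzip-compressed tar file.
# $1 = directory to back up, $2 = archive file to create.
backup_dir() {
    src=$1
    dest=$2
    # -C switches to the parent directory first, so the archive stores
    # relative paths (etc/inittab) rather than absolute ones.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}

# Example use (run as root so every file in /etc is readable):
#   backup_dir /etc /root/etc-backup.tar.gz
```

Because the paths are stored relative, you can unpack the archive under any directory, which is convenient when you are working from a rescue disk and your real root partition is mounted somewhere else.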
One good idea is to get a copy of the rescue disks available through FTP. These disks will allow you to boot Linux from a pair of floppies and access most of your partitions. This way, even if you can't boot because of a bad /etc/inittab file, you can still boot Linux, get access to the bad file, and fix it.
Some of these rescue disks come completely ready-made, so that you can use the rescue disks very easily. The disadvantage to these sets is that they may use an older kernel, may not have some pieces that you need (SCSI support, for example), and may not have the set of programs that you want to see in a rescue disk.
There are other sets of rescue disks where you specify which programs you want to include. They also use the current version of the kernel that you are using. The drawbacks to these are that you need to know what you are doing and they take a bit more work than simply getting a pre-built rescue disk. Two such packages are SAR (Search and Rescue) and rescue. Each of these packages is small, as they both use programs that are already on your system.
If you have two floppy drives, you can go through the rescue disk(s) and decide which programs you'd like to add, such as your favorite editor. Usually one disk can contain all the programs you'd need in the event of a disaster, but having two disks chock full of utilities will be even better. Here's how:
First, put a floppy in your second drive. I have a 5.25" HD drive as my second floppy, so I'll use that in my examples.
The fdformat program is used to low-level format a floppy. Its syntax is:
fdformat <device>
where <device> is the name and type of drive you're using. For example, I have a high density 5.25" drive as drive 2, so my <device> would be /dev/fd1h1200. A high density 3.5" drive would be /dev/fd1H1440.
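The naming pattern is the drive number (counting from 0) followed by the density and capacity. As a small sketch of that pattern, the function below builds the device name for the two high density drive types mentioned here. The name floppy_device is made up for this example, and it covers only these two types.

```shell
# Hypothetical helper: print the device name for a high density floppy drive.
# $1 = drive number (0 for the first drive, 1 for the second)
# $2 = drive size in inches: 5.25 or 3.5
floppy_device() {
    drive=$1
    size=$2
    case $size in
        5.25) echo "/dev/fd${drive}h1200" ;;   # 1200k high density 5.25"
        3.5)  echo "/dev/fd${drive}H1440" ;;   # 1440k high density 3.5"
        *)    echo "unknown drive size: $size" >&2; return 1 ;;
    esac
}

# Example use: fdformat "$(floppy_device 1 5.25)"
```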
Now you put a filesystem on it. Use the same filesystem that you are using on the root partition of your system. In my case, that would be the Second Extended Filesystem (ext2). So, let's put a filesystem on my floppy:
mke2fs -c /dev/fd1h1200
Replace the /dev/fd1h1200 with /dev/fd1H1440 if your second drive is a 3.5" high density drive.
Now you should have a filesystem on a disk. Mount it on an unused directory. The /mnt directory is usually used for this. If /mnt does not exist on your system, do
mkdir /mnt
then do
mount -t ext2 /dev/fd1 /mnt
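The format, filesystem and mount steps can be collected into one small script. The sketch below only prints the commands unless you set RUN=1, so you can check them before anything touches a disk. The function names run and prepare_floppy and the RUN variable are inventions for this example, and it uses the same device name for every step, which is an assumption on my part.

```shell
# Hypothetical helper: echo a command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: prepare a rescue floppy.
# $1 = floppy device, $2 = mount point (defaults to /mnt).
prepare_floppy() {
    dev=$1
    mnt=${2:-/mnt}
    run fdformat "$dev"                  # low-level format
    run mke2fs -c "$dev"                 # ext2 filesystem, checking for bad blocks
    [ -d "$mnt" ] || run mkdir "$mnt"    # create the mount point only if missing
    run mount -t ext2 "$dev" "$mnt"
}

# Dry run:   prepare_floppy /dev/fd1h1200
# For real:  RUN=1 prepare_floppy /dev/fd1h1200
```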
Your disk will now be mounted on /mnt. At this point, start copying over whatever programs you want. Make sure of two things:
Make sure that the shared libraries on the rescue disk will work with the programs that you put on the disk.
Make sure that you copy over all the files you need. Some editors have configuration files or help files you may need.
If you are using a rescue disk such as SAR or rescue, you won't need to worry about libraries and you can skip ahead a few paragraphs. Or you can read on and get a better idea of how the shared libraries work.
The idea behind shared libraries is that many common C functions get included in one file in a common location. This saves a lot of space as those common functions no longer need to be duplicated in each program binary. The drawback is that it is a tiny bit slower because now two files have to be loaded instead of one. In the trade-off between speed and size, I'll take the size, especially on a floppy with very limited space.
Another small problem with shared libraries is that programs compiled to use a new library won't work if the only library that is available is an older one. For example, a program compiled to use version 4.4 of the libraries won't work if the only set of libraries available is version 4.3. You'll wind up getting an error message about incompatible libraries. If this happens, get a new copy of the libraries or recompile the program to use an older library.
[Ed. Note: this is not strictly true. With modern libraries, the user will get a message, but the program will still try to run if all the necessary symbols are there. For instance, I'm running some binaries compiled under libc 4.5.8 which run fine with my libc 4.4.4, other than giving an error message. I don't know if you want to deal with this or not; probably not.]
To check what versions of libraries the programs are looking for, use the ldd command:
ldd <program>
This reports the version of the libraries that the program was compiled under. On my system, ldd /bin/write returns:
libc.so.4 (DLL Jump 4.4pl1)
If the files in the /lib directory are libc.so.4.4.1 or above, it will be fine to put the `write' command on your disk. If the library needed is newer than the library on the rescue disk, then you would need to find an older version of the program and put that on the floppy. For example, if the library on the rescue disk was libc.so.4.3.1, I'd need to find an older version of write to put on the disk, or else put libc.so.4.4.1 on the disk.
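This comparison can be done by a short script. The sketch below assumes ldd prints lines in the format shown above, with a version like 4.4pl1 that corresponds to a library file named libc.so.4.4.1. The function names needed_version, file_version and version_ge are made up for this example, and it applies the strict rule that the library on the disk must be at least as new as the one the program asks for.

```shell
# Read ldd output on stdin and print the required version in dotted form,
# e.g. "libc.so.4 (DLL Jump 4.4pl1)" becomes 4.4.1.
needed_version() {
    sed -n 's/.*(DLL Jump \([0-9][0-9.]*\)pl\([0-9][0-9]*\)).*/\1.\2/p'
}

# Print the version part of a library file name,
# e.g. /lib/libc.so.4.3.1 becomes 4.3.1.
file_version() {
    echo "$1" | sed 's/^.*\.so\.//'
}

# version_ge A B: succeed if dotted version A is at least dotted version B.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}                 # leading field of each version
        fb=${b%%.*}
        [ -n "$fa" ] || fa=0        # a missing field counts as 0
        [ -n "$fb" ] || fb=0
        if [ "$fa" -gt "$fb" ]; then return 0; fi
        if [ "$fa" -lt "$fb" ]; then return 1; fi
        # Drop the field just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Example use:
#   need=$(ldd /bin/write | needed_version)
#   have=$(file_version /mnt/lib/libc.so.4.3.1)
#   version_ge "$have" "$need" || echo "library on the rescue disk is too old"
```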
You don't need to put just executables on this disk. A copy of gzip and a bunch of HOWTO files can come in quite handy as well. Here's a list of suggested files, all available through FTP or on many BBSs. Some of these files may already be on the rescue disk you have, so check first.
Take any of these editors. I find that ed is small and compact, but not much fun with heavy editing or large files. For you, joe may be worth the extra 98k it takes up. If you are unfamiliar with joe or ed, you can use vi, which is a standard program on just about all UNIX systems:
joe editor 133k
vi editor 101k
ed editor 35k
General Everyday Utilities:
diff 61k (finds changes in big files)
grep 61k
gzip 46k
lilo 40k
MAKEDEV 9k
mknod 3k
Backup utilities:
This will vary depending on how you did your backup. You may want a copy of tar, afio and ftape.
Get some utilities for the filesystems you run:
e2fsck 35k
mke2fs 20k
Get some HOWTO files (compress with gzip for real space savings!):
Installation-HOWTO 48k
SCSI-HOWTO 41k
Ftape-HOWTO 18k
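Before copying, it helps to add up the sizes and see whether your selection fits on one floppy. The sketch below totals a list given as one "name size" pair per line, using the sizes quoted above. The function names total_kb and fits_on_floppy are made up for this example, and the sizes on your own system will differ, so check them with ls -l or du -k.

```shell
# Read "name size" lines on stdin (size such as 35k) and print the total in k.
total_kb() {
    awk '{ sub(/k$/, "", $2); sum += $2 } END { print sum + 0 }'
}

# Succeed if the list on stdin fits in the given capacity.
# $1 = floppy capacity in k (1200 or 1440).
fits_on_floppy() {
    [ "$(total_kb)" -le "$1" ]
}

# Example use: ed plus the everyday and filesystem utilities.
printf '%s\n' 'ed 35k' 'diff 61k' 'grep 61k' 'gzip 46k' 'lilo 40k' \
    'MAKEDEV 9k' 'mknod 3k' 'e2fsck 35k' 'mke2fs 20k' | total_kb
# prints 310
```

The filesystem itself takes up some of the floppy's space, so leave a margin rather than filling the disk to its nominal capacity.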
One more thing you'll want on-hand is a list of all of the cards that are in your machine, the IRQs that they use, and whether they are used by Linux or not. Sometimes a problem can be an incorrectly configured kernel or card.
If you keep these disks set aside and updated often, you'll be ready for anything that might happen.
Tip of the month: When you hit the backspace, do you see /'s followed by the character you just backspaced over? Don't you hate it, too? It reminds me of reading The Unix Programming Environment. Get a new copy of agetty and this should cure the problem. A copy distributed with some Slackware releases had this problem.