System Information Retrieval
If a disk failure were to obliterate one of my machines, the collected system-administration information would help me reload Linux with a minimum of confusion and difficulty on the replacement disk.
If loyd's disk failed, for example, I would replace the disk and restore Linux with these steps:
Reconstruct the partitions from information in /admin/loyd/fdisk.
Rebuild the kernel from the information in the file /admin/loyd/kernel.config.
File /admin/loyd/lilo.conf contains the information that the line append="cdu31a=0x340,5" is necessary for the proper operation of loyd's ancient CD-ROM drive.
There are, of course, as many deviations from these steps as there are users of Linux, but the point of showing the steps is to demonstrate how the collected information is useful in restoring a Linux system.
Although the ability to recover from catastrophic errors was the initial impetus for creating the collect script, the collected data has a number of other uses as well.
Recently I needed to add an IBM Token Ring Network 16/4 Adapter to barb. This adapter only works with interrupt request (IRQ) 2, 3 or 7, so I examined the /admin/barb/interrupt file and determined that IRQ 3 was unused. Since I had collected this information remotely and stored it on cuthroat, I established that barb had an available IRQ without a trip to barb's location and without logging on to barb. In fact, since barb's information was stored on cuthroat, I could have located an unused interrupt for barb even if barb were off-line.
Suppose I need to inventory some software or hardware component in each of the various Linux systems. Let's use networking cards as an example:
cd /admin egrep -i "ne2000|3c|ibm tr" \ `find . -name interrupts -print`
The egrep command will search the interrupts file in each Linux system's directory for ne2000 (the NE2000 clones), 3c (3Com cards), or ibm tr (IBM Token Ring cards) and print all matching lines in each file.
Several months ago I configured the Enhanced Real Time Clock (RTC) support into loyd's kernel. Or was it speed's kernel? Could I have configured RTC support into both kernels? Here's how to tell which kernels have RTC support:
cd /admin grep CONFIG_RTC=y \ `find . -name kernel.config -print`
In a fraction of a second, grep confirms that only loyd has RTC support:
The cuthroat machine has a PC DOS partition. Recently I booted DOS on cuthroat to configure the autoexec.bat and config.sys files so that I could use cuthroat's CD-ROM under DOS. The instructions told me to take one action, if the CD-ROM were controlled by IRQ 14, and to take a completely different action, if the CD-ROM were controlled by IRQ 15. Being efficient (or lazy) I didn't want to turn off cuthroat, rip it open, determine where the CD-ROM cable plugged into the IDE controller, reassemble it and turn it on again.
After pondering a bit, the answer occurred to me: look at a copy of cuthroat's /proc/interrupt file which was stored on loyd. I didn't even have to boot Linux on cuthroat. I used a DOS FTP client to transfer loyd's /admin/cuthroat/interrupts file to the DOS system on cuthroat. Here are the two relevant lines from that file:
14: 9663 + ide0 15: 32 + ide1
IRQ 14 is the first IDE device; at the time the collect script obtained cuthroat's system-administration information, there had been 9,663 interrupts on this device. During the same interval, the second IDE device, attached to IRQ 15, had generated only 32 interrupts. Since I knew cuthroat had only two IDE devices, it was obvious from the interrupt count that the hard drive was attached to IRQ 14 and the CD-ROM was attached to IRQ 15.
As a final example, let's find all the Pentium processors with Intel's infamous floating-point-division bug:
cd /admin grep fdiv_bug `find . -name cpuinfo -print`\ | grep yes
If the Pentium chip in “solo” had the floating-point-division bug, then grep would produce the following output:
./solo/cpuinfo:fdiv_bug : yes