Linux in the Real World
The first job was a watchdog timer daemon, which has to write to an I/O port periodically to reset the hardware watchdog timer. To provide for an orderly system shutdown, the daemon disables the timer when it receives a termination signal (SIGTERM). Since the port address was above 0x400, the ioperm() system call and the _outb() port-write routine wouldn't work.
(The kernel only maintains permission maps for ports below 0x400.) Instead, the daemon does an open() on /dev/port and uses the lseek() and write() system calls to do the port I/O. Since the I/O is small and infrequent, the system call overhead isn't a problem.
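The /dev/port approach can be sketched as follows. This is a minimal illustration in Python rather than the daemon's actual source; the port address 0x443, the kick and disable values, and the one-second interval are all hypothetical placeholders, since the article does not give them.

```python
import os
import signal
import time


def write_port(fd, port, value):
    """Write one byte to I/O port `port` through an open /dev/port descriptor."""
    os.lseek(fd, port, os.SEEK_SET)      # the file offset selects the port address
    os.write(fd, bytes([value & 0xFF]))  # a one-byte write becomes a one-byte port write


def run_watchdog(device="/dev/port", port=0x443, kick=0x01, disable=0x00,
                 interval=1.0):
    """Reset the watchdog periodically; disable it when SIGTERM arrives.

    port, kick, disable and interval are assumed values for illustration only.
    """
    stopping = False

    def on_term(signum, frame):
        nonlocal stopping
        stopping = True                  # leave the loop after the current sleep

    signal.signal(signal.SIGTERM, on_term)
    fd = os.open(device, os.O_WRONLY)    # needs root on a real system
    try:
        while not stopping:
            write_port(fd, port, kick)
            time.sleep(interval)
        write_port(fd, port, disable)    # orderly shutdown: turn the timer off
    finally:
        os.close(fd)
```

Each reset costs two system calls, which is acceptable here because the writes are small and infrequent.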
The next job was the software that gathers data via the three serial ports. Should it be a single, large program that talks to all three devices? This would be needlessly complex compared to writing three separate programs, each of which talks to a single device. This is especially true, since the three devices all used different protocols and provided different sets of data.
Each of the three programs gathers data (one sample every five seconds) and writes a line of text on stdout for each sample. Each line of output includes time stamp, status and data values. Each program has a command line option that specifies how long to gather data before terminating.
A simple ASCII output format with whitespace-separated columns allows easy manipulation and data reduction using familiar Unix tools, such as awk and gnuplot. A few lines from one of the data files are shown below:
94-01-28 18:52:41 OK 0 4.745400 998.4952
94-01-28 18:52:47 OK 0 4.745406 998.4937
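To show how little is needed to consume this format, here is a small Python sketch that parses the two sample lines above. It assumes the first two columns form the time stamp, the third is the status, and everything after that is a numeric data value; the article does not name the individual columns, so that layout is an assumption.

```python
from datetime import datetime


def parse_sample(line):
    """Split one whitespace-separated sample line into (time stamp, status, values)."""
    fields = line.split()
    stamp = datetime.strptime(fields[0] + " " + fields[1], "%y-%m-%d %H:%M:%S")
    status = fields[2]
    values = [float(v) for v in fields[3:]]  # assumed: all remaining columns are numeric
    return stamp, status, values


lines = [
    "94-01-28 18:52:41 OK 0 4.745400 998.4952",
    "94-01-28 18:52:47 OK 0 4.745406 998.4937",
]
samples = [parse_sample(line) for line in lines]

# Simple data reduction: average the last column over samples whose status is OK.
good = [values[-1] for _, status, values in samples if status == "OK"]
mean_last = sum(good) / len(good)
```

The same reduction is a one-liner in awk, which is the point of keeping the format this plain.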
The only unusual problem associated with the data gathering programs was the use of a half-duplex FSK modem. It's like a cheap speaker phone: only one end can talk at a time. RTS must be asserted while the Linux host is sending a command and then dropped to allow the device to respond. This can't be done easily from user-level software, so the serial port driver was modified. Two lines of code were added to the driver so that it asserts RTS at the beginning of a transmission and drops it at the end. You don't often need source to the OS, but in this case, it saved a large amount of extra effort that would have been required to add custom hardware to control RTS.
Once the individual data acquisition programs were debugged, something was needed to execute the individual programs and coordinate the whole process. On Unix systems, that means a shell script: nothing complicated, just an infinite loop that does the following:
start each of the three data acquisition programs in the background with a command line switch set to run for six hours and with stdout redirected into a file.
wait for all three of the above to terminate.
compress the data files and uucp them to the destination.
This shell script is started in the background by an entry in /etc/rc.local and runs forever, shipping data files off four times a day.
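The original is a shell script, but one pass of the loop described above can be sketched in Python to make the sequencing explicit. The function name run_cycle, the .dat file naming, and the ship callback are hypothetical; the article does not show the script or name the acquisition programs.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def run_cycle(commands, outdir, ship=None):
    """Run one acquisition cycle.

    commands: dict mapping an output file stem to an argv list (hypothetical names).
    ship: optional callable invoked with each compressed file, e.g. a uucp wrapper.
    """
    outdir = Path(outdir)
    running = []
    # Step 1: start every program in the background with stdout redirected to a file.
    for name, argv in commands.items():
        out = open(outdir / f"{name}.dat", "wb")
        running.append((subprocess.Popen(argv, stdout=out), out))
    # Step 2: wait for all of them to terminate.
    for proc, out in running:
        proc.wait()
        out.close()
    # Step 3: compress each data file and hand it to the shipping step.
    shipped = []
    for name in commands:
        raw = outdir / f"{name}.dat"
        packed = outdir / f"{name}.dat.gz"
        with open(raw, "rb") as src, gzip.open(packed, "wb") as dst:
            shutil.copyfileobj(src, dst)
        raw.unlink()
        if ship is not None:
            ship(packed)
        shipped.append(packed)
    return shipped
```

Calling run_cycle inside an infinite loop, with each program told to run for six hours and ship set to a function that queues the file with uucp, would reproduce the four-transfers-a-day behavior.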
It all sounds quite smooth after the fact, but the project was not without its little hiccups. The most embarrassing problem occurred while attempting to reboot the system remotely. I typed shutdown -fh instead of shutdown -fr so the system halted rather than rebooting. The system was down for a week before a trip could be made to the site to push the reset button.
The dial-in/dial-out port connected to the cellular modem would occasionally be permanently locked by getty. This prevented uucp from dialing out to transfer data. A crontab entry was added to periodically kill the getty on that port. There were two other instances where all communications were lost. After some experimentation, I determined that the cellular telephone had somehow been powered off.
Perusal of the user's manual and a call to the service provider revealed that the cellular phone shuts itself off if not used for eight hours. Apparently the cellular connection doesn't always detect the modem's off-hook condition, so on two occasions the phone registered eight hours of inactivity and turned itself off. UUCP should have retried several times before the eight-hour timeout, so the exact sequence of events is still a bit of a mystery. The immediate solution was a uucp crontab entry that makes sure the system will “phone home” once an hour even if there is no work to be done. The eight-hour timeout can be disabled, and this will be done when it is convenient to take the phone in to the shop.
On a more mundane note, I managed to break my shared libraries the first time I attempted to upgrade them in order to run a newer version of the “man” utility. It was a simple task to boot using the bootdisk/CD-ROM and fix the libraries to the point where the system would again boot from the IDE drive. (When you upgrade your shared libraries, read the directions twice before you start and follow them exactly.)