Linux in the Real World

Linux has several strengths when working with non-standard hardware. Freely available source code allowed Grant Edwards to complete a project much more easily than he could have without Linux.

Application Software

The first job was a watchdog timer daemon. The daemon needed to write to an I/O port periodically to reset the hardware watchdog timer. To provide for orderly system shutdown, the daemon disables the timer when it receives a TERM signal. Since the port address was above 0x400, the ioperm() system call and the _outb() port-write routine wouldn't work. (The kernel only maintains permission maps for ports below 0x400.) Instead, the daemon does an open() on /dev/port and uses the lseek() and write() system calls to do the port I/O. Since the I/O is small and infrequent, the system call overhead isn't a problem.
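The /dev/port technique can be sketched in a few lines of Python. This is only an illustration, not the daemon's actual source: the port address, the byte values and the one-second interval are placeholders, since the article gives none of them.

```python
import os
import signal
import time

# Hypothetical values: the article gives neither the port address nor the
# bytes that reset or disable the timer, so these are placeholders.
WATCHDOG_PORT = 0x443
RESET_VALUE = b"\x01"
DISABLE_VALUE = b"\x00"


def port_write(fd, port, value):
    """Write one byte to an I/O port through an open /dev/port descriptor."""
    os.lseek(fd, port, os.SEEK_SET)  # the file offset selects the port address
    os.write(fd, value)


def run_watchdog(path="/dev/port", interval=1.0):
    """Reset the timer periodically; disable it once SIGTERM arrives."""
    stop = False

    def on_term(signum, frame):
        nonlocal stop
        stop = True  # only set a flag; the main loop does the port I/O

    signal.signal(signal.SIGTERM, on_term)
    fd = os.open(path, os.O_WRONLY)
    try:
        while not stop:
            port_write(fd, WATCHDOG_PORT, RESET_VALUE)
            time.sleep(interval)
        port_write(fd, WATCHDOG_PORT, DISABLE_VALUE)  # orderly shutdown
    finally:
        os.close(fd)
```

Calling run_watchdog() as root starts the loop. Each reset costs two system calls, which is negligible at this rate.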

The next job was the software that gathers data via the three serial ports. Should it be a single, large program that talks to all three devices? That would be needlessly complex compared to writing three separate programs, each of which talks to a single device, especially since the three devices all used different protocols and provided different sets of data.

Each of the three programs gathers data (one sample every five seconds) and writes a line of text on stdout for each sample. Each line of output includes a time stamp, a status and the data values. Each program has a command line option that specifies how long to gather data before terminating.
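A rough Python sketch of the shape of one such program follows. The device protocols are not described in the article, so read_device() is a stand-in that returns dummy values, and the --duration option name and the column layout (status, an integer field, two readings) are assumptions made for illustration.

```python
import argparse
import sys
import time
from datetime import datetime


def format_sample(when, status, flag, first, second):
    """Build one whitespace-separated output line for a sample."""
    stamp = when.strftime("%y-%m-%d %H:%M:%S")
    return f"{stamp} {status} {flag} {first:.6f} {second:.4f}"


def read_device():
    """Placeholder for the device-specific serial protocol (hypothetical)."""
    return "OK", 0, 0.0, 0.0


def gather(duration, period=5.0, out=sys.stdout, sleep=time.sleep):
    """Write one line per sample every `period` seconds for `duration` seconds."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        status, flag, first, second = read_device()
        out.write(format_sample(datetime.now(), status, flag, first, second) + "\n")
        out.flush()  # keep the redirected file current if the run is cut short
        sleep(period)


def main(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical option name; the article only says such an option exists.
    parser.add_argument("--duration", type=float, default=6 * 3600.0,
                        help="seconds to gather data before terminating")
    args = parser.parse_args(argv)
    gather(args.duration)
```

To use this as a script, add a call to main() at the bottom and redirect stdout into a data file.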

A simple ASCII output format with whitespace-separated columns allows easy manipulation and data reduction using familiar Unix tools, such as awk and gnuplot. A few lines from one of the data files are shown below:

94-01-28 18:52:41 OK 0 4.745400 998.4952
94-01-28 18:52:47 OK 0 4.745406 998.4937
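Because the columns are separated by whitespace, reading a data file back takes very little code. The Python sketch below parses the two sample lines above. The meaning of each column is not stated in the article, so the field names here are guesses.

```python
from datetime import datetime

SAMPLE_LINES = """\
94-01-28 18:52:41 OK 0 4.745400 998.4952
94-01-28 18:52:47 OK 0 4.745406 998.4937
"""


def parse_line(line):
    """Split one record into typed fields (field meanings are assumed)."""
    date, clock, status, flag, first, second = line.split()
    when = datetime.strptime(f"{date} {clock}", "%y-%m-%d %H:%M:%S")
    return when, status, int(flag), float(first), float(second)


records = [parse_line(line) for line in SAMPLE_LINES.splitlines()]

# A simple data reduction: the mean of the last column.
mean_last = sum(r[4] for r in records) / len(records)
print(f"{len(records)} records, mean of last column {mean_last:.5f}")
```

The same reduction is a one-line awk command, which is the point of choosing this format.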

Half Duplex Communication

It's like a cheap speaker phone—only one end can talk at a time.

The only unusual problem associated with the data gathering programs was the use of a half-duplex FSK modem. RTS must be asserted while the Linux host is sending a command and then dropped to allow the device to respond. This can't be done easily from user-level software, so the serial port driver was modified. Two lines of code were added to the driver so that it asserts RTS at the beginning of a transmission and drops it at the end. You don't often need the source to the OS, but in this case it saved the considerable extra effort that adding custom hardware to control RTS would have required.

Running the Show

Once the individual data acquisition programs were debugged, something was needed to execute the individual programs and coordinate the whole process. On Unix systems, that means a shell script: nothing complicated, just an infinite loop that does the following:

  • start each of the three data acquisition programs in the background with a command line switch set to run for six hours and with stdout redirected into a file.

  • wait for all three of the above to terminate.

  • compress the data files and uucp them to the destination.
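The original was a shell script. The same three steps are sketched here in Python to show the structure of the loop. The program names, the --duration option, the file names and the uucp destination are all hypothetical.

```python
import gzip
import shutil
import subprocess


def run_cycle(commands, out_paths):
    """Start every command in the background, wait for all, gzip the outputs."""
    running = []
    for cmd, path in zip(commands, out_paths):
        out = open(path, "wb")
        running.append((subprocess.Popen(cmd, stdout=out), out))
    for proc, out in running:
        proc.wait()  # block until this program has terminated
        out.close()
    compressed = []
    for path in out_paths:
        with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        compressed.append(path + ".gz")
    return compressed


def ship(paths, destination):
    """Queue each file for transfer with uucp (destination is hypothetical)."""
    for path in paths:
        subprocess.run(["uucp", path, destination], check=False)


def run_forever():
    # Hypothetical program names; six hours is 21600 seconds.
    names = ["gather_a", "gather_b", "gather_c"]
    while True:
        commands = [[name, "--duration", "21600"] for name in names]
        outputs = [name + ".dat" for name in names]
        ship(run_cycle(commands, outputs), "home!~/incoming/")
```

A real version would also give each cycle's files unique names, for example by including a time stamp, so that a slow transfer is not overwritten by the next cycle.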

This shell script is started in the background by an entry in /etc/rc.local and runs forever, shipping data files off four times a day.

Glitches and Problems

It all sounds quite smooth after the fact, but the project was not without its little hiccups. The most embarrassing problem occurred while attempting to reboot the system remotely. I typed shutdown -fh instead of shutdown -fr, so the system halted rather than rebooted. The system was down for a week before a trip could be made to the site to push the reset button.

The dial-in/dial-out port connected to the cellular modem would occasionally be permanently locked by getty. This prevented uucp from dialing out to transfer data. A crontab entry was added to periodically kill the getty on that port. There were two other instances where all communications were lost. After some experimentation, I determined that the cellular telephone had somehow been powered off.

Perusal of the user's manual and a call to the service provider revealed that the cellular phone shuts itself off if not used for eight hours. This happened twice. Apparently the cellular connection doesn't always detect the modem off-hook condition, so the phone saw no activity and turned itself off. UUCP should have retried several times before the eight-hour timeout, so the exact sequence of events is still a bit of a mystery. The immediate solution was to add a uucp crontab entry that makes the system “phone home” once an hour even if there is no work to be done. The eight-hour timeout can be disabled, and this will be done when it is convenient to take the phone in to the shop.

On a more mundane note, I managed to break my shared libraries the first time I attempted to upgrade them in order to run a newer version of the “man” utility. It was a simple task to boot using the bootdisk/CD-ROM and fix the libraries to the point where the system would again boot from the IDE drive. (When you upgrade your shared libraries, read the directions twice before you start and follow them exactly.)

