United Railway Signal Group, Inc.

The story of how Progressive Computer Concepts has turned United Railway into a Linux shop.

I remember being amazed and somewhat impressed in September of 1995 when I made my first trip to the main offices of United Railway Signal Group, Inc. in Jacksonville, Florida. I was impressed that their computer network worked at all, much less that they actually got quite a bit of work done using it. United had the longest single run of 10Base-2 I had ever seen, connecting approximately 30 computers to two Novell Netware servers that lived in a halon fire-protected room at one end of the offices.

The primary server was Netware 3.12 on an Intel P90 with 64MB of RAM, three Adaptec 2940 SCSI controllers, two one gigabyte SCSI hard disks and a 20 cartridge, 26GB Maxoptix magneto-optical jukebox. One of the 1GB disks was exported from both IPX/SPX and Netware NFS as “ursgpub”, a shared network file system containing shared data and a collection of custom FoxPro applications to track such information as project flow and time sheets.

The magneto-optical jukebox was being used at about 80% capacity. Corel SCSI for Netware controlled the jukebox and used one of the 1GB disks as its “cache volume”. A DOS TSR was needed on the client side for the PCs to see the jukebox as a contiguous 26GB file system. United's Unix CAD stations needed access to data on the jukebox, as did the PC CAD stations and office PCs. Netware NFS had no knowledge of the jukebox, as the jukebox was handled entirely by the Corel SCSI NLMs. In order to allow the Unix CAD stations some limited access to that data, the jukebox's “cache volume” had to be NFS exported. NFS exporting of the cache volume from underneath Corel SCSI proved to be the source of many problems and a constant headache.

The other server, an Intel P75 with 32MB of RAM, ran Netware 4.1. Its sole responsibility was to do backups. ArcServe, the package United used for backups, could never be made to run on the Netware 3.12 server. Thus, United's only option was to purchase a Netware 4.1 server with a 60-user license for ArcServe and to install the WinAgent client software on each PC. ArcServe never seemed to run properly on this backup server either, crashing almost nightly with out-of-memory errors or hanging when a client PC's WinAgent software hung.

In mid-October of 1995 we, Progressive Computer Concepts, connected United's main office to the Internet via a dedicated ISDN line using an Ascend Pipeline 50. We installed Linux 1.2.13 on an Intel P90 with 32MB of RAM, a BusLogic 956C and a 1GB SCSI disk to handle DNS, e-mail, WWW and FTP service.

From my time spent at United performing this work, it became obvious to me that the Netware solution was falling apart. The Netware server aborted about once per week (and checking the jukebox's file systems on reboot took hours). Nightly backups via ArcServe failed in some manner almost daily, and hours were wasted each day by United's Netware administrator manually reviewing logs and checking the contents of tapes. I began telling United's CEO, Mike Wilson, that Linux would be a better solution.

Over time, the reliability of the one Linux server that handled the Internet operations became more and more clear. I would e-mail Mike an uptime report every 30 days or so. The pivotal point in moving United away from Netware to Linux came when United's Netware system administrator resigned. The door was now opened for us to step in.

Where We Went

In June 1996 we began the formidable task of moving United's entire operation from Netware to Linux. The first to go was the backup server: a Linux boot disk and about an hour turned that little P75 into United's first Linux file server, named ursgfs2. We immediately installed SAMBA and smbfs and began writing backup scripts. After removing WinAgent and setting up an administrative share on each PC, we did a full floor backup to the 4mm tape drives in the new Linux server that very night; the backups ran through every machine without error.

We purchased three BusLogic 958 SCSI controllers and four 4GB fast wide SCSI II Quantum Atlas drives in external cases and attached them to the P75 Linux server, ursgfs2. Now, we needed a way to move many gigabytes of CAD files and corporate data off the 26GB magneto-optical jukebox connected to the Netware server, at that time accessible only through a DOS machine using Netware drivers plus a TSR program. We tried various unsuccessful methods.

All of our attempts on DOS clients failed with out-of-memory errors while trying to pkzip or xcopy files from the jukebox cartridges. The Corel TSR would load under NT but would crash and die at random points during the copy process. We never got NT to successfully copy a single cartridge. Using ncpmount we could mount the jukebox from the Netware server under Linux, but without the TSR the Netware server would kick us off within 60 seconds. The solution was DOSEMU. DOSEMU, when installed on ursgfs2, allowed us to run the Corel TSR, attach to the jukebox on the Netware server and then copy directly to the attached 16GB of new disk space using the xcopy command.

Due to an inefficiency in the FAT (file allocation table) file system, the allocation tables on the jukebox cartridges were filling long before the cartridges were actually full of data. We were able to store all of the jukebox data onto 16GB of disk space. After the transfer of all of the jukebox data was complete, we blew Netware off the larger server, moved ursgfs2's SCSI cards and disks to the new hardware, and renamed the server ursgfs1. A fifth disk was added a short time later.
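One plausible reading of this inefficiency, offered here as an assumption rather than something we measured, is cluster slack: a FAT file system allocates space in whole clusters, so a volume with large clusters spends far more space on small files than the files contain. The Python sketch below shows the arithmetic. The 32KB and 1KB cluster sizes and the file sizes are hypothetical illustration values, not figures from United's cartridges.

```python
def allocated_bytes(file_sizes, cluster_size):
    """Return the space consumed when every file occupies whole clusters."""
    total = 0
    for size in file_sizes:
        # Ceiling division: a partly filled cluster still counts as a full one.
        clusters = -(-size // cluster_size)
        total += clusters * cluster_size
    return total


# Hypothetical file sizes in bytes, for illustration only.
sizes = [2_000, 10_000, 40_000]
actual = sum(sizes)
on_large_clusters = allocated_bytes(sizes, 32 * 1024)
on_small_clusters = allocated_bytes(sizes, 1024)
print(actual, on_large_clusters, on_small_clusters)
```

With these made-up numbers, the same files take well over twice the space on 32KB clusters as on 1KB clusters, which is consistent with the jukebox data fitting comfortably onto 16GB of ordinary disk.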

The new Linux file server used SAMBA for exporting to PCs and NFS for Unix workstations. The server had three 4mm tape drives. Our backup scripts used smbmount to mount each PC in the building and archive it on tape using tar. Soon, an eight-port Lantronix 10/100 switch was installed, and ursgfs1 was moved to a dedicated 100Mbps port.
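The mount-then-archive loop can be sketched roughly as follows. This is a Python illustration, not our actual scripts: it assumes each PC's share has already been mounted under a common directory (the smbmount step is left out because its invocation varies by version), and the function name, host names and paths are hypothetical. The archive is written to a regular file here; pointing it at a tape device would be a further assumption.

```python
import os
import tarfile


def archive_shares(mount_root, hosts, archive_path):
    """Write every mounted host directory under mount_root into one tar archive.

    Returns the hosts that were archived. A host whose mount point is
    missing is skipped, so one dead PC does not stop the nightly run.
    """
    archived = []
    with tarfile.open(archive_path, "w") as tar:
        for host in hosts:
            share_dir = os.path.join(mount_root, host)
            if not os.path.isdir(share_dir):
                continue  # not mounted tonight; leave it out of the archive
            # arcname keeps each PC's files under its own top-level directory.
            tar.add(share_dir, arcname=host)
            archived.append(host)
    return archived
```

Comparing the returned list against the full host list gives a simple way to report which machines were missed on a given night.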

United has a number of CAD stations in each of its offices, where CAD operators work each day on various engineering projects. While using the Netware/ArcServe system, each night ArcServe would copy CAD files from a specific directory hierarchy on each CAD station, in turn, to the read-only CAD file hierarchy on the Netware server. The potential existed for two different projects inside United to involve the same CAD files. This situation is particularly dangerous when the two CAD operators involved are unaware that they are both working on the same set of files. Under the old Netware/ArcServe system, the fact that two CAD operators had been working on the same CAD files simultaneously could only be detected by a human and was often not discovered for days or weeks. Much CAD work had to be redone (recadding) when those situations were discovered.

To solve this problem, our next software project for United was to write a custom file retrieval and archive commit program that would be, in effect, a revision control system. Every night, the working directories of each CAD station are copied by ursgfs1 to scratch space. Files are then put through a number of sanity checks to detect duplicate works-in-progress and to verify file revisions, time stamps, file sizes, etc. Files passing all criteria are copied into the read-only CAD file hierarchy; any existing files replaced by this process are put into a daily incremental backup. Sixty days' worth of these incrementals are kept in mid-line storage, and any file can be rolled back to any version from that period if needed. Summary reports of committed files and rejected files (if any) are e-mailed to the administrators and to a hypermail archive each night. This system is written entirely in Perl 5 and has been in place, working successfully, since October 25, 1996. The system has recently been expanded to include United's Omaha office.
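The screening step might look something like the following. The production system is Perl 5; this is a simplified Python sketch written for illustration, and the two rules it applies (reject a file found on more than one station, reject a file that is not newer than the archived copy) are assumptions standing in for the fuller set of checks. The function name, station names and file paths are hypothetical.

```python
from collections import defaultdict


def screen_candidates(station_files, archive_mtimes):
    """Split scratch-space files into accepted and rejected sets.

    station_files:  {station: {relative_path: mtime}}
    archive_mtimes: {relative_path: mtime} for the read-only hierarchy
    Returns (accepted, rejected): accepted maps path -> station,
    rejected maps path -> reason string.
    """
    # Record which stations hold a working copy of each path.
    owners = defaultdict(list)
    for station, files in station_files.items():
        for path in files:
            owners[path].append(station)

    accepted, rejected = {}, {}
    for path, stations in sorted(owners.items()):
        if len(stations) > 1:
            # Two operators have the same file in progress: flag it for a human.
            rejected[path] = "duplicate work-in-progress on " + ", ".join(sorted(stations))
            continue
        station = stations[0]
        if path in archive_mtimes and station_files[station][path] <= archive_mtimes[path]:
            rejected[path] = "not newer than archived copy"
            continue
        accepted[path] = station
    return accepted, rejected
```

Accepted files would then be copied into the read-only hierarchy, with any file they replace moved into that night's incremental, and both dictionaries feed the summary report.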

After the backup work was completed, we began developing custom Intranet applications for United. We replaced most legacy FoxPro LAN applications with more fully featured and more tightly integrated Intranet programs. The Intranet system, the URSG Daily Operations Control System (URSGDOCS) as it is called, is written entirely in Perl 5. URSGDOCS originally used MiniSQL as its back-end database, but it was ported to MySQL and has been running on it for many months now. Once all of the legacy applications like time-sheet entry and project management had been replaced by the Intranet system, we upgraded United's main office Internet connection to a 1.536Mbps T1. United's remote offices in San Francisco, Omaha and Jacksonville (the manufacturing facility) immediately began using URSGDOCS via dial-up Internet connections.

Figure 1. Internet Screen for URSGDOCS

Later, we added a “Fax This Page” button to the bottom of all the reports that a user might wish to retrieve from United's Daily Operations Control System. A retired 386DX40 was given 32MB of RAM and an eight-port Comtrol Rocketport board. It now runs multiple PPP dial-in sessions, various network sniffers and all of the URSGDOCS faxing subsystem. The URSGDOCS faxing subsystem is a custom Perl script wrapped around the efax07a package, a virtual X server, a few Netscape “-remote” commands and Ghostscript. The result is the ability to fax any URSG Daily Operations Control System report directly to any fax machine in the world, just by clicking on that button.

United's Omaha office grew to the point that dedicated T1 connectivity was deemed necessary. A Linux server was installed in that office in December 1996. The custom file retrieval and archive commit program now runs in Omaha as well. Furthermore, ssh (Secure Shell) is used to move those files automatically to the main server in Jacksonville.

We ran across a great deal on some DEC Alpha UDBs (universal desktop box) and initially picked up four of them for United. Red Hat Alpha Linux allowed us to spread some of the server tasks across those boxes. The URSG Daily Operations Control System was moved to alpha2, for example. Alpha4 was assigned the role of running the old 26GB magneto-optical jukebox which had been collecting dust for a few months. By wrapping our own custom backup scripts around Gerd Knorr's jukebox disk-changer package, we have almost eliminated the need for United to perform tape-based backups. Better yet, nightly incrementals of everything imaginable happen automatically to mid-line storage. Detailed reports of what was backed up, what incrementals were pruned due to age, and the disk-usage status of everything are waiting for the system administrators each morning.
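Pruning incrementals by age reduces to a date comparison. The Python sketch below is illustrative only: it assumes one incremental per calendar date and borrows the sixty-day window used for the CAD incrementals, which may not match the retention applied to every backup set. The function name and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def split_by_age(incremental_dates, today, keep_days=60):
    """Return (kept, pruned) lists of incremental dates, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    kept = sorted(d for d in incremental_dates if d >= cutoff)
    pruned = sorted(d for d in incremental_dates if d < cutoff)
    return kept, pruned


# Hypothetical dates, for illustration only.
kept, pruned = split_by_age(
    [date(1996, 12, 25), date(1997, 1, 15), date(1997, 2, 28)],
    today=date(1997, 3, 1),
)
print("kept:", kept)
print("pruned:", pruned)
```

The two lists map directly onto the morning report: what is still available in mid-line storage and what was pruned due to age.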
