Mission-Critical Application on Linux
At Pace Analytical Services, Inc., the product we produce is data. Pace Analytical is in the environmental testing industry, meaning that our clients bring us air, water and soil samples and ask us to test them for pollutants. What we give back to the client is a report, both electronic and hard copy, that tells them the type and concentrations of compounds discovered in the samples.
The mission-critical application for our company is our Laboratory Information Management System (LIMS). This system's primary goal is to provide fast, accurate and timely information to our clients; it is based on an Oracle relational database. The LIMS handles everything, from sample check-in to invoicing, by modeling the laboratory operations. LIMS eliminates redundant processes and data entry, and allows for greater standardization in areas such as quality-control batches, data reporting and billing throughout the system.
One aspect of our Y2K program was updating the LIMS to be Y2K-compliant. This meant upgrading from 7.0.16 of Oracle RDBMS to 7.3.4, converting all the Oracle Forms 3.0 code to version 4.5 and converting C code to handle dates correctly. The real work in this project turned out to be not the Y2K issues, but rather the conversion of all the Oracle Forms code to work in the new version of the Oracle Forms tool.
Another issue we were facing at the same time as the Y2K development was response time and throughput on our LIMS servers. Over time, additional load on the systems occurred because of a larger number of users and because multiple instances of the database were running on the same server. At the time, Pace Analytical had three servers, and all were HP-UX boxes: one HP9000-H70, one HP9000-I70 and one HP9000-K200, and all had been in use for well over three years. We were running old hardware, and our demand was exceeding its capacity to deliver.
We looked at our options for getting improved response from the hardware. We had already addressed everything we could, given the existing configuration, such as disk striping. One major bottleneck was that the entire SCSI channel was limited to a total 20MB/sec throughput. One option was to add a faster channel. HP had a card that would talk to a SCSI sub-system (RAID), but prices seemed quite high, even on the used market; the best quote we could find was about $10,000 US. Another option was to add more cards to get more 20MB/sec channels. After a close analysis, it became clear that, in addition to the SCSI channel, the hardware was also limited by the CPU and RAM.
Another option discussed was to purchase new HP-UX servers. This represented a sizable investment—well over six figures for three servers. About this time, the Pace Analytical database administrator, Michael Lester, came up with an alternative proposal. He proposed purchasing Intel-based hardware (PC-compatible) and running Oracle 8 on Linux, using Oracle SQL*Net to communicate between the current HP-UX server and the Linux database server. What Michael proposed made sense for a number of reasons. First, if we could move to an Intel-based hardware platform, it would be easier and cheaper to upgrade hardware as demands exceeded the hardware capacity. Intel-based hardware is much cheaper, and is readily available locally in case problems are encountered with components. As hardware is upgraded, an old server can be put to service as a Windows-based desktop system, but the old HP hardware is of no use to us.
The new hardware uses SCSI Ultra2, also a wide format, and each channel has an 80MB/sec throughput. Because of the low cost, we could afford to put in four channels: one for the OS and archive drives, one for the slower tape and CD drives and two for the database. The six drives for the database are configured for RAID and are split—three drives on each of the two database channels. In this configuration, the maximum throughput for the database drives would be around 160MB/sec, an eightfold speed increase over the old hardware. The CPUs are 600MHz dual Pentium III processors. In benchmarking, this turns out to be only about two times faster than the HP RISC processor on the old hardware.
The total cost of one server, including 512MB RAM, was about $10,000. We were able to build a brand-new server with eight times the throughput on the SCSI channel and twice the processor speed for the same cost as upgrading the SCSI subsystem on the existing server. We had conservatively estimated a five to tenfold improvement in performance over the old hardware. The true test, though, is always how a system performs in a live production environment. We were not disappointed. Several report programs that, on the old hardware, would run in the background for 45 to 80 minutes now finish in six minutes or less. One lab supervisor literally ran into the IS area and asked, “What's going on with the LIMS? It is flying!”
The project was not as simple as just building a Linux server. As I mentioned in the beginning, the original plan was to upgrade to Oracle version 7.3.4 on the HP hardware. In order to run Oracle on the Linux OS, we were forced to go to version 8 of the Oracle database. There were no real issues installing Oracle 8 on Linux, with the exception of a minor patch that had to be installed to get it recompiled. The main problem we found with Oracle8 was a newer “rowid” concept it used.
The software conversion to Oracle*Forms 4.5 had to happen regardless of the hardware platform, since version 3 was no longer supported by version 7.3 and later of the database. We opted to use a character-based version of the Oracle*Forms 4.5. To switch to an event-based GUI interface would have meant a wholesale restructuring of the 100+ Forms programs. At first, most of the problems were syntax errors, things that were valid in Forms Version 3 but no longer legal in the new Forms version. A few naming conventions changed, and we happened to have used names for objects (fields/triggers/variables) which where not allowed under the new Forms. Then came problems with “integrating” all of the forms back into the LIMS product. For example, many of the Forms check to see which form/menu called them, and thus operate in a different mode. The function to return the calling name returned it in lower case under Version 3, and upper case in Version 4.5, so the IF statement failed. We added the lower(xxxxx) function to force it to lower case. Also, there are a number of forms that get called from others as pop-ups, and many of the screen layouts were messed up.
We also encountered problems with the HP character mode terminal emulation, which caused the Oracle/HP resource file to become corrupted. Eventually, we had to switch to VT220 emulation, and ran into some issues with the terminal emulation. The VT220 terminals had a strange kind of “screen” memory. Two text screens could be displayed, and a user switched between these by sending codes to the screen. The terminal emulator added a vertical scroll bar, making it possible to scroll back to see what was previously on the screen. This kept messing up scroll regions on the screen, which is what vi and Oracle use to show you a pop-up window. The emulator vendor, Minisoft (http://www.minisoft.com/), was very responsive and corrected the incompatibilities we encountered using their VT220 product.
An additional software complication is that a number of the functions in the Forms code are actually Pro-C routines called from inside the Forms session, known as “User Exits”. These functions were written specifically to run on HP-UX. Because of the complexity of deciphering, converting and testing the User Exits, and the immovable nature of the January 1 deadline, Michael proposed maintaining the HP-UX server to act as a client-side server for the Forms programs and to have it communicate to the database using Oracle SQL*Net. Our ultimate goal is to eliminate the user exits and have them written in native SQL and to convert the character-based interface to a GUI interface. This will make it possible to run three-tiered client-server on all Linux servers, eliminating our aging HP-UX hardware.
I have already discussed one measure of success of the project, the dramatic improvement of the speed of the system. Another measure that must be considered is up time. For people who question whether Linux should be used in a mission-critical application, our project is proof that Linux is up to the task. We installed the first server in August of 1999 and the other two followed one month apart, so one server has been running for five months, one for four months and one for three months. In those twelve months of operation, we have had absolutely no down time due to the Linux operating system. Several issues occurred with the Oracle database configuration, but that is not unexpected in any new installation. We also had an issue with a network interface card on one of the servers, but we replaced it with a Linux-supported card and the issue has not resurfaced.
While rereading this article, it occurred to me that there has been very little mention of Linux. In reality, that has been the case. The true measure of an effective and robust operating system is that it should be fairly transparent to the operation, and that has certainly been true in our application. Without the option of moving to a Linux operating system, our options were either to spend five times as much on new RISC systems, or limp along with much less than optimum performance at a cost equal to what we ended up spending for state-of-the-art hardware. In the case of this project, Linux was the lynchpin around which everything else was built, and it has performed beautifully.
Rolf Krogstad (email@example.com) is Director of Information Services for Pace Analytical Services, Inc., in Minneapolis, MN, with over 18 years experience in the industry. He is a retired violinist, having performed professionally in symphony orchestras in Vienna, Austria and Mexico before becoming a programmer.