Linux in the Real World
This article describes an exciting and important new system that is being built on Virginia Power's Linux platform: a virtual SCADA (Supervisory Control and Data Acquisition) system that will provide a cost-effective and flexible alternative to traditional SCADA connectivity. Whereas last month's story was an epic software adventure with a cast of several, this month's story is more of a documentary of the additions being built onto our Linux house. As this is being written, the frame is up, the roof is on, and the drywall is in place. However, the joists are still visible and a good bit of plumbing has yet to be added. Don't worry: with virtual hard hats in place and source code hammers handy, we should be able to visualize the finished rooms easily. (Besides, as you read this, the system is finished. Magazines are wonderful time machines.)
But first, for those of you who may have missed last month's article, a little foundation on SCADA. As far as electric utilities are concerned, SCADA means the retrieval of real-time analog and status data from various locations in the service territory through remote terminal units (RTUs) installed in substations. This information is obtained by central master computers, where it is stored, analyzed, and presented to system operators who are responsible for maintaining the integrity and reliability of the transmission and distribution grid. When necessary, these operators can also remotely operate field devices like line breakers and capacitor banks by sending control commands from the master computer out to the RTUs. The master computers themselves even contain feedback algorithms that automatically operate some devices based on system conditions.
At Virginia Power, the standard medium of communication between RTUs and SCADA master computers is the dedicated serial line, often leased from the local phone company. Several RTUs can be multi-dropped off a single dedicated line (up to 16, a limitation imposed by our currently-used SCADA protocol), but geographical limitations tend to prevent as much sharing of dedicated lines as might be desirable. At the risk of over-simplifying, we can imagine one dedicated line per RTU, giving us a traditional SCADA system something like that shown in Figure 1 (see below).
The advantages of a dedicated connection are pretty obvious: constant data availability and quick response when system conditions require a control action of some sort (such as opening or closing a breaker or capacitor bank). In the case of generation stations or large, high-voltage substations, any other type of monitoring is unthinkable.
Yet, there are other likely monitoring sites, often in remote locations (the exact technical phrase is “in the middle of nowhere”) which are not quite so high profile (or high pressure). As a matter of fact, in a data acquisition sense, these potential sites are downright prosaic: two or three analog points, a couple of status points, and perhaps a single control point. Such modest monitoring needs don't justify the constant watchfulness a dedicated serial line provides, but the information does need to be retrieved; the control capabilities do need to be available when needed.
Over the years some partial solutions have been implemented. In many cases, intelligent electronic devices such as digital relays can monitor a small number of analog and status devices and supply sufficient control capabilities. These relays usually implement a simple serial-based protocol; installed in remote sites along with modems, they can be interrogated from the SCADA master computer centers using stand-alone PC packages, resulting in a hybrid SCADA system as shown in Figure 2 (see above).
Such a hybrid system, while it may provide (in one form or another) all necessary data and control capabilities, fails to provide the system operators with a unified picture of the system they are operating. These folks have enough responsibility on their hands without having to run through mental gyrations along the lines of: “I'd better check the voltages in the Berry district. Oops—the Cranberry substation has to be dialed, so I'll just walk over here to the dialup PC and...oh, heck. Ted's using the PC to dial the Mineral Water substation! Guess I'll try later...”
Of course, a second dialup PC could be purchased and then a third and a fourth and—Whoa—just a minute! What about those eminently reliable and flexible Linux systems in each and every SCADA master computer center? (I know you saw this coming.) Not only are those Linux systems handling averaged analog data for the Asset Management database system, but they are also an eminently reliable and flexible dialing subsystem (I blush to admit this, but there you are) which can potentially talk to any type of device with a byte-oriented protocol!
The dialing subsystem has no set limit on the number of phone lines it can handle. If some means could be found to move data back and forth between the SCADA master computers and the Linux systems, a more ideal SCADA system could be constructed, as shown in Figure 3 (see below). This would provide the system operators with a unified overview of their system—all information would be present in the SCADA master computer. Some of it would be retrieved via traditional means (dedicated lines) and the rest would be obtained via dialup connections through the Linux systems.
As you can probably guess, this last approach is pretty much the one we're taking, although a few sobering, but fortunately not insurmountable, realities of the Real World have intruded:
Our SCADA master computers are older machines which are approaching the end of their digital careers. There are no spare processor cycles (or memory bytes) for any kind of special programs to accommodate talking with our Linux systems. In fact, the only feasible way to move data between our Linux systems and our SCADA computers is by using the same protocol as is used to communicate with our RTUs. Alas, this protocol is more than a little antiquated and uses special-purpose modems and encoding firmware.
Dialing devices and retrieving data is all well and good, but sometimes system operators need to monitor data points continuously for a certain period of time.
When operators perform control actions on remote devices, they need to see immediate feedback to determine the success or failure of the controls. Some actions can potentially affect several different data points, and these need to be updated in real time until the operators are satisfied as to the results of their control actions.
With regard to the first item, we are lucky enough to have available an RTU platform for which our group develops field-resident applications, such as closed-loop feedback controls and protocol translators for IEDs. This platform, obviously, contains all of the requisite firmware and hardware for talking to our SCADA master computers and is fully programmable (in C, thank goodness). Stripped of all unnecessary peripheral hardware and loaded with a simple byte-oriented protocol to talk to our Linux systems over a null modem cable, this programmable RTU functions quite handily as a translator box: status and analog data can be delivered from dialup devices to the SCADA computer, and control requests can be delivered from the SCADA computer to the Linux system for appropriate action. Of course, all the SCADA computer knows is that it is scanning another RTU. The result is a slightly tempered ideal system as shown in Figure 4.
At this point, I'd like to mention a few details of software carpentry which will be important when we discuss the remaining items in our Real World reality list. The database of the translator box (i.e., the stripped-down programmable RTU which talks to the SCADA master computer) is organized as a set of arrays of data structures—one array for status points, another for analog points, etc. On the Linux side, a corresponding set of shared-memory partitions mirrors the arrays of data structures in the translator box—one partition for status points, another for analogs, etc. A daemon process in Linux talks to a counterpart process on the translator box and ensures that the corresponding instances of structure arrays remain consistent and up-to-date. This update process runs every few seconds.
“Every few seconds” may sound a trifle vague in connection with real-time data processing, but SCADA activity tends toward the leisurely side of real-time processing: RTUs are scanned once every 2 to 30 seconds, and contact closures during control actions may last on the order of several hundred milliseconds to a second or two. So even though Linux (like any standard Unix system) is not, strictly speaking, a real-time system, it is more than responsive enough for the scale of real-time processing with which we're concerned.
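For readers who like to see the lumber before the paint goes on, here is a minimal sketch of the mirroring idea: one flat partition of fixed-size analog records on each side, and an update pass that copies across only the records that differ. The record layout (point id, value, flags), the function names, and the use of Python are all assumptions made for illustration; the real translator box is programmed in C, and its actual structures and wire protocol are not described in this article.

```python
import struct

# Hypothetical analog record: 16-bit point id, signed 16-bit value, 8-bit flags.
# "<" means little-endian with no padding, so each record is exactly 5 bytes.
ANALOG_RECORD = struct.Struct("<HhB")


def make_partition(num_points):
    """Allocate a zeroed buffer that holds num_points analog records."""
    return bytearray(ANALOG_RECORD.size * num_points)


def write_point(partition, index, point_id, value, flags=0):
    """Store one record at the given array index."""
    ANALOG_RECORD.pack_into(partition, index * ANALOG_RECORD.size,
                            point_id, value, flags)


def read_point(partition, index):
    """Return (point_id, value, flags) for the record at the given index."""
    return ANALOG_RECORD.unpack_from(partition, index * ANALOG_RECORD.size)


def sync(source, mirror):
    """Copy every record that differs from source into mirror.

    Both partitions are assumed to be the same length.
    Returns the list of record indexes that were updated.
    """
    size = ANALOG_RECORD.size
    changed = []
    for index in range(len(source) // size):
        start = index * size
        if source[start:start + size] != mirror[start:start + size]:
            mirror[start:start + size] = source[start:start + size]
            changed.append(index)
    return changed
```

In this sketch, the daemon would call `sync` every few seconds in each direction: Linux to translator box for status and analog data, translator box to Linux for control requests.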
Well, now—the important points to remember are these: A change to data in a shared memory segment in the Linux system will show up in the translator box, where it will be picked up and scanned by the SCADA master computer, eventually showing up on an operator display. Conversely, an operator-control action will change data in the translator box, which will show up on the Linux side and ring a (virtual) bell to cause some action to take place. From now on, we'll ignore the translator box and pretend that the SCADA master computer and Linux system are speaking directly to one another.
Which brings us, in a roundabout way, to the second Real World item. As noted before, continuous data monitoring is one of the advantages of dedicated-line connectivity. Simulating dedicated-line access with regular phone lines is obviously why we're calling our new system a virtual SCADA system, and the basic principle is just as obvious: when continuous monitoring is needed for a dialup device, dial up and stay dialed up!
Of course, at any given time, it is only possible to continuously monitor as many dial-up devices as there are available phone lines—but more phone lines can always be added, should the need arise. We're starting with three dial-up serial ports per Linux machine; time and experience will tell if we need to add more. But some complications arise (as always) in the details. For example, what happens if an operator starts continuously monitoring a dial-up device, gets caught up in some other task, and forgets to release the device so the dial-up line can be used for some other purpose? For that matter, how does the operator start and stop monitoring a device in the first place?
To handle these details, each dial-up device has associated with it a number of pseudo status, analog and control points—points which have nothing to do with the data being monitored by the device, but rather are related to the device itself:
A timestamp analog point, showing how old the device data is (i.e., the last time the device was called).
A connection status point, showing whether the device is on-line or not.
A dial-up control point. Toggling this control will cause the device to be dialed and a connection established.
A connect-time analog point, showing how many minutes remain before the device is automatically disconnected.
An add-connect-time control point. Toggling this point will add a fixed number of minutes to the connect-time analog, keeping the device on-line longer.
A disconnect control point, to disconnect from the device immediately.
An additional pseudo-analog point reports the number of available dial-up lines. This analog point is displayed, along with the above-described pseudo points, for all dial-up devices on a SCADA master computer screen, allowing the system operator easy, centralized management of all dial-up devices.
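The pseudo-points listed above amount to a small piece of per-device state plus a shared count of free phone lines. The following sketch shows one way that bookkeeping might look. The class names, the initial connect time, and the 15-minute increment are assumptions chosen for illustration, not values taken from our system.

```python
from dataclasses import dataclass
from typing import Optional

INITIAL_MINUTES = 5     # assumed connect time granted when a device is dialed
ADD_TIME_MINUTES = 15   # assumed increment for the add-connect-time control


@dataclass(eq=False)
class DialupDevice:
    name: str
    connected: bool = False               # connection status point
    last_called: Optional[float] = None   # timestamp analog (seconds since epoch)
    minutes_left: int = 0                 # connect-time analog


class LinePool:
    """Tracks which devices currently hold a dial-up line."""

    def __init__(self, total_lines):
        self.total_lines = total_lines
        self.online = []

    @property
    def available_lines(self):
        """Pseudo-analog: number of dial-up lines still free."""
        return self.total_lines - len(self.online)

    def dial(self, device, now):
        """Dial-up control point. Returns False if no line is free."""
        if device.connected:
            return True
        if self.available_lines == 0:
            return False
        device.connected = True
        device.last_called = now
        device.minutes_left = INITIAL_MINUTES
        self.online.append(device)
        return True

    def add_time(self, device):
        """Add-connect-time control point."""
        if device.connected:
            device.minutes_left += ADD_TIME_MINUTES

    def disconnect(self, device):
        """Disconnect control point: release the line immediately."""
        if device.connected:
            device.connected = False
            device.minutes_left = 0
            self.online.remove(device)

    def tick(self):
        """Call once a minute: count down and drop expired connections."""
        for device in list(self.online):
            device.minutes_left -= 1
            if device.minutes_left <= 0:
                self.disconnect(device)
```

The countdown in `tick` is what answers the forgetful-operator question: a device that nobody extends simply runs out of connect time and gives its line back to the pool.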
As an example, let's replay our hypothetical scenario from a few paragraphs back: “I'd better check the voltages in the Berry district. Lessee, the Cranberry substation has to be dialed, so I'll just poke this control point right here from the comfort of my workstation...”
A little time passes while the device is dialed; the operator stays busy with other things. Then the connect pseudo-status changes state and dings an alarm beeper to attract the operator's attention: “Hmm...Cranberry's online now. I'd better keep an eye on those voltages for half an hour or so. I'll poke this add-time control a couple of times...There we go; now I've got 30 minutes of connect time.”
Okay, so it's not a perfect solution; the operator still has to perform some special actions to get his data, and has to know what's a dial-up device and what's not. But all of these extra activities can be done at the operator's regular workstation. And if dial-up devices are scheduled for periodic interrogation, some of these special actions may not even be necessary: “I'd better check the voltages in the Berry district. Say, it looks like the Cranberry substation was interrogated just 10 minutes ago—recently enough that I can use those values...”
As you might imagine, handling pseudo-points and connection timers involves much delightful software development on the Linux side, some of which is still in the blueprint stages and some needing only a final coat of symbol-stripping paint. The solution to the final item in our list of Real World realities—providing operator controls to dial-up devices—is still, pretty much, in the blueprint stage, but we can at least describe the basic ideas.
The main problem with dial-up device controls is providing sufficient generalization so that control actions are handled in a consistent manner. The standard method for controlling SCADA devices is a three-step select-verify-execute procedure: select the point to be controlled, verify the selection (usually by having the remote device echo the selection back to the master computer), and execute the desired control after a final go-ahead by the operator. The result of a control action is usually determined by monitoring an associated status point or one or more analog points.
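A toy version of the select-verify-execute procedure might look like the sketch below. The `EchoingRTU` class is a hypothetical stand-in for a remote device that echoes selections back, and `operator_confirms` stands in for the operator's final go-ahead; neither corresponds to a real protocol implementation.

```python
class EchoingRTU:
    """Toy remote device that echoes a selection before it will execute."""

    def __init__(self):
        self.selected = None
        self.operated = []

    def select(self, point):
        """Arm a control point and echo the selection back."""
        self.selected = point
        return point

    def execute(self, point):
        """Operate the point, but only if it was selected first."""
        if point != self.selected:
            raise RuntimeError("execute without matching select")
        self.operated.append(point)
        self.selected = None

    def cancel(self):
        """Disarm any pending selection."""
        self.selected = None


def select_verify_execute(rtu, point, operator_confirms):
    """Run the three-step procedure; return True if the control was executed."""
    echoed = rtu.select(point)            # step 1: select
    if echoed != point:                   # step 2: verify the echo
        rtu.cancel()
        return False
    if not operator_confirms(point):      # final go-ahead from the operator
        rtu.cancel()
        return False
    rtu.execute(point)                    # step 3: execute
    return True
```

Note that the function reports only whether the control was sent. As described above, whether it actually worked is determined separately by watching the associated status or analog points.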
Unfortunately, many of the intelligent end devices we are handling using virtual SCADA don't have clear-cut sequences of steps for performing control actions. One device, for example, uses an ASCII-encoded bitmap to select the device and execute the control, all in one step (so much for verification). Another device implements the usual 3-step procedure but with the added onus of sequence numbers to ensure no more than one outstanding control action at a time (actually, not at all a bad idea, but incompatible with our existing SCADA protocol). And there is the obvious prerequisite, that the device to be controlled must be on-line before any control is attempted.
A little poor-man's object-orientation seems to be in order, so we have abstracted the basic elements of a control request and, along the way, added a few more pseudo-points per dial-up device (these additional pseudo-points are displayed on the same screen as all the other device pseudo-points):
A connection-in-progress status point, which toggles true if the associated device is in the process of being dialed.
A control-in-progress status point, which is true if a control is being performed on the associated device.
A control-success status point, showing success or failure of the last control attempted.
The operator can perform a control on any dial-up device control point, just as he does with any other (dedicated-line) control point, with the understanding that his control action is actually a request for the control to be selected, verified, and executed on his behalf at some time in the (near) future. This distinction may seem cosmetic, but it is actually important, from an operational point of view.
Here's the general sequence of events: The operator controls a dial-up device control point (using the usual 3-step procedure, since he is communicating his request to the Linux system using the regular SCADA protocol), which toggles a database point in the Linux system, alerting the system that there's work to be done. The system sets the control-in-progress status point to ensure that only one control request per device is outstanding at a time. Since the number of separate control points per dial-up device is small, this restriction should not pose a problem.
If the device to be controlled is not on-line, it is dialed and a connection is established (the connection-in-progress status point allows monitoring of this process). If the device is already on-line, a set amount of time is added to its connect-time analog to allow for completion of the control request.
Device-specific software, knowing all the secrets for successful control actions on the device, performs the control requested by the operator, and reports the general success or failure through the control-success status point. The device remains on-line until its connect-time analog counts down to zero, allowing the operator an opportunity to observe any associated analogs or status points to verify that the control action has the desired effect.
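Since this part is still on the drawing board, the following is only a sketch of how the sequence just described might be wired together. The `dial` and `perform_control` callables are hypothetical placeholders for the dialing subsystem and the device-specific software, and the minute values are assumptions.

```python
from dataclasses import dataclass

INITIAL_MINUTES = 5         # assumed connect time for a freshly dialed device
CONTROL_EXTRA_MINUTES = 5   # assumed time added when the device is already on-line


@dataclass
class DeviceState:
    online: bool = False
    minutes_left: int = 0                  # connect-time analog
    connection_in_progress: bool = False   # pseudo status point
    control_in_progress: bool = False      # pseudo status point
    control_success: bool = False          # pseudo status point


def handle_control_request(state, dial, perform_control, point):
    """Carry out one operator control request on the operator's behalf.

    dial() -> bool and perform_control(point) -> bool are hypothetical
    callables supplied by the dialing subsystem and the device driver.
    Returns True if the control was performed successfully.
    """
    if state.control_in_progress:
        return False                       # one outstanding request per device
    state.control_in_progress = True
    try:
        if not state.online:
            state.connection_in_progress = True
            try:
                state.online = bool(dial())
            finally:
                state.connection_in_progress = False
            if not state.online:
                state.control_success = False
                return False
            state.minutes_left = INITIAL_MINUTES
        else:
            state.minutes_left += CONTROL_EXTRA_MINUTES
        # The driver knows the device's own select/execute quirks.
        state.control_success = bool(perform_control(point))
        return state.control_success
    finally:
        state.control_in_progress = False
```

Passing the driver in as a callable is the poor-man's object-orientation at work: the request handling stays the same for every device, and only `perform_control` needs to know whether the device wants a bitmap, sequence numbers, or something stranger.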
Well, we've just about reached the end of our tour of this latest addition to our network of Linux systems, and I hope you've gotten a good idea of what the finished rooms will look like and the wonderful view we'll have of the increased efficiency and reliable operation of our SCADA systems. But the most remarkable feature of this new system hasn't yet been mentioned, although it has been implied in everything discussed so far.
Linux has become an integral, accepted part of the toolkit we use to craft solutions for the division and planning personnel who come to our group with problems and needs. During the design of our virtual SCADA system, no one suggested using some “other” operating system platform or questioned whether Linux would have enough horsepower to handle the new demands that would be placed upon it. A year of stellar, faultless Linux performance as our data-collection front ends has turned skepticism to happy acceptance and transformed the phrase “that PC Unix” to “our Linux systems.” Folks I've never met who work in our company call up with Linux questions because they've heard good things about our systems.
Oh, we still have a skeptic or two—I'm sure we always will. But the surest way I've found to get them off my back, after they've expounded on the next release of “Ontario” or “Pookeepsie 96” or “Shangra-La”, is to cough politely and reply, “Well, Linux does that right now. And it works. Right now. See?”
The ones that come back, I tell 'em how to get a good CD-ROM distribution. One more happy Linuxer can't hurt!
Vance Petree (vpetree@infi.net) Although he began adulthood as a music composition major, Vance soon found computers a more reliable means of obtaining groceries. He has been a programmer for Virginia Power for the past 15 years, and lives with his wife (a tapestry weaver, which is a lot like programming, only slower) and two conversant cats in a 70-year-old townhouse deep in the genteel stew of urban Richmond, VA.