Improving Network-Based Industrial Automation Data Protocols

TCP/IP breaks into industrial automation, but not without some problems.
TCP Versus UDP

TCP and UDP are very dissimilar data protocols. To review, TCP is connection-oriented, stream-oriented, with guaranteed delivery and guaranteed order. UDP carries unreliable packets of a limited length, it's connectionless, packet-oriented and arrival order is not guaranteed. Routers may be configured to drop UDP packets to unknown ports (fairly unlikely these days).

The traditional serial data bus is more like UDP in its lack of delivery guarantee and its connectionless. The arrival order, however, is something to cause concern. Also, the packet orientation of the received data may cause excess data to be dropped if the receiving buffer isn't large enough.

TCP is similar to the traditional bus in that data arrives in guaranteed order and appears in a stream-oriented fashion. TCP differs in that each socket is a private virtual connection to each automation device, and guaranteed delivery means the IP stack will retransmit the message if it isn't acknowledged by the receiver.

Between UDP and TCP, other than a shared ability to transmit and receive data, there isn't a lot of similarity between them. This may cause problems when trying to create a data protocol common to both of these IP protocols.

UDP Implementation

Returning to UDP, let's make an assumption that each UDP command and response stays below a network MTU. This would guarantee that the data is encapsulated in one packet. If two packets arrived at the receiver, the receiver API will not concatenate the two packets together. Rather, the first packet would be received and the second packet would move on to the next receiver. But, should the application request less data than is contained in the packet, the remaining data will be dropped. In the UDP model, it is possible for the receiver to know the beginning and end of a data segment.

Furthermore, due to the lack of sequence preservation, the first packet may have been the second packet sent. In UDP, if the data protocol can have multiple and simultaneous data packets sent and received, an application layer sequence ID is required to allow re-ordering of the data packets.

The connectionless nature of UDP means that if the automation device was power cycled, the polling master may not be able to tell it occurred. This is no different than traditional serial devices.

TCP Implementation

TCP has a stream model: if two packets arrived and there isn't anything in the data stream to indicate where the packets split, the application won't be able to discern the first packet from the second packet. Some kind of data size is required to allow the receiver to determine the beginning and end of the information in the stream.

TCP is also connection-oriented. Should the automation device be power cycled, the TCP connections are reset. The polling master would then receive a socket error if it attempted to transmit to a device that doesn't have an established connection.

Another consequence of the connection-orientation is the connection must be gracefully closed. This graceful closure requires data packets to be transferred between the polling host and the automation device. Should the polling master close the connection and is some kind of infrastructure failure occurred, the automation device may be left with a permanently open TCP session (a resource leak). If this happens too many times, the automation device may run out of resources.

TCP guarantees arrival, which sounds like a great thing. The dreadnought nature of TCP has one giant Achilles heel--time. TCP generally makes three transmission attempts to receive an acknowledgement from the receiver. The first timeout is short, the next is longer and the last is eternity; most automation scanners won't wait this long. In addition, failed infrastructure, power cycling or a cable removal could cause this problem, which might also tie up resources. Because many automation devices use IP stacks of limited resources, carelessly creating (or leaking) TCP resources may exhaust the automation device's IP resources.

Parallelization

Because traditional serial data streams had high transmission latency, the data packets were kept as small as possible (typically, well under 256 bytes). This small data size only allows a single data write or a data read request to be sent at a time. With significantly larger MTUs, data packets as large a 1,400 bytes could be handled, including a read and a write request simultaneously. This would decrease the number of sends and receives by a factor of two, reducing network traffic and making the polling scanner more efficient.

Also, Ethernet virtually allows parallel data via the collision resolution. However, with switch infrastructure, this parallel model can be exploited further. It's possible that a data read or write request could be broadcasted or multicasted to automation devices; this would allow simultaneous updates with a single send, as well as a parallel read receive with a single read request. This feature isn't possible on devices that don't support transmission resolution.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

SCTP,

Anonymous's picture

I must say, I cant wait for SCTP[RFC 2960] to get in the main kernel.

-http://people.cs.uchicago.edu/~piggy/sctp_refcode/

-http://www.sctp.de

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix