LiS: Linux STREAMS
The input/output system in UNIX is far from simple and involves many different modules: networking involves different protocol stages arranged in protocol stacks; terminal I/O involves different “line disciplines” stacked over (perhaps network) devices. All those modules perform some processing on existing I/O data flows.
In Linux and most BSD systems, I/O modules live inside the kernel and the relations between them are more or less hard-wired into the code. As an example, the TCP/IP protocol stack is a carefully programmed set of modules with strong interrelations. It is designed to work well on typical configurations.
STREAMS is a flexible input/output system, initially designed to overcome the inflexibility found in previous UNIX systems (see Resources 1). It is an alternative to sockets and is used in most commercial UNIX versions. Some sort of STREAMS is needed if we ever want networking software from systems such as Solaris, Unixware, etc. to run off-the-shelf on Linux.
A STREAM (see Figure 1) is, in essence, a dynamically configurable stack of modules. Each module does some processing on a data flow as it goes from the device to the user or vice-versa. The user perceives a STREAM as a file. It is handled with the usual open, read, write, ioctl and close system calls. Data written by the user is packaged into messages, which are sent downstream. Data read by the user comes from messages sent upstream by an underlying device driver.
Figure 1. Components of a STREAM
A couple of additional system calls, putpmsg and getpmsg, allow the user to send and receive STREAMS messages directly. Yet another system call, poll, provides an alternate interface for select. Therefore, each STREAM is composed of these elements:
A mandatory STREAM head talks to the user process doing I/O. The head fills the gap between user system calls and the message flow. Thus, a write into the STREAM is handled by the head by sending a message downstream. Conversely, a data message going upstream is used by the head to service read system calls on the STREAM.
A (possibly empty) stack of STREAM modules typically performs some computation on messages passing by and forwards them either upstream or downstream. For example, IP on X.25 encapsulation (IXE) could be implemented as a STREAMS module; IP packets would be (de)encapsulated as they pass by the IXE module. A terminal-line-discipline module is another example; typed characters can be cooked as they cross the line-discipline module. A packet-sniffer module could thus be used as a diagnostic or debugging tool.
A mandatory STREAM driver interconnects the STREAM to the device sitting below it. STREAM drivers can also be software only; for example, a STREAMS driver could be used to implement an SNMP MIB for the kernel, or a driver could be written to emulate the behaviour of a true hardware driver for development purposes.
A nice property of STREAMS is that different modules (or drivers) can be decoupled quite easily. Hence, they could be developed independently by different people who don't know the actual protocol stack where they will be used, provided the interfaces between the various modules and drivers are well-defined. STREAMS includes standard interfaces for the transport, network and data link layers. In addition, modules can be dynamically “pushed” onto (and popped off) the STREAM, which is a very convenient feature.
Finally, special multiplexor drivers allow several STREAMS to be multiplexed into another one (or ones). The ip module in Figure 2 is a multiplexor. In this example, it multiplexes both TCP and IP messages using either an Ethernet driver or an IP-on-X.25 driver. A full STREAMS network can be built (see Figure 2), and many different protocol stacks can be set up dynamically for operation.
Figure 2. A STREAMS Network
Before Linux was available, Dennis M. Ritchie designed STREAMS for the ninth edition of UNIX (see Resources 5). Since then, the STREAMS concept has been improved and revised by different operating systems. Variants ranging from UNIX SVR4 STREAMS (see Resources 1) to Plan 9 Streams (see Resources 3) exist today.
Unfortunately, SVR4 complicated the neat and clean design of the ninth edition's STREAMS mainly to add new features, such as atomic gather/scatter writes and multiprocessor support. Different devices such as pipes and fifos were also re-implemented using STREAMS (see Figure 3).
Figure 3. A STREAMS-Based Pipe
Despite being far more complex than Ritchie's Streams, SVR4 STREAMS simplify the construction of network software. Indeed, most networking software for UNIX System V machines is written using STREAMS (including the socket interface). We wanted to be able to run SVR4 driver software under Linux.
After some of us independently started to develop what would become LiS, we met on c.o.l.a. and decided to coordinate our efforts. An LiS project, LSM, was posted in March 1995 and the project began.
There are several reasons to use STREAMS: standard service interfaces to different layers (DLPI for the data link layer, NPI for the network layer and TPI for the transport layer), flexibility and SVR4 compatibility.
What we like about STREAMS under SVR4 is we can write device drivers conforming to a powerful but not Byzantine API (DLPI or TPI, in particular) and have existing network services (DLPI) or existing “applications” (read non-kernel code) work with the device with no additional effort.
Protocol stages (or modules) can be dynamically added/removed. Imagine you are debugging a transport layer interface (TLI) for X.25, and you can push and pop an x25_tli module many times. That is, it is an open framework. Those modules employed can, of course, be shared and reused in different places. With sockets, you have only what the kernel has.
The bottom line is that standards are a “Good Thing”. In the era of distributed systems, this applies equally to kernel-level network and communication interfaces. The STREAMS framework, APIs and service interfaces were designed by intelligent people at AT&T Bell Labs. The result is a mechanism which is clean, comprehensive and elegant to boot.
It has been argued against STREAMS that they are too complicated and too slow when compared to BSD sockets. A related argument is that TCP/IP networking is done more efficiently with BSD-style protocol stacks.
Consider this: Linux TCP/IP networking code can be used as-is with LiS. The purpose of LiS is simply to have the STREAMS framework available, not to replace the Linux TCP/IP protocol stack. Existing network software is perceived by STREAMS users as the dashed box in Figure 2. Fake modules interfacing with existing Linux drivers and protocol stacks are all we need.
With respect to simplicity, STREAMS make certain things, such as “deep” protocol stacks, more simple when compared to sockets. Sockets were designed for implementing TCP/IP-type networking and, although simple, are not extensible. That is, you can't easily use the sockets mechanism to build deep protocol stacks of which the sockets have no built-in knowledge. Each time the Internet protocol suite needs another layer, some hack is likely to be made to sockets.
The Internet protocol suites are deeper than the kernel implementors like to think they are. Consider TCP/IP when IP is sent using X.25 packets, transmitted with LAPB frames through a driver. Now you have a TCP/IP <-> X.25 <-> LAPB <-> driver stack. Then add another protocol over TCP/IP (say, NFS) and interpose frame relay (FR). The stack becomes: NFS <-> TCP/IP <-> X.25 <-> LAPB <-> FR <-> driver. Sockets are not equipped to build protocol stacks such as these that were not originally designed into it. It can be done, but is much easier and cleaner with STREAMS.
SVR4 STREAMS and LiS are much more complex than ninth edition STREAMS, but the added complexity is mostly to the STREAMS implementation and is hidden from the driver or module programmer.
Most UNIX features have been available in source form for people to read and use. STREAMS was a notable exception. Therefore, even though we could have designed LiS to support just the STREAMS interface, we also tried to follow its design. If SVR4 STREAMS code had been available, the project could have been reduced to a simple port. As a result, the design guide was a mixture of a couple of books showing how STREAMS work (see Resources 2 and 4).
Availability of source code for the Linux kernel was crucial, as LiS requires small changes to existing kernel subsystems.
Starting from scratch, our aim was to make LiS portable so that other people could avoid rewriting it for use on different systems. By replacing a single small module, the whole framework can be ported to different operating systems. LiS portability is demonstrated by the fact that Gcom has ported it to QNX (a UNIX flavor). Ports for BSD UNIX system and even NT could be done without much effort in the future.
A complete STREAMS description would be too large for this article. Some books you might read to learn more about STREAMS can be found in Resources. In a few words, LiS features include:
support for typical STREAMS modules and drivers
ability to use binary-only drivers
convenient debugging facilities
Many similarities exist between the implementation of LiS and SVR4 STREAMS. This is because initial project members followed the “Magic Garden” (see Resources 2) as a design guideline. Current maintainers were also heavily influenced by SVR4 STREAMS, because they had been writing STREAMS drivers for SVR4 since 1990. Thus, the stream head structure, queue structure, message structure, etc., follow the SVR4 model.
Differences between the two do exist. SVR4 disallows STREAMS multiplexors to use the same driver at more than one level of the stack. For example, if we had a STREAMS multiplexor driver called “DLPI” and another called “NPI”, the SVR4 STREAMS would disallow the stack: NPI(SNA) <-> DLPI(QLLC) <-> NPI(X.25) <-> DLPI(LAPB). LiS allows these combinations, since we could see no harm in such configurations.
The configuration file used for LiS is modeled after the SVR4 sdevice and mdevice files. However, LiS syntax is different and combines into a single file the functions that SVR4 used two files to specify. The LiS build process (Makefiles) allow individual drivers to have their own config file. They all get combined into one master config file, which is then used to configure LiS at build time.
In SVR4, the STREAMS executive is a linkable package for the kernel. It is not hard-wired into the kernel. With LiS, the STREAMS executive is actually a runtime, loadable module of the kernel, one step more dynamic than SVR4 STREAMS.
A quick overview of the LiS implementation would reveal a STREAM as a full-duplex chain of modules (see Figure 4). Each one consists of a queue pair: one for data being read and another one for data being written. Each module has several data structures providing those operations (i.e., functions) needed, as well as statistics and other data.
Figure 4. Queues in a STREAM
Module operations are provided by the programmer and include procedures used to process upstream and downstream messages. Messages can be queued for deferred processing, as LiS guarantees to call service procedures when queued messages could be processed.
Most of the LiS implementation deals with these queues and also with the message data structures used to send data through the STREAM. Messages carry a type code and are made of one or more message blocks. Only pointers to messages are passed from one module to the next, so there is no data copy overhead.
The head of the STREAM is another interesting piece of software. In Figure 5, you can see how it is reached from the Linux VFS (Virtual File System) layer which interfaces the kernel with the file systems. Note that even though Linux does not have a clean and isolated VFS layer, Linux i-nodes are v-nodes in spirit and its file system layer can be considered to be a VFS. For an actual description of the implementation, read Chapter 7 of the “Magic Garden” (Resources 2).
Figure 5. The STREAM Head
LiS also makes provision for linking with binary-only drivers. This allows companies such as Gcom which have proprietary drivers to port their driver code to LiS and distribute binaries. This is an important feature if we expect companies to port their existing SVR4 STREAMS drivers to LiS. The more of these available, the more the Linux kernel functionality is enhanced.
LiS debugging features are especially convenient and show another departure point from SVR4.
Of course, these facilities include some general-purpose debug utilities such as message printers, but also included are significant aids that can really help with debugging, such as the ability to selectively trace; for example, getmsg calls.
The memory allocator keeps file and line numbers close to allocated memory areas. Combine that with the ability to print out all the in-use memory areas, and you have a tool for finding memory leaks in your drivers.
Usage statistics are designed to help, not overload the user with unnecessary information. The streams command prints out a good deal of useful information about LiS operation. There is even a debug bitmap to cause LiS to trigger different debug facilities. One of them is the ability to time various operations using the high-resolution timer. Thus, the user can get fine-grain driver timings for those drivers using LiS tools with no extra code in the driver.
Last but not least, LiS allows module debugging in user space by emulating the whole STREAMS framework. A module can be easily developed in user space and then downloaded into the kernel when it works. That is achieved by a “port” of LiS which runs in user space on Linux (in a dummied-up manner).
STREAMS modules can be tested by surrounding them with test modules and then driving known sequences of messages through the module under test. The LiS loop driver is suitable to place below the driver being tested, as it behaves like a simple echo server. The stream head may very well be all that is needed above.
The whole TCP/IP stack can be reused; thus, TCP/IP performance with STREAMS is a non-issue. LiS comes with an adapter driver that fits below standard Linux IP and interfaces off to STREAMS drivers using DLPI. Gcom uses this to interface their STREAMS-based Frame Relay and (soon) X.25.
Also, a contributed driver that will be distributed with LiS (sitting in Dave's inbox as of this writing) sits on top of any Linux MAC (mandatory access control) driver as a client and presents a DLPI interface above. Gcom will probably use this driver to interface its SNA (system network architecture) to the Linux token-ring driver.
LiS is licensed using the GNU Library Public License so that companies can port their existing SVR4 proprietary STREAMS drivers to LiS and use them in Linux without having to publish their source code. This is important if we are to encourage companies to support Linux with their “family jewels” products.
It would help if support needed to run LiS could be included in the mainstream kernel. We are referring mainly to the new system calls and other small hooks, not to LiS itself. This support would make it easier for people to download LiS and install it without having to patch the kernel.
Graham Wheeler (firstname.lastname@example.org) obtained his Ph.D. in computer network performance analysis at the University of Cape Town in 1996. He subsequently spent conciderable time developing STREAMS device drivers and modules for protocol translation to enable a number of financial institutions to connect to PayNet, a large electronic commerce payment clearing center. He is a founder and technical director of Citadel Data Security, specializing in Internet firewall and Virtual Private Network software development.
Francisco J. Ballesteros (email@example.com) his Ph.D. in Computer Science in 1998 at the Technical University of Madrid (Spain). He is currently teaching and doing research on distributed and adaptable operating systems at Carlos III University of Madrid in strong cooperation with the Systems Software Research Group of the University of Illinois at Urbana-Champaign.
Denis Froschauer was a significant contributor to LiS development during its early implementation stage.
David Grothe (firstname.lastname@example.org) is president of Gcom, Inc. Gcom produces data communications protocol stacks for UNIX systems, including Linux. Mr. Grothe founded Gcom in 1979 after working for the company now known as Advanced Computer Communications (ACC) where he wrote his first implementation of X.25 in 1977. Prior to that, he was a professional programmer at the University of Illinois Urbana-Champaign.