Linux Network Programming, Part 3

This month we are presented with an introduction to the networking of distributed objects and the use of CORBA.

In the last few articles in this series, we dealt with basic low-level network programming in Linux, and with the issues involved in developing network servers (daemons). However, coding at this low level is extremely tedious and prone to error as a result.

Nevertheless, distributing an application over the network provides significant benefits over a monolithic, stand-alone structure—and these benefits make the additional development investment (in terms of time and money) worthwhile.

In this article, we will introduce an approach intended to reduce the added complexity in writing a network application by applying object-oriented techniques to the development and coding—namely, the use of the Common Object Request Broker Architecture (CORBA).

In the next article, we will present different software packages, available for Linux, which implement the CORBA standards. In addition, a basic introduction to developing applications with CORBA will be given.

Note that this article assumes a basic familiarity with C, C++, object-oriented development, the BSD socket API and RPCs. It presents the key concepts of CORBA which need to be understood before developing applications. These concepts, and the associated terminology, comprise much of the learning curve of CORBA. This article is necessarily theoretical in order to develop a vocabulary to be used when developing with CORBA.

The Benefits of Distributed Computing

The motivations for structuring a system as a set of distributed components include:

  • An increased ability to collaborate through better connectivity and networking

  • The potential to increase performance and efficiency through the use of parallel processing

  • Increased reliability, scalability and availability through replication

  • A greater cost-effectiveness through the sharing of computing resources (for example, printers, disk and processing power) and the implementation of heterogeneous, open systems

Along with these impressive advantages comes additional complexity to the development process. The programmer now has to handle situations like the following:

  • Network delay and latency

  • Load balancing between the different processing elements in the system

  • Ensuring the correct ordering of events

  • Partial failure of communications links, in particular with regard to immutable transactions (i.e., transactions which must yield the same result if repeated—for example, the accidental processing of a bank transaction twice should not erroneously affect the account balance)

One of the earlier techniques in reducing the programmer's work (to develop these distributed applications) was the remote procedure call (RPC).

Remote Procedure Calls

RPCs have already been introduced in Linux Journal in an excellent article by Ed Petron, “Portable Database Management with /rdb”, October 1997. The motivation behind RPCs is to make network programming as easy as making a traditional function call. Linux provides the Open Network Computing flavor of remote procedure calls (ONC-RPC), which was originally developed by Sun Microsystems (Mountain View, CA) for use in its Network File System (NFS) implementation.

RPCs do nothing more than hide the implementation details of creating sockets (network endpoints), sending data and closing the sockets as required. They are also responsible for converting the (remote) procedure's (local data) parameters into a format suitable for network transportation and again into a format that can be used on the remote host. This network-transparent transmission of data ensures both end machines can interpret the information content correctly.

Quite often, different representation issues are involved in this process (for example, byte ordering, ASCII to EBCDIC, etc.). Additionally, pointers (references to address locations in a virtual memory space) cannot be passed directly from one process/machine to another. Instead, a “deep copy” needs to be performed, copying the data pointed to by the pointer which is then sent “across the wire”. The entire procedure of transferring the data from the native format of a machine into a network-transparent format is called “marshalling”. The reverse procedure is called “de-marshaling”. The ONC-RPC uses a special type of marshaling called external data representation (XDR). Figure 1 shows the sequence of events involved in a remote procedure call. The mechanisms and semantics of ONC-RPCs are discussed in much greater detail in RFC 1831, RFC 1832 and RFC 1833.

Figure 1. Remote Procedure Call (RPC)