Linux Network Programming, Part 1

This is the first in a series of articles about how to develop networked applications using the various interfaces available on Linux.
Creating the Corresponding Client

The client code, shown in Listing 2, is a little simpler than the corresponding server code. To start the client, you must provide two command-line arguments: the host name or address of the machine the server is running on and the port number the server is bound to. Obviously, the server must be running before any client can connect to it.

In the client example (Listing 2), a socket is created as before. The first command-line argument is initially assumed to be a host name for the purposes of finding the server's address. If the lookup fails, the argument is then assumed to be a dotted-quad IP address. If this also fails, the client cannot resolve the server's address and will not be able to contact it.
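The two-step lookup described above might be sketched roughly as follows. This is not the code from Listing 2: resolve_server() is a hypothetical helper name, and the sketch assumes IPv4 only. It uses gethostbyname() to match the style of the article; newer code would typically use getaddrinfo() instead.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>

/* resolve_server() is a hypothetical helper, not taken from Listing 2.
 * Returns 0 and fills *out on success, -1 if both attempts fail. */
int resolve_server(const char *name, struct in_addr *out)
{
    struct hostent *hp = gethostbyname(name);

    /* First attempt: treat the argument as a host name. */
    if (hp != NULL && hp->h_addrtype == AF_INET && hp->h_addr_list[0] != NULL) {
        memcpy(out, hp->h_addr_list[0], sizeof(*out));
        return 0;
    }

    /* Second attempt: treat the argument as a dotted-quad IPv4 address. */
    if (inet_pton(AF_INET, name, out) == 1)
        return 0;

    /* Neither worked: the server's address cannot be resolved. */
    return -1;
}
```

The address written to *out is already in network byte order, so it can be copied directly into the sin_addr field of a sockaddr_in structure.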

Having located the server, an address structure is created for the client socket. No explicit call to bind() is needed here, as the connect() call handles all of this.

Once connect() returns successfully, a full-duplex connection has been established. Like the server, the client can now use read() and write() calls to receive and send data on the connection.
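As a rough illustration of the connection step, the sketch below creates a socket, fills in the server's address and calls connect() without any bind(). The name connect_to_server() and its parameters are assumptions made for this example and do not come from Listing 2.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* connect_to_server() is a hypothetical helper.
 * server_ip is in network byte order; port is in host byte order.
 * Returns a connected descriptor, or -1 on failure. */
int connect_to_server(struct in_addr server_ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);   /* the port must be in network order */
    addr.sin_addr = server_ip;

    /* No bind() here: the kernel picks the local address and port
     * as part of connect(). */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* ready for read() and write() */
}
```

The caller owns the returned descriptor and should close() it when the conversation with the server is finished.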

Be aware of the following points when sending data over a socket connection:

  • Sending text is usually fine. Remember that different systems can have different conventions for the end of line (e.g., Unix uses \012, whereas Microsoft uses \015\012).

  • Different architectures may use different byte-ordering for integers etc. Thankfully, the BSD guys thought of this problem already. There are routines (htons and ntohs for short integers, htonl and ntohl for long integers) which perform host-to-network order and network-to-host order conversions. Whether the network order is little-endian or big-endian doesn't really matter. It has been standardized across all TCP/IP network stack implementations. Unless you only ever pass characters across sockets, you will run into byte-order problems if you do not use these routines. Depending on the machine architecture, these routines may be null macros or may actually be functional. Interestingly, a common source of bugs in socket programming is forgetting to use these byte-ordering routines when filling in the address and port fields of the sockaddr structures. Perhaps it is not intuitively obvious, but this must also be done when using INADDR_ANY (i.e., htonl(INADDR_ANY)).

  • A key goal of network programming is to ensure processes do not interfere with each other in unexpected ways. In particular, servers must use appropriate mechanisms to serialize entry through critical sections of code, avoid deadlock and protect data validity.

  • You cannot (generally) pass a pointer to memory from one machine to another and expect to use it. It is unlikely you will want to do this.

  • Similarly, you cannot (generally) pass a file descriptor from one process to another (non-child) process via a socket and use it straightaway. Both BSD and SVR4 provide different ways of passing file descriptors between unrelated processes; however, the easiest way to do this in Linux is to use the /proc file system.
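To make the byte-ordering point from the list above concrete, here is a small sketch. The helper names make_listen_addr(), put_u32() and get_u32() are invented for this example. The first fills in an address structure using htons() and htonl(INADDR_ANY); the other two move a 32-bit integer into and out of a buffer in network order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: address for a server listening on every local
 * interface. port is given in host byte order. */
struct sockaddr_in make_listen_addr(uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);               /* short: htons */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* long: htonl  */
    return addr;
}

/* Hypothetical helper: store a 32-bit value in network byte order. */
void put_u32(unsigned char *buf, uint32_t value)
{
    uint32_t wire = htonl(value);

    memcpy(buf, &wire, sizeof(wire));
}

/* Hypothetical helper: read back a value stored by put_u32(). */
uint32_t get_u32(const unsigned char *buf)
{
    uint32_t wire;

    memcpy(&wire, buf, sizeof(wire));
    return ntohl(wire);
}
```

Whatever the host architecture, the buffer filled by put_u32() holds the most significant byte first, so a peer that calls get_u32() recovers the original value.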

Additionally, you must ensure that you handle short writes correctly. Short writes happen when the write() call only partially writes a buffer to a file descriptor. They occur due to buffering in the operating system and to flow control in the underlying transport protocol. Certain system calls, termed slow system calls, may be interrupted. Some may or may not be automatically restarted, so you should explicitly handle this when network programming. The code excerpt in Listing 3 handles short writes.
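One common way to cope with short writes and interrupted calls is to loop until the whole buffer has been sent. The sketch below is written for illustration and is not a copy of Listing 3; write_all() is an assumed name.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* write_all() is a hypothetical helper: keep calling write() until all
 * len bytes are written. Returns len on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n <= 0) {
            if (n < 0 && errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;      /* real error, or no progress possible */
        }
        p += n;             /* short write: advance past what was sent */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

A matching loop around read() is needed on the receiving side, since a single read() may also return fewer bytes than requested.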

Using multiple threads instead of multiple processes may lighten the load on the server host, thereby increasing efficiency. Context-switching between threads (in the same process address space) generally has much less associated overhead than switching between different processes. However, since most of the slave threads in this case are doing network I/O, they must be kernel-level threads. If they were user-level threads, the first thread to block on I/O would cause the whole process to block. This would result in starving all other threads of any CPU attention until the I/O had completed.

It is common to close unnecessary socket file descriptors in child and parent processes when using the simple forking model. This prevents the child or parent from potential erroneous reads or writes and also frees up descriptors, which are a limited resource. But do not try this when using threads. Multiple threads within a process share the same memory space and set of file descriptors. If you close the server socket in a slave thread, it closes for all other threads in that process.

______________________

Comments

Nazgob's picture

Great tutorial. Thx. I also recommend Beej's tutorial.

REQ:multiple client - server communication

Anonymous's picture

Good explanation for starters. I have a question: how is the server able to maintain communication with multiple clients? How does the server identify that a particular message has come from a particular client?

help me out!

answer

swaroop's picture

hi,
you have asked a nice question.

A) When multiple clients connect to a server, the server first calls listen() on a socket and then accepts connections from clients. Each accept() creates another socket for that client, while the original listening socket remains available for future connections. Each of these sockets behaves like a file descriptor, which gives you a way of serving multiple clients.

You also asked how the server identifies which client a message came from. This is done by your OS (operating system), which maintains a table in the kernel recording which client is connected to which server socket.
