High-Performance Networking Programming in C
TCP/IP network programming in C on Linux is good fun. All the advanced features of the stack are at your disposal, and you can do lot of interesting things in user space without getting into kernel programming.
Performance enhancement is as much an art as it is a science. It is an iterative process, akin to an artist gingerly stroking a painting with a fine brush, looking at the work from multiple angles at different distances until satisfied with the result.
The analogy to this artistic touch is the rich set of tools that Linux provides in order to measure network throughput and performance. Based on this, programmers tweak certain parameters or sometimes even re-engineer their solutions to achieve the expected results.
I won't dwell further upon the artistic side of high-performance programming. In this article, I focus on certain generic mechanisms that are guaranteed to provide a noticeable improvement. Based on this, you should be able to make the final touch with the help of the right tools.
I deal mostly with TCP, because the kernel does the bandwidth management and flow control for us. Of course, we no longer have to worry about reliability either. If you are interested in performance and high-volume traffic, you will arrive at TCP anyway.
Once we answer that question, we can ask ourselves another useful question, “How can we get the best out of the available bandwidth?”
Bandwidth, as defined by Wikipedia, is the difference between the higher and lower cutoff frequencies of a communication channel. Cutoff frequencies are determined by basic laws of physics—nothing much we can do there.
But, there is a lot we can do elsewhere. According to Claude Shannon, the practically achievable bandwidth is determined by the level of noise in the channel, the data encoding used and so on. Taking a cue from Shannon's idea, we should “encode” our data in such a way that the protocol overhead is minimal and most of the bits are used to carry useful payload data.
TCP/IP packets work in a packet-switched environment. We have to contend with other nodes on the network. There is no concept of dedicated bandwidth in the LAN environment where your product is most likely to reside. This is something we can control with a bit of programming.
Here's one way to maximize throughput if the bottleneck is your local LAN (this might also be the case in certain crowded ADSL deployments). Simply use multiple TCP connections. That way, you can ensure that you get all the attention at the expense of the other nodes in the LAN. This is the secret of download accelerators. They open multiple TCP connections to FTP and HTTP servers and download a file in pieces and reassemble it at multiple offsets. This is not “playing” nicely though.
We want to be well-behaved citizens, which is where non-blocking I/O comes in. The traditional approach of blocking reads and writes on the network is very easy to program, but if you are interested in filling the pipe available to you by pumping packets, you must use non-blocking TCP sockets. Listing 1 shows a simple code fragment using non-blocking sockets for network read and write.
Listing 1. nonblock.c
/* set socket non blocking */
fl = fcntl(accsock, F_GETFL);
fcntl(accsock, F_SETFL, fl | O_NONBLOCK);
void
poll_wait(int fd, int events)
{
int n;
struct pollfd pollfds[1];
memset((char *) &pollfds, 0, sizeof(pollfds));
pollfds[0].fd = fd;
pollfds[0].events = events;
n = poll(pollfds, 1, -1);
if (n < 0) {
perror("poll()");
errx(1, "Poll failed");
}
}
size_t
readmore(int sock, char *buf, size_t n) {
fd_set rfds;
int ret, bytes;
poll_wait(sock,POLLERR | POLLIN );
bytes = readall(sock, buf, n);
if (0 == bytes) {
perror("Connection closed");
errx(1, "Readmore Connection closure");
/* NOT REACHED */
}
return bytes;
}
size_t
readall(int sock, char *buf, size_t n) {
size_t pos = 0;
ssize_t res;
while (n > pos) {
res = read (sock, buf + pos, n - pos);
switch ((int)res) {
case -1:
if (errno == EINTR || errno == EAGAIN)
continue;
return 0;
case 0:
errno = EPIPE;
return pos;
default:
pos += (size_t)res;
}
}
return (pos);
}
size_t
writenw(int fd, char *buf, size_t n)
{
size_t pos = 0;
ssize_t res;
while (n > pos) {
poll_wait(fd, POLLOUT | POLLERR);
res = write (fd, buf + pos, n - pos);
switch ((int)res) {
case -1:
if (errno == EINTR || errno == EAGAIN)
continue;
return 0;
case 0:
errno = EPIPE;
return pos;
default:
pos += (size_t)res;
}
}
return (pos);
}
Note that you should use fcntl(2) instead of setsockopt(2) for setting the socket file descriptor to non-blocking mode. Use poll(2) or select(2) to figure out when the socket is ready to read or write. select(2) cannot figure out when the socket is ready to write, so watch out for this.
How does non-blocking I/O provide better throughput? The OS schedules the user process differently in the case of blocking and non-blocking I/O. When you block, the process “sleeps”, which leads to a context switch. When you use non-blocking sockets, this problem is avoided.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- New Products
- RSS Feeds
- Tech Tip: Really Simple HTTP Server with Python
- Epistle
39 min 47 sec ago - Automatically updating Guest Additions
1 hour 48 min ago - I like your topic on android
2 hours 34 min ago - Reply to comment | Linux Journal
2 hours 56 min ago - This is the easiest tutorial
9 hours 10 min ago - Ahh, the Koolaid.
14 hours 49 min ago - git-annex assistant
20 hours 48 min ago - direct cable connection
21 hours 11 min ago - Agreed on AirDroid. With my
21 hours 21 min ago - I just learned this
21 hours 25 min ago
Enter to Win an Adafruit Prototyping Pi Plate Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Prototyping Pi Plate Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- Next winner announced on 5-21-13!
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.




Comments
TCP defects
When talking about TCP defects I think it is important to mention that probably the biggest one is the fact that TCP is most commonly used with HTTP, for which it is not suited very well.
---
Alexander (Sasha) Sirotkin
Metalink Broadband
My blogs:
Blogger
LiveJournal
When talking about TCP
quote:
When talking about TCP defects I think it is important to mention that probably the biggest one is the fact that TCP is most commonly used with HTTP, for which it is not suited very well.
---
Alexander (Sasha) Sirotkin
Metalink Broadband
unquote
I only got one thang to say about the above statement, "non sequitur".
non sequitur is a short form
non sequitur is a short form of - this Sasha Sirotkin reminds me a guy who talks about things he's heard but has no knowledge about ...
I would advise you to read a
I would advise you to read a bit more on this subject than just the above article before engaging in technical conversation about TCP/IP.