High-Performance Networking Programming in C
Listing 4. mmap.c
/******************************************
 * mmap(2) file write                     *
 ******************************************/
char *mm = NULL;    /* byte pointer, so mm + off advances in bytes */

fd = open(filename, O_RDWR | O_TRUNC | O_CREAT, 0644);
if (-1 == fd)
    errx(1, "File write");
    /* NOT REACHED */

/* If you don't do this, mmapping will never
 * work for writing to files.
 * If you don't know file size in advance as is
 * often the case with data streaming from the
 * network, you can use a large value here. Once you
 * write out the whole file, you can shrink it
 * to the correct size by calling ftruncate
 * again */
ret = ftruncate(fd, filelen);
if (-1 == ret)
    errx(1, "ftruncate() problem");
    /* NOT REACHED */

mm = mmap(NULL, filelen, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (MAP_FAILED == mm)    /* mmap(2) returns MAP_FAILED, not NULL, on error */
    errx(1, "mmap() problem");
    /* NOT REACHED */

memcpy(mm + off, buf, len);
off += len;

/* Please don't forget to free mmap(2)ed memory! */
munmap(mm, filelen);
close(fd);

/******************************************
 * mmap(2) file read                      *
 ******************************************/
fd = open(filename, O_RDONLY, 0);
if (-1 == fd)
    errx(1, "File read err");
    /* NOT REACHED */

if (-1 == fstat(fd, &statbf))
    errx(1, "fstat() error");
    /* NOT REACHED */
filelen = statbf.st_size;

mm = mmap(NULL, filelen, PROT_READ, MAP_SHARED, fd, 0);
if (MAP_FAILED == mm)
    errx(1, "mmap() error");
    /* NOT REACHED */

/* Now onward you can straightaway
 * do a memory copy of the mm pointer as it
 * will dish out file data to you */
bufptr = mm + off;

/* You can straightaway copy mmapped memory into the
 * network buffer for sending */
memcpy(pkt.buf + filenameoff, bufptr, bytes);

/* Please don't forget to free mmap(2)ed memory! */
munmap(mm, filelen);
close(fd);
TCP sockets under Linux come with a rich set of options with which you can manipulate the functioning of the OS TCP/IP stack. A few options are important for performance, such as the TCP send and receive buffer sizes:
sndsize = 16384;
setsockopt(socket, SOL_SOCKET, SO_SNDBUF, (char *)&sndsize, (int)sizeof(sndsize));
rcvsize = 16384;
setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char *)&rcvsize, (int)sizeof(rcvsize));
I am using conservative values here. Obviously, they should be much higher for Gigabit networks. The right values are determined by the bandwidth-delay product, that is, the link bandwidth multiplied by the round-trip time. Interestingly, I have never found this to be an issue, so I doubt that it would give you a performance boost. It still is worth mentioning, because the TCP window size alone can give you optimal throughput.
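To make the bandwidth-delay product concrete, here is a minimal sketch that computes it and requests matching socket buffers. The helper names bdp_bytes and set_tcp_buffers are hypothetical and not part of the original listing, and the bandwidth and round-trip figures you pass in are your own measurements or estimates.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Bandwidth-delay product in bytes: (bits per second / 8) * RTT.
 * The RTT is taken in milliseconds to stay in integer arithmetic. */
static uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return (bits_per_sec / 8) * rtt_ms / 1000;
}

/* Hypothetical helper: request send and receive buffers of `bytes` each.
 * Returns 0 on success, -1 if either setsockopt(2) call fails. */
static int set_tcp_buffers(int sock, int bytes)
{
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    return 0;
}
```

For a 1Gbit/s path with a 50ms round-trip time, bdp_bytes(1000000000, 50) comes to 6,250,000 bytes, far above the 16384 used earlier. On Linux, the kernel may clamp the requested size to its configured limits, so it is worth reading the value back with getsockopt(2) before relying on it.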
Other options can be set using the /proc pseudo-filesystem under Linux (including the above two), and unless your Linux distribution turns off certain options, you won't have to tweak them.
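If you want to see what your distribution has configured before changing anything, a small reader for /proc/sys entries is enough. This is only a sketch: read_proc_long is a hypothetical helper, and it reads just the first integer in the file, which suits single-value entries such as /proc/sys/net/core/rmem_max and /proc/sys/net/core/wmem_max.

```c
#include <stdio.h>

/* Read the first integer from a /proc/sys style file.
 * Returns 0 on success and stores the value in *out; -1 on any failure. */
static int read_proc_long(const char *path, long *out)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%ld", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

A call such as read_proc_long("/proc/sys/net/core/rmem_max", &limit) tells you the largest receive buffer an unprivileged SO_RCVBUF request can obtain on that machine.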
It is also a good idea to enable PMTU (Path Maximum Transmission Unit) discovery to avoid IP fragmentation. IP fragmentation affects more than performance, but performance is where it hurts most. To avoid fragmentation at any cost, several HTTP servers use conservative packet sizes. Doing so is not a very good thing, as there is a corresponding increase in protocol overhead. More packets mean more headers and wasted bandwidth.
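On Linux, path MTU discovery can be requested per socket with the IP_MTU_DISCOVER option described in ip(7). The sketch below is Linux-specific, and enable_pmtu_discovery is a hypothetical helper name. Many systems already enable discovery for TCP by default, so treat this as a way to make the setting explicit rather than a guaranteed gain.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Linux-specific: always set the Don't Fragment bit on this socket's
 * packets, so the kernel performs path MTU discovery instead of
 * letting routers fragment.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int enable_pmtu_discovery(int sock)
{
    int val = IP_PMTUDISC_DO;

    return setsockopt(sock, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}
```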
Instead of using write(2) or send(2) for transfer, you could use the sendfile(2) system call. This provides substantial savings by avoiding redundant copies, as the data is passed between the file descriptor and the socket descriptor directly, inside the kernel. Be aware that this approach is not portable across UNIX systems.
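A minimal sketch of the Linux flavor of sendfile(2) follows. The helper name send_file_path is hypothetical; out_fd would normally be a connected TCP socket. The sketch assumes a blocking descriptor and does not retry on EINTR or EAGAIN, which production code would need to handle.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux-specific sketch: send the whole file at `path` to out_fd
 * (typically a connected TCP socket) with sendfile(2).
 * Returns 0 on success, -1 on any error. */
static int send_file_path(int out_fd, const char *path)
{
    struct stat st;
    off_t offset = 0;
    int in_fd = open(path, O_RDONLY);

    if (in_fd == -1)
        return -1;
    if (fstat(in_fd, &st) == -1) {
        close(in_fd);
        return -1;
    }

    /* sendfile(2) may transfer fewer bytes than requested, so loop
     * until the offset it maintains reaches the file size. */
    while (offset < st.st_size) {
        ssize_t sent = sendfile(out_fd, in_fd, &offset,
                                (size_t)(st.st_size - offset));
        if (sent <= 0) {    /* error, or unexpected end of file */
            close(in_fd);
            return -1;
        }
    }

    close(in_fd);
    return 0;
}
```

Note that the file contents never pass through a user-space buffer here, which is where the savings over a read(2) and write(2) loop come from.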
Applications should be well designed to take full advantage of network resources. First and foremost, using multiple short-lived TCP connections between the same two endpoints for sequential processing is wrong. It will work, but it will hurt performance and cause several other headaches as well. Most notably, the TCP TIME_WAIT state has a timeout of twice the maximum segment lifetime. Because the round-trip time varies widely in busy networks and networks with high latency, oftentimes this value will be inaccurate. There are other problems too, but if you design your application well, with proper protocol headers and PDU boundaries, there never should be a need to use different TCP connections.
Take the case of SSH, for instance. How many different TCP streams are multiplexed with just one connection? Take a cue from it.
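One way to carry several logical streams over a single TCP connection is to prefix every PDU with a small header that names the stream and gives the payload length. The sketch below is an illustration only: the 8-byte layout, the FRAME_HDR_LEN constant and the function names are assumptions made for this example, and this is not the SSH wire format.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HDR_LEN 8    /* 4-byte channel id + 4-byte payload length */

/* Write a frame header into buf (at least FRAME_HDR_LEN bytes),
 * with both fields in network byte order. */
static void frame_hdr_pack(unsigned char *buf, uint32_t channel, uint32_t len)
{
    uint32_t c = htonl(channel);
    uint32_t l = htonl(len);

    memcpy(buf, &c, 4);
    memcpy(buf + 4, &l, 4);
}

/* Read a frame header back out of buf into host byte order. */
static void frame_hdr_unpack(const unsigned char *buf,
                             uint32_t *channel, uint32_t *len)
{
    uint32_t c, l;

    memcpy(&c, buf, 4);
    memcpy(&l, buf + 4, 4);
    *channel = ntohl(c);
    *len = ntohl(l);
}
```

The receiver reads FRAME_HDR_LEN bytes, unpacks them, then reads exactly that many payload bytes and hands them to whichever stream the channel id names. Because the PDU boundary is explicit, a new logical stream needs only a new channel id and not a new connection.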
You don't have to work in lockstep between the client and the server. Simply because the protocols and algorithms are visualized in a fixed sequence does not imply that the implementation should follow suit.
You can make excellent use of available bandwidth by doing things in parallel—by not waiting for processing to complete before reading the next packet off the network. Figure 2 illustrates what I mean.
Pipelining is a powerful technique employed in CPUs to speed up the FETCH-DECODE-EXECUTE cycle. Here, we use the same technique for network processing.
Obviously, your wire protocol should have the least overhead and should work without relying much on future input. By keeping the state machine fairly self-contained and isolated, you can process efficiently.
Avoiding redundant protocol headers or fields that are mostly empty or unused can save you precious bandwidth for carrying real data payloads. Header fields should be aligned at 32-bit boundaries and so should the C structures that represent them.
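As a sketch of what 32-bit alignment looks like in practice, the header below orders its fields so that the compiler inserts no padding. The struct and its field names are hypothetical, invented for this example, and the compile-time checks use C11 _Static_assert.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical protocol header. The three small fields together fill
 * the first 32-bit word, so each following 32-bit field starts on a
 * 4-byte boundary and the compiler adds no padding. */
struct pdu_header {
    uint8_t  version;
    uint8_t  type;
    uint16_t flags;
    uint32_t sequence;
    uint32_t payload_len;
};

/* Compile-time checks: the build fails if the layout ever drifts. */
_Static_assert(offsetof(struct pdu_header, sequence) % 4 == 0,
               "sequence must be 32-bit aligned");
_Static_assert(offsetof(struct pdu_header, payload_len) % 4 == 0,
               "payload_len must be 32-bit aligned");
_Static_assert(sizeof(struct pdu_header) == 12,
               "header must occupy exactly three 32-bit words");
```

Multi-byte fields still need htons(3) and htonl(3) conversion before they go on the wire. The alignment only ensures that the in-memory layout and the wire layout have the same shape.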
If your application already is in production and you want to enhance its performance, try some of the above techniques. It shouldn't be too much trouble to attack the problem of re-engineering an application if you take it one step at a time. And remember, never trust any theory—not even this article. Test everything for yourself. If your testing does not report improved performance, don't do it. Also, make sure your test cases take care of LAN, WAN and, if necessary, satellite and wireless environments.