High-Performance Networking Programming in C
Listing 4. mmap.c
/******************************************
 * mmap(2) file write                     *
 ******************************************/
char *mm = NULL;   /* byte pointer, so mm + off advances in bytes */

fd = open(filename, O_RDWR | O_TRUNC | O_CREAT, 0644);
if (-1 == fd)
    errx(1, "File write");
    /* NOT REACHED */

/* If you don't do this, mmapping will never
 * work for writing to files.
 * If you don't know file size in advance as is
 * often the case with data streaming from the
 * network, you can use a large value here. Once you
 * write out the whole file, you can shrink it
 * to the correct size by calling ftruncate
 * again. */
ret = ftruncate(fd, filelen);
if (-1 == ret)
    errx(1, "ftruncate() problem");

mm = mmap(NULL, filelen, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (MAP_FAILED == mm)   /* mmap(2) signals failure with MAP_FAILED, not NULL */
    errx(1, "mmap() problem");

memcpy(mm + off, buf, len);
off += len;

/* Please don't forget to free mmap(2)ed memory! */
munmap(mm, filelen);
close(fd);

/******************************************
 * mmap(2) file read                      *
 ******************************************/
fd = open(filename, O_RDONLY, 0);
if (-1 == fd)
    errx(1, "File read err");
    /* NOT REACHED */

if (-1 == fstat(fd, &statbf))
    errx(1, "fstat() error");
filelen = statbf.st_size;

mm = mmap(NULL, filelen, PROT_READ, MAP_SHARED, fd, 0);
if (MAP_FAILED == mm)
    errx(1, "mmap() error");
    /* NOT REACHED */

/* Now onward you can straightaway
 * do a memory copy of the mm pointer as it
 * will dish out file data to you */
bufptr = mm + off;

/* You can straightaway copy mmapped memory into the
 * network buffer for sending */
memcpy(pkt.buf + filenameoff, bufptr, bytes);

/* Please don't forget to free mmap(2)ed memory! */
munmap(mm, filelen);
close(fd);
TCP sockets under Linux come with a rich set of options with which you can manipulate the functioning of the OS TCP/IP stack. A few options are important for performance, such as the TCP send and receive buffer sizes:
sndsize = 16384;
setsockopt(socket, SOL_SOCKET, SO_SNDBUF, (char *)&sndsize, (int)sizeof(sndsize));
rcvsize = 16384;
setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char *)&rcvsize, (int)sizeof(rcvsize));
I am using conservative values here. Gigabit networks call for much larger buffers; the right size is determined by the bandwidth-delay product, that is, the link bandwidth multiplied by the round-trip time. Interestingly, I have never found this to be an issue, so I doubt that it would give you a performance boost. It is still worth mentioning, because the TCP window size alone can decide whether you reach optimal throughput.
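As a rough illustration of the bandwidth-delay product, here is a minimal sketch that computes it and requests socket buffers of that size. The function names bdp_bytes and set_socket_buffers are hypothetical helpers, not part of any library, and the kernel may clamp or adjust whatever size you ask for.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Bandwidth-delay product in bytes: roughly how much data must be in
 * flight to keep a link of this speed and round-trip time busy.
 * (Hypothetical helper; integer arithmetic, so it rounds down.) */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_usec)
{
    return bits_per_sec / 8 * rtt_usec / 1000000;
}

/* Request send and receive buffers of `bytes` each.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int set_socket_buffers(int sock, int bytes)
{
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    return 0;
}
```

For example, a 1Gbit/s path with a 10ms round-trip time gives bdp_bytes(1000000000, 10000), which is 1,250,000 bytes, far above the 16KB used in the snippet above.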
Other options can be set using the /proc pseudo-filesystem under Linux (including the above two), and unless your Linux distribution turns off certain options, you won't have to tweak them.
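Before tweaking anything, it helps to look at what the system is currently using. The sketch below reads the first integer from a /proc/sys entry; read_proc_value is a hypothetical helper name, and the paths in the usage note are the standard Linux locations for the socket buffer limits.

```c
#include <stdio.h>

/* Read the first integer found in a /proc/sys entry (or any text file).
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 * (Hypothetical helper for illustration.) */
int read_proc_value(const char *path, long *out)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%ld", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Calling read_proc_value("/proc/sys/net/core/rmem_max", &v) reports the ceiling that the kernel applies to SO_RCVBUF requests, and /proc/sys/net/core/wmem_max does the same for SO_SNDBUF. Entries such as /proc/sys/net/ipv4/tcp_rmem hold three numbers (minimum, default, maximum); this helper returns only the first.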
It is also a good idea to enable PMTU (Path Maximum Transmission Unit) discovery to avoid IP fragmentation. Fragmentation affects more than performance, but performance is where it hurts most. To avoid fragmentation at any cost, several HTTP servers use conservative packet sizes. Doing so is not a very good thing, as there is a corresponding increase in protocol overhead. More packets mean more headers and wasted bandwidth.
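On Linux, PMTU discovery can be requested per socket with the IP_MTU_DISCOVER option. The following is a minimal, Linux-specific sketch for IPv4 sockets; enable_pmtu_discovery is a hypothetical helper name. Many systems already enable discovery by default for TCP, so treat this as a way to make the setting explicit rather than a guaranteed gain.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask the kernel to always set the Don't Fragment bit on this IPv4
 * socket, so that oversized packets trigger path MTU discovery
 * instead of being fragmented along the way.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_pmtu_discovery(int sock)
{
    int val = IP_PMTUDISC_DO;

    return setsockopt(sock, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}
```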
Instead of using write(2) or send(2) for transfer, you could use the sendfile(2) system call. This saves substantially by avoiding redundant copies, because the data passes from the file descriptor to the socket descriptor inside the kernel, without a detour through a user-space buffer. Be aware that this approach is not portable across UNIX systems.
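A minimal sketch of the Linux flavor of sendfile(2) follows. The helper name send_whole_file is hypothetical, and the code assumes out_fd is a blocking descriptor, normally a connected socket; a non-blocking socket would also need EAGAIN handling.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the whole file at `path` to out_fd without copying the data
 * through a user-space buffer. Returns 0 on success, -1 on failure. */
int send_whole_file(int out_fd, const char *path)
{
    struct stat st;
    off_t offset = 0;
    int rc = -1;
    int in_fd = open(path, O_RDONLY);

    if (in_fd == -1)
        return -1;
    if (fstat(in_fd, &st) == -1)
        goto out;

    /* sendfile() may transfer less than requested, so loop until the
     * kernel-maintained offset reaches the end of the file. */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0)
            goto out;           /* real error, or file shrank under us */
    }
    rc = 0;
out:
    close(in_fd);
    return rc;
}
```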
Applications should be well designed to take full advantage of network resources. First and foremost, using multiple short-lived TCP connections between the same two endpoints for sequential processing is wrong. It will work, but it will hurt performance and cause several other headaches as well. Most notably, the TCP TIME_WAIT state has a timeout of twice the maximum segment lifetime, so every connection you close lingers on the side that closed it first, holding a port and kernel state long after you are done with it. Because the round-trip time varies widely in busy networks and networks with high latency, oftentimes this value will be inaccurate. There are other problems too, but if you design your application well, with proper protocol headers and PDU boundaries, there never should be a need to use different TCP connections.
Take the case of SSH, for instance. How many different TCP streams are multiplexed with just one connection? Take a cue from it.
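To make the idea of PDU boundaries on a single connection concrete, here is a minimal sketch of a fixed-size frame header that carries a channel number and a payload length. This is an illustrative layout, not the SSH wire format; struct frame_hdr and both functions are hypothetical names.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HDR_LEN 8

/* Hypothetical header placed in front of every payload on the wire. */
struct frame_hdr {
    uint32_t channel;   /* which logical stream the payload belongs to */
    uint32_t length;    /* number of payload bytes that follow */
};

/* Serialize the header into 8 bytes, in network byte order. */
void frame_hdr_encode(const struct frame_hdr *h,
                      unsigned char out[FRAME_HDR_LEN])
{
    uint32_t channel = htonl(h->channel);
    uint32_t length = htonl(h->length);

    memcpy(out, &channel, 4);
    memcpy(out + 4, &length, 4);
}

/* Parse 8 bytes received from the wire back into a header. */
void frame_hdr_decode(const unsigned char in[FRAME_HDR_LEN],
                      struct frame_hdr *h)
{
    uint32_t channel, length;

    memcpy(&channel, in, 4);
    memcpy(&length, in + 4, 4);
    h->channel = ntohl(channel);
    h->length = ntohl(length);
}
```

The receiver reads FRAME_HDR_LEN bytes, decodes them, then reads exactly length payload bytes and dispatches them by channel. Several independent streams can share one long-lived TCP connection this way.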
You don't have to work in lockstep between the client and the server. Simply because the protocols and algorithms are visualized in a fixed sequence does not imply that the implementation should follow suit.
You can make excellent use of available bandwidth by doing things in parallel—by not waiting for processing to complete before reading the next packet off the network. Figure 2 illustrates what I mean.
Pipelining is a powerful technique employed in CPUs to speed up the FETCH-DECODE-EXECUTE cycle. Here, we use the same technique for network processing.
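The sketch below shows one way to decouple the stages, under simplifying assumptions: "packets" are 32-bit values taken from an array instead of a socket, the processing step is a placeholder, and the two stages are separate processes joined by a pipe. The names process_packet and pipeline_run are hypothetical. The point is only that the first stage hands off each packet and moves on to the next one without waiting for it to be processed.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for real per-packet work (assumption for illustration). */
static uint64_t process_packet(uint32_t pkt)
{
    return (uint64_t)pkt * 2;
}

/* Stage 1 (parent) feeds packets into a pipe; stage 2 (child) processes
 * them concurrently and reports the combined result when input ends.
 * Returns 0 and stores the result in *total, or -1 on error. */
int pipeline_run(const uint32_t *pkts, size_t n, uint64_t *total)
{
    int work[2], done[2];
    pid_t pid;
    size_t i;
    int rc = -1;

    if (pipe(work) == -1)
        return -1;
    if (pipe(done) == -1) {
        close(work[0]);
        close(work[1]);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(work[0]); close(work[1]);
        close(done[0]); close(done[1]);
        return -1;
    }

    if (pid == 0) {                     /* stage 2: process packets */
        uint32_t pkt;
        uint64_t sum = 0;

        close(work[1]);
        close(done[0]);
        while (read(work[0], &pkt, sizeof(pkt)) == (ssize_t)sizeof(pkt))
            sum += process_packet(pkt);
        _exit(write(done[1], &sum, sizeof(sum)) == (ssize_t)sizeof(sum) ? 0 : 1);
    }

    /* stage 1: hand each packet off and continue with the next one */
    close(work[0]);
    close(done[1]);
    for (i = 0; i < n; i++)
        if (write(work[1], &pkts[i], sizeof(pkts[i])) != (ssize_t)sizeof(pkts[i]))
            break;
    close(work[1]);                     /* end of input for stage 2 */

    if (i == n && read(done[0], total, sizeof(*total)) == (ssize_t)sizeof(*total))
        rc = 0;
    close(done[0]);
    waitpid(pid, NULL, 0);
    return rc;
}
```

In a real server the stages would more likely be threads sharing a queue, and the first stage would read from the socket. The pipe here simply stands in for that queue and provides back-pressure when the processing stage falls behind.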
Obviously, your wire protocol should have the least overhead and should work without relying much on future input. By keeping the state machine fairly self-contained and isolated, you can process efficiently.
Avoiding redundant protocol headers or fields that are mostly empty or unused can save you precious bandwidth for carrying real data payloads. Header fields should be aligned at 32-bit boundaries and so should the C structures that represent them.
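As an illustration of 32-bit alignment, here is a hypothetical header (the field names are assumptions, not taken from any real protocol) laid out so that every 32-bit field starts on a 4-byte boundary and the compiler has no reason to insert padding. The C11 static assertions turn any layout surprise into a compile error.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 12-byte PDU header: three 32-bit words, no padding. */
struct pdu_hdr {
    uint8_t  version;   /* word 0, byte 0 */
    uint8_t  type;      /* word 0, byte 1 */
    uint16_t flags;     /* word 0, bytes 2-3 */
    uint32_t seq;       /* word 1 */
    uint32_t length;    /* word 2: payload bytes that follow */
};

/* Fail the build if the in-memory layout differs from the wire layout. */
_Static_assert(sizeof(struct pdu_hdr) == 12, "pdu_hdr must be 12 bytes");
_Static_assert(offsetof(struct pdu_hdr, flags) == 2, "flags at byte 2");
_Static_assert(offsetof(struct pdu_hdr, seq) == 4, "seq at byte 4");
_Static_assert(offsetof(struct pdu_hdr, length) == 8, "length at byte 8");
```

Alignment settles the layout but not the byte order: multi-byte fields still need htons() or htonl() before they go on the wire.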
If your application already is in production and you want to enhance its performance, try some of the above techniques. It shouldn't be too much trouble to attack the problem of re-engineering an application if you take it one step at a time. And remember, never trust any theory—not even this article. Test everything for yourself. If your testing does not report improved performance, don't do it. Also, make sure your test cases take care of LAN, WAN and, if necessary, satellite and wireless environments.