High-Performance Networking Programming in C
Listing 4. mmap.c
/****************************************** * mmap(2) file write * * * *****************************************/ caddr_t *mm = NULL; fd = open (filename, O_RDWR | O_TRUNC | O_CREAT, 0644); if(-1 == fd) errx(1, "File write"); /* NOT REACHED */ /* If you don't do this, mmapping will never * work for writing to files * If you don't know file size in advance as is * often the case with data streaming from the * network, you can use a large value here. Once you * write out the whole file, you can shrink it * to the correct size by calling ftruncate * again */ ret = ftruncate(ctx->fd,filelen); mm = mmap(NULL, header->filelen, PROT_READ | PROT_WRITE, MAP_SHARED, ctx->fd, 0); if (NULL == mm) errx(1, "mmap() problem"); memcpy(mm + off, buf, len); off += len; /* Please don't forget to free mmap(2)ed memory! */ munmap(mm, filelen); close(fd); /****************************************** * mmap(2) file read * * * *****************************************/ fd = open(filename, O_RDONLY, 0); if ( -1 == fd) errx(1, " File read err"); /* NOT REACHED */ fstat(fd, &statbf); filelen = statbf.st_size; mm = mmap(NULL, filelen, PROT_READ, MAP_SHARED, fd, 0); if (NULL == mm) errx(1, "mmap() error"); /* NOT REACHED */ /* Now onward you can straightaway * do a memory copy of the mm pointer as it * will dish out file data to you */ bufptr = mm + off; /* You can straightaway copy mmapped memory into the network buffer for sending */ memcpy(pkt.buf + filenameoff, bufptr, bytes); /* Please don't forget to free mmap(2)ed memory! */ munmap(mm, filelen); close(fd);
TCP sockets under Linux come with a rich set of options with which you can manipulate the functioning of the OS TCP/IP stack. A few options are important for performance, such as the TCP send and receive buffer sizes:
sndsize = 16384; setsockopt(socket, SOL_SOCKET, SO_SNDBUF, (char *)&sndsize, (int)sizeof(sndsize)); rcvsize = 16384; setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char *)&rcvsize, (int)sizeof(rcvsize));
I am using conservative values here. Obviously, it should be much higher for Gigabit networks. These values are determined by the bandwidth delay product. Interestingly, I have never found this to be an issue, so I doubt if this would give you a performance boost. It still is worth mentioning, because the TCP window size alone can give you optimal throughput.
Other options can be set using the /proc pseudo-filesystem under Linux (including the above two), and unless your Linux distribution turns off certain options, you won't have to tweak them.
It is also a good idea to enable PMTU (Path Maximum Transmission Unit) discovery to avoid IP fragmentation. IP fragmentation can affect not just performance, but surely it's more important regarding performance than anything else. To avoid fragmentation at any cost, several HTTP servers use conservative packet sizes. Doing so is not a very good thing, as there is a corresponding increase in protocol overhead. More packets mean more headers and wasted bandwidth.
Instead of using write(2) or send(2) for transfer, you could use the sendfile(2) system call. This provides substantial savings in avoiding redundant copies, as bits are passed between the file descriptor and socket descriptor directly. Be aware that this approach is not portable across UNIX.
Applications should be well designed to take full advantage of network resources. First and foremost, using multiple short-lived TCP connections between the same two endpoints for sequential processing is wrong. It will work, but it will hurt performance and cause several other headaches as well. Most notably, the TCP TIME_WAIT state has a timeout of twice the maximum segment lifetime. Because the round-trip time varies widely in busy networks and networks with high latency, oftentimes this value will be inaccurate. There are other problems too, but if you design your application well, with proper protocol headers and PDU boundaries, there never should be a need to use different TCP connections.
Take the case of SSH, for instance. How many different TCP streams are multiplexed with just one connection? Take a cue from it.
You don't have to work in lockstep between the client and the server. Simply because the protocols and algorithms are visualized in a fixed sequence does not imply that the implementation should follow suit.
You can make excellent use of available bandwidth by doing things in parallel—by not waiting for processing to complete before reading the next packet off the network. Figure 2 illustrates what I mean.
Pipelining is a powerful technique employed in CPUs to speed up the FETCH-DECODE-EXECUTE cycle. Here, we use the same technique for network processing.
Obviously, your wire protocol should have the least overhead and should work without relying much on future input. By keeping the state machine fairly self-contained and isolated, you can process efficiently.
Avoiding redundant protocol headers or fields that are mostly empty or unused can save you precious bandwidth for carrying real data payloads. Header fields should be aligned at 32-bit boundaries and so should the C structures that represent them.
If your application already is in production and you want to enhance its performance, try some of the above techniques. It shouldn't be too much trouble to attack the problem of re-engineering an application if you take it one step at a time. And remember, never trust any theory—not even this article. Test everything for yourself. If your testing does not report improved performance, don't do it. Also, make sure your test cases take care of LAN, WAN and, if necessary, satellite and wireless environments.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Back to Backups
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- A New Version of Rust Hits the Streets
- Google's Abacus Project: It's All about Trust
- Secure Desktops with Qubes: Introduction
- Seeing Red and Getting Sleep
- Fancy Tricks for Changing Numeric Base
- Secure Desktops with Qubes: Installation
- Working with Command Arguments
- Linux Mint 18
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide