Developing P2P Protocols across NAT
Network address translators (NATs) are something every software engineer has heard of, not to mention networking professionals. NAT has become as ubiquitous as the Cisco router in networking terms.
Fundamentally, a NAT device allows multiple machines to communicate with the Internet using a single globally unique IP address, which eases the shortage of IPv4 address space. Although NAT was envisaged in 1994 as a stopgap rather than a long-term solution, it is, for better or worse, here to stay, even when IPv6 addresses become common. This is partly because IPv6 has to coexist with IPv4, and one of the ways to achieve that is by using NAT technology.
This article is not so much a description of how a NAT works. There already is an excellent article on this subject by Geoff Huston (see the on-line Resources). It is quite comprehensive, though plenty of other resources are available on the Internet as well.
This article discusses a possible solution to the NAT problem for P2P protocols.
NAT breaks the Internet more than it makes it. I may sound harsh here, but ask any peer-to-peer application developer, especially the VoIP folks, and they will tell you why.
For instance, you cannot host a Web site behind a NAT device, at least not without some tweaking. Nor can you run FTP, rsync or any other public service through a NAT device. This can be solved by obtaining a globally unique IP address and configuring the NAT device to pass traffic for that particular IP through untranslated.
But the particularly hairy issue with NATed IP addresses is that you can't reach machines behind a NAT from the outside, and you may not even know that a NAT sits in between. By and large, NAT is designed to be transparent, and it remains so. Even if you know there is a NAT device, it lets traffic reach a private host only if a mapping exists between that host's private IP address and TCP or UDP port number and the NAT's public IP address and TCP or UDP port number. And this mapping is created only when traffic originates from the private host toward the Internet, not the other way around.
To make things more complicated, NAT simply drops all unsolicited traffic coming from the Internet to the private hosts. Though this feature arguably adds a certain degree of security through obscurity, it creates more problems than it solves, at least from the perspective of the future of the Internet.
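The mapping behavior described above can be illustrated with a toy model. This is only a sketch of the bookkeeping, not of real packet handling: the class name `ToyNat`, its method names and the starting port 40000 are all invented for illustration, and real devices differ in how they allocate ports.

```python
class ToyNat:
    """Toy model of a NAT mapping table (hypothetical, for illustration only)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port   # assumed sequential port allocation
        self.out = {}                 # (private_ip, private_port) -> public_port
        self.back = {}                # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """A private host sends a packet out: create a mapping if none exists."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def inbound(self, public_port):
        """A packet arrives from the Internet: None means it is dropped."""
        return self.back.get(public_port)


nat = ToyNat("203.0.113.5")
print(nat.inbound(40000))                  # no mapping yet, so the packet is dropped
print(nat.outbound("192.168.1.10", 5000))  # outbound traffic creates the mapping
print(nat.inbound(40000))                  # now inbound traffic reaches the private host
```

The point of the model is the asymmetry: nothing in `inbound` can ever create an entry, so a host on the Internet cannot start a conversation with a private host on its own.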
At least 50% of the most commonly used networking applications use peer-to-peer technology. Common examples include instant messaging protocols, VoIP applications such as Skype, and the BitTorrent file-sharing protocol. In fact, peer-to-peer traffic is only going to increase as time progresses, because the Internet has a lot more to offer beyond the traditional client/server paradigm.
Peer-to-peer technology, by definition, is a mesh network as opposed to a star network in a client/server model. In a peer-to-peer network, all nodes act simultaneously as client and server. This already leads to programming complexity, and peer-to-peer nodes also have to deal somehow with the problematic NAT devices in between.
To make things even more difficult for P2P application developers, there is no standardized NAT behavior. Different NAT devices behave differently. But, the silver lining is that a large portion of the NAT devices in existence today still behave sensibly enough at least to let peer-to-peer UDP traffic pass through.
Sending TCP traffic across a NAT device also has met with success, though you may not be as lucky as with UDP. In this article, we focus purely on UDP, because TCP NAT traversal still remains rather tricky. UDP NAT traversal also is not completely reliable across all NAT devices, but things are very encouraging now and will continue to get better as NAT vendors wake up to the need for supporting P2P protocols.
Incidentally, voice traffic is better handled by UDP, so that suits us fine. Now that we have a fairly good idea of the problem we are trying to solve, let's get down to the solution.
The key to the NAT puzzle lies in the fact that in order for machines behind a NAT gateway to interact with the public Internet, NAT devices necessarily have to allow some inbound traffic, namely replies to requests originating from behind the NAT device. In other words, NAT devices let traffic through to a particular host behind them, provided the traffic is indeed a reply to a request sent by that host. Now, as mentioned above, NAT devices vary widely in operation, and they let through replies coming from other hosts and port numbers, depending on their own notion of what a reply means.
Our job is simple if we understand this much: instead of connecting directly to the host behind NAT, we somehow need to mimic a scenario in which the target host originates a connection to us and then we connect to it as though we are responding to the request. In other words, our connection request to the target host should look, to the NAT device, like a reply.
It turns out that this technique is easy to achieve using a method now widely known as UDP hole punching. Contrary to what the name suggests, this does not leave a gaping security hole or anything of the sort; it is simply a perfectly sensible and effective way to solve the NAT problem for peer-to-peer protocols.
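A minimal sketch of the sending side of this idea follows. It assumes that the peer's public address (`peer_addr`) is already known by some out-of-band means, which this part of the article has not yet covered; the function name `punch`, the probe payload and the retry parameters are hypothetical choices, not part of any standard.

```python
import socket


def punch(sock, peer_addr, attempts=5, timeout=0.5):
    """Send probes to peer_addr and wait for a datagram from it.

    Each outbound probe creates (or refreshes) the mapping in our own NAT,
    so that a packet arriving from peer_addr looks like a reply.
    Returns True once a datagram from the peer arrives, False otherwise.
    """
    sock.settimeout(timeout)
    for _ in range(attempts):
        sock.sendto(b"punch", peer_addr)
        try:
            _data, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue  # nothing yet; probe again
        if addr == peer_addr:
            return True
    return False
```

Both peers would run the same routine against each other's public address at roughly the same time. Run on a single machine with two sockets bound to 127.0.0.1, the function only demonstrates the control flow, because no NAT is involved there.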
That, in a nutshell, is what UDP hole punching does. If that were all there was to it, life would be too simple, and you would not be reading this article. As it turns out, there are plenty of obstacles on the way, but none of them are too complicated.
First is the issue of how to get the private host to originate traffic so we can send our connection request to it masquerading as a reply. To make things worse, NAT devices also have an idle timer, typically of around 60 seconds: once a request has gone out, they stop waiting for a reply if none arrives within that time. So, it is not enough that the private host originate traffic; we also have to act fast. We have to send the “reply” before the NAT device removes the “association” with the private host, or our connection attempt will fail.
Now, a reply obviously has to come from the original machine to which the request was sent. This suits us fine if we are not behind another NAT device. So, if we want to talk to a private host, we make the private host send a packet to us, and we send our connection request as a reply to it. But how do we tell the private host to send a packet to us when we want to talk to it?
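The effect of the idle timer can be sketched as follows. The 60-second value is the typical figure quoted above, not a guarantee for any particular device; the class `TimedMapping` and its methods are hypothetical, and the sketch assumes that only outbound packets restart the timer, which is not true of every NAT.

```python
IDLE_TIMEOUT = 60.0  # seconds; typical value from the text, varies by device


class TimedMapping:
    """Toy model of one NAT association with an idle timer (illustrative only)."""

    def __init__(self, created_at, idle_timeout=IDLE_TIMEOUT):
        self.last_outbound = created_at
        self.idle_timeout = idle_timeout

    def refresh(self, now):
        """The private host sent another packet; restart the idle timer."""
        self.last_outbound = now

    def accepts_reply(self, now):
        """A "reply" gets through only while the association is still alive."""
        return (now - self.last_outbound) < self.idle_timeout


m = TimedMapping(created_at=0.0)
print(m.accepts_reply(30.0))   # True: the reply arrives in time
print(m.accepts_reply(60.0))   # False: the association has expired
```

Under these assumptions, a private host that wants to stay reachable would have to send a packet more often than the timeout, which is what `refresh` stands for.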
If both the peer-to-peer hosts are behind different NAT devices, is it possible at all to communicate with each other? Fortunately, it is possible.
It turns out that NAT devices are somewhat forgiving, and they differ in how lenient they are in deciding what counts as a reply to a request. There are different varieties of NAT behavior:
Full cone NAT
Restricted cone NAT
Port-restricted cone NAT
Symmetric NAT
I won't go into the details and definitions of these here, as there are numerous resources explaining them elsewhere. Symmetric NATs are the most formidable enemy for P2P applications. However, with a degree of cleverness, we can reasonably “guess” the behavior of a symmetric NAT and deal with it. Not all symmetric NATs can be tamed this way, but many of them can be made to allow P2P protocols.
First, how do we tell the private host that we are interested in connecting to it at a particular instant?
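For readers who want the distinction in compact form, the conventional definitions of these NAT types can be sketched as a single filtering rule. This is a simplification under stated assumptions: the function `accepts_inbound` and the type labels are hypothetical names, `contacted` stands for the set of external (IP, port) pairs the private host has already sent to through one mapping, and real devices do not always fit neatly into one category.

```python
def accepts_inbound(nat_type, contacted, source):
    """Decide whether a packet from `source` counts as a "reply".

    contacted: set of (ip, port) pairs the private host has sent to
               through this mapping.
    source:    (ip, port) of the inbound packet.
    """
    src_ip, _src_port = source
    if nat_type == "full_cone":
        # Once a mapping exists, anyone may send to it.
        return bool(contacted)
    if nat_type == "restricted_cone":
        # Only hosts (any port) that were contacted before.
        return any(ip == src_ip for ip, _port in contacted)
    if nat_type in ("port_restricted_cone", "symmetric"):
        # Only the exact IP and port contacted before. A symmetric NAT
        # additionally uses a different public port per destination,
        # which this filter alone does not show.
        return source in contacted
    raise ValueError("unknown NAT type: " + nat_type)


contacted = {("198.51.100.7", 3478)}
print(accepts_inbound("full_cone", contacted, ("192.0.2.9", 9999)))                 # True
print(accepts_inbound("restricted_cone", contacted, ("198.51.100.7", 9999)))        # True
print(accepts_inbound("port_restricted_cone", contacted, ("198.51.100.7", 9999)))   # False
```

The list runs from most to least lenient, which is why the same "reply" that passes a full cone NAT may be dropped by a port-restricted or symmetric one.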