IP Masquerading with Linux
It seems everyone wants on the Internet nowadays, and for good reason. There is plenty of information to obtain, people to send e-mail to, web pages to look at and software to download. Besides that, businesses are finding acceptable means of advertising, and in some cases, selling merchandise. But with all the rush to get on the Internet, people are finding Internet addresses are not as readily available as they once were. Some network administrators are experiencing that in many environments; they don't have enough network addresses to meet the demand.
Instead of going through the motions of obtaining another block or two of class C addresses, some administrators hide a set of unregistered addresses behind a network address translation (NAT) device. The Internet is prepared for these “private” addresses, and blocks of addresses are reserved for this purpose. RFC 1597 specifies the addresses 10.0.0.0 through 10.255.255.255, 172.16.0.0 through 172.31.255.255, and 192.168.0.0 through 192.168.255.255 to be used in these instances.
The RFC strongly recommends that if you, as a network administrator, are going to use a private address, you should select addresses from the ranges given. One notably important reason is that if a packet happens to pass through the NAT with its original IP address intact, the backbone routers on the Internet will not forward the packet. If, instead, you were using someone else's valid IP address, confusion could occur.
Many firewalls, especially those based on application proxy gateways, naturally hide addresses because of how they function. It is no surprise that Linux can also support address hiding through what is called “IP masquerading”. Setting up masquerading under Linux is not terribly difficult, but there are some subtleties to point out.
If you are running kernel version 1.2.x, you need to obtain the kernel patch to support masquerading. The patch is available from ftp://ftp.eves.com/pub/masq, or you can download everything you need from www.indyramp.com/masq/. IP masquerading is supported with 1.3.x kernel versions. For this article, I was running version 1.3.56, and all examples are based on this version. For FTP support (mentioned later), you need to have at least kernel version 1.3.37. There is a patch for 1.2.x (where x >= 4) kernels to support FTP, but I haven't tested it yet. The masqplus-0.4 “jumbo” patch that is available from Indyramp fixes a few bugs and adds support for FTP, RealAudio, and fragmentation for 1.2.13 kernels.
When configuring the kernel to support masquerading, it is important to also say yes to firewall and forwarding support. Here are the parameters I used for configuring my kernel:
Network firewalls (CONFIG_FIREWALL) [Y/n/?] y Network aliasing (CONFIG_NET_ALIAS) [Y/n/?] y TCP/IP networking (CONFIG_INET) [Y/n/?] y IP: forwarding/gatewaying (CONFIG_IP_FORWARD) [Y/n/?] y IP: multicasting (CONFIG_IP_MULTICAST) [Y/n/?] y IP: firewalling (CONFIG_IP_FIREWALL) [Y/n/?] y IP: accounting (CONFIG_IP_ACCT) [Y/n/?] y IP: tunneling (CONFIG_NET_IPIP) [Y/m/n/?] y eP: firewall packet logging (CONFIG_IP_FIREWALL_VERBOSE) [Y/n/?] y IP: masquerading (ALPHA) (CONFIG_IP_MASQUERADE) [Y/n/?] y
I chose other items not directly related to masquerading such as multicast and tunneling, but I like to have fun.
Notice the IP masquerading software is still considered to be Alpha-quality. This means there are probably still some bugs. The base functionality is there, but not all of the nuances of TCP, UDP, and IP, nor the application protocols, have been thoroughly tested. In addition, the interface may still change as development proceeds.
In order to manipulate the masquerading ruleset, you will need the ipfw software version 1.3.6-BETA3, or you can obtain a precompiled binary from ftp.eves.com. Those who use Linux as a filtering firewall and also use ipfwadm should note that software does not yet support IP masquerading, so ipfw is necessary. [New: ipfwadm 2.0beta2, now available for Linux 1.3.66 and newer from ftp://ftp.xos.nl/pub/linux/ipfwadm/, does support masquerading. Also, it is necessary to use recent versions of ipfwadm with the most recent versions of the kernel due to interface changes—ED]
Let's first define what we're trying to accomplish and see how IP masquerading is useful in the environment. Figure 1 shows the networks on which the examples are based. deathstar is the Linux machine employing masquerading in order to hide the network 192.168.1.0.
Masquerading is useful in our architecture because it saves us a little administrative hassle. A number of people in my department have home LANs, and through their PPP connection they can use their other machines to connect to the department lab. We could easily run a routing protocol, like RIP, to make the machines on the lab network aware of the home LANs, but that would take some coordination about who has what network address. It is easier (for us) to use masquerading.
To hide the network, we can issue the command:
# ipfw a m all from 192.168.1.0/24 to 0.0.0.0/0
This rule indicates that we want to add a masquerading rule for all protocols (which in this case means TCP and UDP). The network we are hiding is 192.168.1.0, and we are hiding connections going to any network (0.0.0.0/0). The /24 indicates we are applying a 24-bit netmask, or 255.255.255.0. Since we specified the network as 192.168.1.0, deathstar will masquerade for all hosts on the network. That's all we need to do.
If I had only wanted deathstar to masquerade for enterprise, then I would have typed in:
# ipfw a m all from 192.168.1.2/32 to 0.0.0.0/0
But what does it really mean “to masquerade for”? Well, let's examine the affected files and kernel tables for a typical masqueraded connection. We'll use telnet for our example.
Let's verify the rule has been set. We need to look at the ip_forward file in the /proc/net directory. We can use ipfw to do this:
# ipfw -n list forward Type Proto From To Ports (masqueradeall 192.168.1.0/24 anywhere
This is good. Some administrators mistakenly look in the /proc/net/ip_masquerade file for the rule and when they don't see it, confusion sets in.
For our example, I've started a telnet session from warbird to enterprise. Also, on mccoy, I'm using the tcpdump program to monitor the traffic on 220.127.116.11 and sparcbook to monitor the traffic on 192.168.1.0. We can now look at the ip_masquerade file to examine what is happening (see Listing 1).
Let's decode this stuff. First, the earliest packet is at the bottom. It is a DNS request (therefore UDP) from 192.168.1.2 to 18.104.22.168. mccoy is warbird's DNS server in this case. The Masq column shows us the port on deathstar that is used for the masquerading. For the first DNS request, it is port 60000 (EA60). After the DNS resolution, the TCP connection is established on the next available port over 60000, 60001. Figure 2 illustrates the protocol time-line for the sequence of events up to the TCP open.
Even though the protocol time-line shows how the packets really traverse, the sending and receiving nodes are unaware of this. Hence, the reason they call it masquerading. From warbird's point of view, the traffic will look exactly as expected. That is, packets from enterprise are repackaged by deathstar to look as if they came from enterprise. Listing 2 shows the tcpdump output of the traffic on the 192.168.1.0 network for the telnet session.
Listing 3 shows the protocol traffic on the 22.214.171.124 network during the telnet session. Notice that information originates from deathstar, not warbird. (Another thing you might notice is I don't keep the clocks synchronized very well.)
Another important aspect is maintaining the TCP synchronization numbers. For masquerading to work properly, deathstar must keep the synchronization correct. The TCP sequence number generated by warbird is forwarded by deathstar rather than a new sequence number being generated.
Some final observations about the contents of the /proc/net/ip_masquerade file pertain to the last four fields. The Init-seq, Delta, and PDelta fields deal with the TCP synchronization numbers when ftp data transfers (more in a minute) occur, and the last field is the expiration timer on the masquerade entry. The time is kept in hundredths of seconds; TCP is given 90000 or 15 minutes, and UDP is given 300000 or 5 minutes. As long as traffic is being passed between the two communicating hosts for the masked port, the timer will remain updated. A minor detail about the expiration timer has to do with FTP transfers. FTP uses two connections: a control connection for commands and a data connection for a file transfer. While the data connection is in use for data movement, the control connection will sit idle. If the transfer takes longer than 15 minutes, the masquerading host will close the control connection. The data connection will go to completion, but you will have to reconnect if you want to get more files. This is controlled by the definitions:
#define MASQUERADE_EXPIRE_TCP 15*60*HZ #define MASQUERADE_EXPIRE_TCP_FIN 2*60*HZ #define MASQUERADE_EXPIRE_UDP 5*60*HZ
in the file /usr/include/linux/ip_fw.h. Six hours (360 minutes) seems to be a relatively acceptable timeout value, but change it as you see fit.
Not all protocols work with IP masquerading. ICMP messages (such as those used by ping) will not be passed through the masquerading host. Also, application protocols that pass their address to the receiving host will not work. The talk program is an example of this.
A major exception to the applications that don't work is ftp. The IP masquerading software has been written to handle file transfers as of kernel version 1.3.37. FTP clients, under normal operation, will send the server the address and port number to which the server should connect for a transfer. This shouldn't work with masquerading for the same reasons that talk fails. However, the IP masquerading software will intercept the FTP PORT command and masquerade as the client host awaiting for the server to connect to it.
The biggest problem is the most subtle one: IP fragmentation. Fragmentation occurs automatically within the Internet Protocol. IP always wants to fit a datagram in the frame size of the network link it is transmitting over. Most data links define a Maximum Transmission Unit (MTU) of information that will fit within one frame. If the IP datagram to be sent out can't all fit into the MTU size of the frame, it will be fragmented.
An IP datagram carrying a TCP segment is structured like the “Original Datagram” illustration in Figure 3. After fragmentation, the new datagrams appear (also shown in Figure 3). The most important aspect to notice is the placement of the TCP header. With fragmentation, it only appears in the first fragment and not in succeeding ones. Without the header, the host doing the masquerading has no way of determining whether the fragment should be forwarded. The same applies for fragmented UDP packets.
With TCP, this problem is mostly avoided because of TCP's MSS (Maximum Segment Size) negotiation. That's not to say it won't happen, but it doesn't occur most of the time. UDP, however, is much more susceptible to this type of behavior. Your only solution as an administrator is to be careful about controlling MTU sizes on SLIP or PPP networks.
Other problems also exist for X applications (connections back to the X server); RealAudio (patches available, however); and rlogin (rlogind requires a privileged port).
Actual troubleshooting of masquerading problems is not always as easy as getting the rules straight. One subscriber to the IP masquerading mailing list (see Sidebar) presented an interesting problem. It was solved with simple analysis, code knowledge, and a good hex editor.
Greg Priem sent a message to the IP masquerading mailing list describing a problem in which his telnet sessions would freeze. He isolated a sequence of events that reproduced the problem—he would log into his service provider's main host from a machine behind his Linux box and type in ls -l.
Greg did some initial analysis and posted what he found. The network he was using is illustrated in Figure 4. The telnets were from the Mac to the ISP and other hosts on the Internet. He noticed telnets from the Mac to the Linux Box worked fine, as well as telnets from the Linux Box to the ISP.
Output from tcpdump revealed fragmentation was taking place. I followed up with a message indicating a possible problem and asked Greg to check the MTU sizes on each interface of the Linux Box.
I thought it strange that fragmentation was occurring on a telnet session since telnet uses TCP. As mentioned before, when TCP opens a connection, the MSS negotiation is supposed to eliminate fragmentation.
Further debugging with tcpdump (a handy program) showed the MTU assigned by the ISP was 212. To try to eliminate fragmentation, the SLIP link was also assigned an MTU of 212 by Greg. When looking at the MSS negotiation of the connections, Greg found that from the Linux box to the ISP, the MSS was set to 172, and from the Mac to the Linux box it was the same. However, a connection from the Mac to the ISP showed an MSS of 536.
Given that information, I was able to deduce the problem and respond with an appropriate solution.
The connection scenarios are given in Figure 5.
One thing to note was the MSS advertisement of 536 from the Mac when it had an immediate link with an MTU smaller than that. BSD-experienced people will remember this number from the networking code that chose an MSS value for TCP's negotiation by seeing if the destination was on the local LAN or a remote LAN. The code roughly looked like:
if dest_net == local_net then mss = (link MTU) - 40 else mss = 536 /* determined by 576 - 40 */ fi
If the destination was on a remote network, it would set the MSS automatically to 536. This was a good number because the RFC for IP stated that the default datagram size for internetworking is 576, meaning every device should be able to handle it without further fragmentation. Forty is subtracted to allow for IP and TCP headers.
A second thing to notice was the Linux box forwarding the MSS advertisement. One might think that since a connection is being made from the Linux box as a consequence of masquerading, the MSS value would be based on the network link from the Linux box and not the original value from the sending host.
As an aside, there was the one unexplainable instance of connections made to the ISP host and the ISP sending back an MSS of 1460, as shown at the bottom of Figure 5. It's strange because it was also connected to the PPP link with an MTU of 212. This may be attributed to a lack of knowledge on the ISP's side of the network.
Since both sides were using an MSS value greater than the MTU of either link, there was bound to be fragmentation, even for a TCP connection. Under normal circumstances, this wouldn't matter, but it does confuse masquerading.
The simple solution was to have the ISP support an MTU of at least 576 and for Greg to set the SLIP link with an MTU of 576 or greater. Therefore, no fragmentation would occur.
Greg e-mailed his ISP and waited for an answer. When none arrived he became impatient. Since he didn't have the source to the TCP code on the Mac, the only way to look at it was with a hex editor. He started poking around to see if he could find the BSD-like code where it made the decision for the MSS, and sure enough, he found it. He changed the hard coded values of 536 to 172 (i.e. 212-40), restarted his Mac, and lo and behold, it worked—no more fragmentation! (By the way, the ISP did change the MTU size later.) His approach was a little more daring than what I would have done, but it seems to be the nature of Linux users to patch an existing binary if they can't recompile something.
IP masquerading is an interesting technology, but more importantly, it serves a very useful function for many Internet environments. It works well for common services such as telnet, http, and ftp, but it does not support everything. ICMP messages, talk, remote X applications, and rlogin do not work with masquerading. Fortunately, the software is still in its Alpha versions, and more development is being pursued.
Chris Kostick (email@example.com) is a Senior Computer Scientist at Computer Sciences Corporation's Network Security Department. He enjoys working with Linux but considers himself a latecomer because he started out at kernel version 1.1.18. As far as computers go, he's not sure if he has more fun debugging TCP/IP problems or shooting DOS machines