Overcoming Asymmetric Routing on Multi-Homed Servers
Let's now see the results of this technique play out during a real Web serving test. The test consists of transferring a 90KB file 20,000 times. The HTTP transactions are load-balanced across the server's two IP addresses, with an average of 40 connections being performed in parallel.
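The article does not name the tool used to generate this load, so the following is only a minimal sketch of how a comparable test could be driven from a client. The function names gen_urls and run_test and the path /test.bin are hypothetical; the two addresses are the server's addresses shown in the listings. It assumes curl and an xargs that supports -P.

```shell
#!/bin/sh
# Emit N request URLs that alternate between the two server addresses.
# /test.bin is a hypothetical stand-in for the 90KB test file.
gen_urls() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ $((i % 2)) -eq 0 ]; then
            echo "http://192.168.16.20/test.bin"
        else
            echo "http://192.168.16.21/test.bin"
        fi
        i=$((i + 1))
    done
}

# Fetch every URL, keeping up to 40 transfers in flight at once.
# Response bodies are discarded; only the traffic matters here.
run_test() {
    gen_urls 20000 | xargs -P 40 -n 1 curl -s -o /dev/null
}
```

Calling run_test on a client machine issues the 20,000 requests. Note that this caps concurrency at 40 rather than averaging 40, so it only approximates the load described above.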
The ifconfig command reports on an interface's packet counters. Listing 2 shows the output of the ifconfig command after running the test on a vanilla Web server that does not employ the source-based routing approach.
Listing 2. Interface Counters with Destination-Based Routing
eth0      Link encap:Ethernet  HWaddr 00:E1:AA:7C:51:2C
          inet addr:192.168.16.20  Bcast:192.168.16.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:328008 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1341151 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:23963417 (22.8 Mb)  TX bytes:1908125938 (1819.7 Mb)
          Interrupt:19 Base address:0xe400 Memory:dff80000-dffa0000

eth1      Link encap:Ethernet  HWaddr 00:E1:AA:7C:51:2D
          inet addr:192.168.16.21  Bcast:192.168.16.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:346430 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:25250075 (24.0 Mb)  TX bytes:0 (0.0 b)
          Interrupt:16 Base address:0xec00 Memory:dffa0000-dffc0000
The server's received traffic, which consists of HTTP requests and TCP acknowledgments for the HTTP responses, is well balanced at roughly 328,000 and 346,000 packets for eth0 and eth1, respectively. However, the transmitted traffic has fallen prey to the asymmetric routing problem: interface eth0 has transmitted 1.3 million packets, whereas eth1 has not transmitted any.
Listing 3 contains the output of ifconfig after rebooting the server to clear the interface counters and employing the iproute2 strategy discussed in this article. The test then was run again in the same manner as above.
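As a reminder of the general shape of that strategy, here is a minimal sketch: one routing table per interface, plus a rule that selects the table by source address. The function name policy_cmds is hypothetical, and so is the gateway 192.168.16.1, which does not appear in this section; the interface names and addresses are the ones in the listings. The function only prints the ip commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not run) the iproute2 commands for one interface.
# Arguments: interface, its address, gateway, table number, local subnet.
policy_cmds() {
    iface=$1 addr=$2 gw=$3 table=$4 subnet=$5
    # Keep on-link traffic on this interface instead of using the gateway.
    echo "ip route add $subnet dev $iface table $table"
    # Default route for this table leaves through this interface.
    echo "ip route add default via $gw dev $iface table $table"
    # Packets sourced from this address consult this table.
    echo "ip rule add from $addr/32 table $table"
}
```

For the server in this test, the calls would be policy_cmds eth0 192.168.16.20 192.168.16.1 1 192.168.16.0/24 and policy_cmds eth1 192.168.16.21 192.168.16.1 2 192.168.16.0/24, again with the gateway being an assumed value. Printing instead of executing is a deliberate choice: a mistaken rule on a remote server can cut off the session used to configure it.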
Listing 3. Interface Counters with Policy-Based Routing
eth0      Link encap:Ethernet  HWaddr 00:E1:AA:7C:51:2C
          inet addr:192.168.16.20  Bcast:192.168.16.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:332371 errors:0 dropped:0 overruns:0 frame:0
          TX packets:670341 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:24270910 (23.1 Mb)  TX bytes:954045844 (909.8 Mb)
          Interrupt:19 Base address:0xe400 Memory:dff80000-dffa0000

eth1      Link encap:Ethernet  HWaddr 00:E1:AA:7C:51:2D
          inet addr:192.168.16.21  Bcast:192.168.16.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:334110 errors:0 dropped:0 overruns:0 frame:0
          TX packets:670152 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:24387875 (23.2 Mb)  TX bytes:954032082 (909.8 Mb)
          Interrupt:16 Base address:0xec00 Memory:dffa0000-dffc0000
The server's received traffic remains well balanced, and the transmitted traffic now is split evenly as well, at roughly 670,000 packets for each interface.
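The balance in Listings 2 and 3 can also be checked mechanically. The following is a minimal sketch, not part of the original test setup, of a shell function (the name tx_share is hypothetical) that reads ifconfig output on standard input and prints each interface's transmitted packet count and its share of the total. It assumes the older net-tools layout shown in the listings ("Link encap:" and "TX packets:N"); newer ifconfig versions print a different format and would need different patterns.

```shell
#!/bin/sh
# Read old-style ifconfig output on stdin; print "iface tx_packets share".
tx_share() {
    awk '
        # The first line of each interface block carries the name.
        /Link encap:/ { iface = $1 }
        # Counter line looks like "TX packets:1341151 errors:0 ...".
        $1 == "TX" && $2 ~ /^packets:/ {
            split($2, kv, ":")
            tx[iface] = kv[2]
            total += kv[2]
            order[++n] = iface
        }
        END {
            for (i = 1; i <= n; i++) {
                pct = (total > 0) ? 100 * tx[order[i]] / total : 0
                printf "%s %d %.1f%%\n", order[i], tx[order[i]], pct
            }
        }
    '
}
```

On the server, this would be run as ifconfig | tx_share. Fed the counters from Listing 2, it attributes all transmitted packets to eth0; with the counters from Listing 3, the two interfaces come out at close to half each.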
Source-based routing capabilities are common on high-end networking gear, but they rarely are seen or used in server environments. Linux has excellent but poorly understood source-based routing support. The whole universe of advanced Linux routing and traffic shaping is well described at lartc.org.
Resources
Effects of Network Asymmetry on TCP Performance: www.eecs.berkeley.edu/IPRO/Summary/97abstracts/padmanab.1.html
Linux Advanced Routing and Traffic Control: www.lartc.org
Patrick McManus (mcmanus@ducksong.com) works as a software engineer for Datapower Technology, near his home in Boston, Massachusetts. He currently is obsessed with reading a biography of each American president.
Comments
Local Traffic
I just ran into this today when fixing some routes. If you want those two interfaces to send traffic normally on their local network (192.168.16.0/24), without going through the gateway and forming an asymmetric route with hosts on that network, you'll need to add:
#ip route add 192.168.16.0/24 dev eth0 tab 1
#ip route add 192.168.16.0/24 dev eth1 tab 2
to use link routing on the local subnet.
Ubuntu ip route commands - what file do I put them in?
So, I tried /etc/network/if-up.d/ip and /etc/rc.local, but all routing breaks when the box reboots. Where should I put these? Currently, I let the box boot up, then run the commands manually and everything works great. Any suggestions?
1. vi
1. vi /etc/init.d/iproutes-asym and add the commands you need in there
chmod 755 /etc/init.d/iproutes-asym
2. cd /etc/rc3.d
ln -s ../init.d/iproutes-asym S99z-iproutes-asym
This is what my iproutes-asym file looks like:
ip route add default via 10.53.1.252 dev eth0 tab 1
ip route add default via 10.53.1.252 dev eth1 tab 2
ip rule add from 10.53.1.55/32 tab 1 priority 500
ip rule add from 10.53.1.54/32 tab 2 priority 600
ip route flush cache
Muchas gracias
Thanks for putting this together. Proper routing on a multi-homed server is poorly documented by my Linux distro vendor. Your article was a great help in understanding iproute2 (in this context) and getting things working properly.
solutions
Problems at the network interface level can be solved with bonding too, and it's easier to manage. iproute2 can be used to have multiple load balancers and/or gateways, though.
Need some HELP for linux asymmetric routing
Hello friends! I have two ISP links from the same service provider. For each link I was given an IP address on a /30 subnet: eth0 runs on x.x.24.66 and eth1 on x.x.24.234.
The default route is set to x.x.24.233 dev eth1. When an ICMP ping arrives for x.x.24.234 on eth1, it is answered. When a ping arrives for x.x.24.66 on eth0, nothing happens.
The ICMP echo request comes in through eth0, but no reply goes out via eth1 (the default route). When I listen on eth1 with tcpdump, there are no outgoing ICMP replies.
What's the problem?
Thanks, Mike.
http://www.michaelrack.de
Thank you! Also..
Patrick,
Thank you! I have been struggling with this for weeks. I wish I had found this article first. This is the first time I have found a good explanation of rules and tables and their relationship in the same place.
Regarding SNAT: I listed two source addresses in my iptables firewall, and it mostly works well. However, some outbound connections fail, most notably SSH, Yahoo IM and IRC, which all reset after a short time (though web traffic seems OK). I can SNAT to one of my outbound addresses and use an ip rule to designate a single gateway. This works, but I am no longer NAT load balancing over my two WAN links. Anyone know a solution?
-Nathan
Thank you
Thank you very much for this excellent article
Best wishes
Super
Very nice and educative article. Good reading.
Re: Overcoming Asymmetric Routing on Multi-Homed Servers
Minimalist load balancer. From lartc.org section 4.2.2
# ip route add default nexthop via gw_1 nexthop via gw_2
Mohammad Bahathir Hashim
Malaysia.
rules vs. nat
What about the SNAT target in iptables? It modifies the source IP address of the packet, but applies only in the POSTROUTING chain. Are the rules (the policy) evaluated *after* that again? The name POSTROUTING makes me think the routing part is already over...
Re: rules vs. nat
If it's anything like a Cisco router, outbound NAT happens after policy routing, and doesn't get another chance at the policy engine.
Sean
Re: rules vs. nat
The SNAT target allows you to specify multiple source IP addresses, and they will be used one after the other. That would probably give you simple outbound load balancing.
From the iptables man page:
You can add several --to-source options. If you specify more than one source address, either via an address range or multiple --to-source options, a simple round-robin (one after another in cycle) takes place between these addresses.
L2 vs L3...
Great article on policy routing, but isn't this problem what bonding was designed to solve?
http://linux-ip.net/html/ether-bonding.html
/usr/src/linux-2.4/Documentation/networking/bonding.txt
Sean
Re: L2 vs L3...
From the link you posted:
" Bonding for link aggregation must be supported by both endpoints."
"Bonding for link aggregation
"Bonding for link aggregation must be supported by both endpoints."
Sounds like something our marriage therapist once told my (now ex-) wife and I... ;) Needless to say, it was NOT supported by *both* endpoints!