Highly Available Networking
High availability (HA) means different things to different people. This article defines availability as the percentage of time that a computer system is capable of providing its assigned service. A good availability figure for computer systems used for business-critical tasks, such as running a telephone switch or enterprise data communication network, is 99.999% of the time (five nines). This translates to less than six minutes per year (about 5.26 minutes) during which the service is not available.
CompactPCI traditionally has been the platform of choice for these five nines systems, because hot swap of components into and out of a running system is usually also a requirement.
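The arithmetic behind the five nines figure can be checked with a short sketch. The function name and the choice of a 365-day year are assumptions made for illustration only.

```python
# Minutes in a 365-day year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed by an availability figure."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Five nines: prints 5.26
    print(f"{downtime_minutes_per_year(99.999):.2f}")
```

Each additional nine divides the allowed downtime by ten, which is why the step from 99.99% to 99.999% is so demanding in practice.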
In a highly available network there should be multiple independent paths to each system in the network to avoid single points of failure (SPOF). Physical separation is also a good idea because if both paths are in the same conduit, and the conduit gets cut by accident, the network will go down.
Key to availability is the ability to detect failure quickly and transparently switch from one LAN connection to another. Putting the burden of handling redundancy in the networking driver allows for easier HA hardening of networked applications, as it relieves the application of having to be aware of network topology.
The Linux bonding driver can detect link failure and reroute network traffic around a failed link in a manner transparent to the application. With certain network switches, it can also aggregate network traffic across all working links to achieve higher throughput. This is sometimes referred to as trunking.
The bonding driver accomplishes this by enslaving all of the Ethernet ports in the bond to the same Ethernet MAC address, which ensures the proper routing of packets across the links. With a hub arrangement, there should not be more than one link with the same MAC address active at any one time, so the bonding driver can be set up to have only one channel active at a time. This is called active-backup mode, and it will route all traffic through one channel until it detects a failure, at which point it switches to the next backup channel.
With a switch instead of a hub, it is possible to send traffic over all live links at the same time. This is called round-robin mode: each successive packet is sent over the next working link in the bonding rotation, effectively aggregating the bandwidth of all usable links. Round-robin mode provides availability as well as aggregation, but not all switches are capable of supporting aggregation. The bonding documentation (see Resources) contains a list of some switches that do support aggregation.
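The difference between the two modes can be illustrated with a toy model of slave selection. This is only a sketch of the behavior described above, not the bonding driver's actual code; the class and method names are hypothetical, and the choice to stay on the backup link after the original link recovers is a simplifying assumption of the model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ACTIVE_BACKUP = "active-backup"
ROUND_ROBIN = "round-robin"


@dataclass
class BondModel:
    """Toy model of which slave carries the next packet (hypothetical)."""
    slaves: List[str]
    mode: str = ACTIVE_BACKUP
    link_up: Dict[str, bool] = field(default_factory=dict)
    # Index of the current active slave (used in active-backup mode).
    _active: int = field(default=0, init=False, repr=False)
    # Index of the next slave in the rotation (used in round-robin mode).
    _next: int = field(default=0, init=False, repr=False)

    def __post_init__(self) -> None:
        # Assume every link is up unless told otherwise.
        for name in self.slaves:
            self.link_up.setdefault(name, True)

    def set_link(self, name: str, up: bool) -> None:
        """Record a link-status change, as a link monitor would."""
        self.link_up[name] = up

    def pick_slave(self) -> Optional[str]:
        """Return the slave for the next packet, or None if all links are down."""
        count = len(self.slaves)
        start = self._active if self.mode == ACTIVE_BACKUP else self._next
        for offset in range(count):
            idx = (start + offset) % count
            if self.link_up[self.slaves[idx]]:
                if self.mode == ACTIVE_BACKUP:
                    # Stay on this slave until it fails.
                    self._active = idx
                else:
                    # Advance the rotation past the slave just used.
                    self._next = (idx + 1) % count
                return self.slaves[idx]
        return None
```

In active-backup mode, repeated calls return the same slave until set_link marks it down, at which point the next working slave takes over. In round-robin mode, successive calls walk through the working slaves in order and skip any that are down.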
The program that creates the bond is the ifenslave program. It is similar in function to the ifconfig program that configures nonbonded Ethernet interfaces, except that it configures all members of the bond to the same network configuration (IP, MAC, broadcast addresses, etc.). To configure the bonding driver, use ifconfig to configure the bond0 device, and use ifenslave to configure the members of the bond (the slaves).
Many recent distributions, including the Hard Hat Linux HA Framework 2.0 release, come with bonding and ifenslave already included. Bonding is also available as a patch that contains the bonding driver and the ifenslave program, as well as some other modifications necessary to make the whole package work properly. The driver can be compiled into the kernel or run as a module.
Listing 1 shows a typical configuration scenario. The first line installs the bonding driver as a module in active-backup mode with a link-status check period of 100ms. Round-robin mode would use a mode parameter of 0. The first ifconfig sets the IP address for the bonding driver. The next two ifenslave commands enslave eth0 and eth1 to the bond0 device. The bond0 device takes the MAC address of the first slave configured in the bond, and this becomes the MAC address for all devices in the bond.
Listing 1. Typical Configuration Scenario
The networking stack talks to the bond0 device, which sends packets out over whichever slave device is appropriate, given the mode and availability status. In Listing 1, the mode is active-backup, and the active Ethernet device is eth0. Inactive Ethernet slaves have NOARP in the status line.
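As a rough sketch, the command sequence described for Listing 1 can be assembled as follows. This is a reconstruction from the description, not the listing itself: the helper name is hypothetical, the IP address is a placeholder, the use of modprobe to load the module is an assumption, and miimon is assumed to be the module parameter that sets the link-status check period in milliseconds.

```python
from typing import List


def bonding_commands(
    ip_address: str,
    slaves: List[str],
    mode: int = 1,          # 1 = active-backup, 0 = round-robin (per the text)
    miimon_ms: int = 100,   # link-status check period in ms (assumed parameter name)
    bond: str = "bond0",
) -> List[str]:
    """Build the shell commands that would set up a bond (hypothetical helper)."""
    commands = [
        # Load the bonding driver as a module with the chosen mode.
        f"modprobe bonding mode={mode} miimon={miimon_ms}",
        # Give the bond device its IP address and bring it up.
        f"ifconfig {bond} {ip_address} up",
    ]
    # Enslave each Ethernet port; the first one supplies the bond's MAC address.
    commands += [f"ifenslave {bond} {slave}" for slave in slaves]
    return commands


if __name__ == "__main__":
    # 192.168.1.10 is a placeholder address, not taken from the article.
    for command in bonding_commands("192.168.1.10", ["eth0", "eth1"]):
        print(command)
```

The helper only builds strings; running the resulting commands requires root privileges and a kernel with the bonding driver available.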
When a component fails, it is not enough to detect and mask the failure. The failing component must be repaired so that the next failure does not cause loss of service. For an Ethernet cable or hub or switch, it is usually a simple matter of replacing it with a working one. For an Ethernet board in a running computer, it is not always so simple.
The PCI Industrial Computer Manufacturers Group (PICMG) has created a set of standards for CompactPCI hardware and software that make it easier to replace defective hardware in a running system. With PICMG-compliant hardware and the proper drivers and dæmons, replacing a defective board in a running system is a simple matter of removing the defective board and replacing it with a working one.
PICMG standard 2.1 is a hardware standard that covers the mechanical and electrical requirements necessary to remove and/or plug in a board in a running system (hot swap). PICMG standard 2.12 is a software standard that covers the driver requirements to handle hot-swap events. The SourceForge PICMG hot-swap site has the hot-swap driver routines and HA dæmon for handling hot swapping.
Hot swap requires additional coordination with drivers and the PCI subsystem to handle PCI devices that come and go. When an Ethernet card fails and the operator wants to remove it, all he or she has to do is open the handle switch on the CompactPCI board. This sends an ENUM# interrupt to the PICMG 2.12 driver, which calls the routine registered to receive hot-swap events. This routine is responsible for notifying the driver for the card, removing the device from the kernel PCI tree and turning on the blue hot-swap LED on the board, which indicates to the operator that it is safe to remove the card. It also notifies the HA dæmon so that it can take any user-space actions necessary (such as removing an Ethernet device from a bond or removing a driver that is no longer used).
When a replacement card is inserted, it also causes an ENUM# interrupt, which gets routed to the same routine mentioned above. This routine is then responsible for inserting the device in the kernel PCI tree and notifying the HA dæmon that a new device has been inserted.
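The extraction and insertion sequence can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the PICMG 2.12 driver interface: all names are hypothetical, and the model decides between extraction and insertion by checking whether the slot is already in its PCI tree, whereas real hardware reports which event occurred.

```python
from typing import Callable, List, Set


class HotSwapModel:
    """Toy model of the routine that receives hot-swap events (hypothetical)."""

    def __init__(self, notify_daemon: Callable[[str, str], None]) -> None:
        self.pci_tree: Set[str] = set()     # slots currently known to the kernel
        self.blue_led_on: Set[str] = set()  # slots that are safe to pull
        self.driver_log: List[str] = []     # notifications sent to card drivers
        self.notify_daemon = notify_daemon  # callback standing in for the HA daemon

    def on_enum(self, slot: str) -> None:
        """Handle an ENUM# event for the given slot."""
        if slot in self.pci_tree:
            # Handle opened on a known card: an extraction request.
            self.driver_log.append(f"release {slot}")
            self.pci_tree.discard(slot)
            self.blue_led_on.add(slot)
            self.notify_daemon("removed", slot)
        else:
            # Unknown slot: a card has just been inserted.
            self.blue_led_on.discard(slot)
            self.pci_tree.add(slot)
            self.notify_daemon("inserted", slot)


if __name__ == "__main__":
    events: List[tuple] = []
    model = HotSwapModel(lambda kind, slot: events.append((kind, slot)))
    model.on_enum("slot3")  # card inserted
    model.on_enum("slot3")  # handle opened, card released
    print(events)
```

In a setup like the one described here, the daemon callback is where user-space actions such as removing the failed Ethernet device from a bond would be triggered.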