Point-to-Point Linux

This financial firm decided to build its own redundant WAN routers. Here's a no-nonsense look at the tricky parts and how it all worked.

Tweaking the Router Configuration

The goal of tweaking the Linux router's configuration is to minimize disk reads and writes, writes above all. This is necessary because the memory in a Flash drive can be written to only a fixed number of times, typically in the hundreds of thousands. The way to minimize writes is to treat the router as though it were a laptop. First, enable laptop mode in the kernel, as described in the September 2004 issue of Linux Journal. This causes the system to hold writes in memory until a read is requested and then flush them together, instead of sending each write to the disk as soon as it occurs.
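As a rough illustration of the first step, laptop mode on a 2.6 kernel is switched on by writing a nonzero value to /proc/sys/vm/laptop_mode. The sketch below is not taken from the September 2004 article; the function name, the LAPTOP_MODE_FILE override and the default value of 5 are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: enable laptop mode through the kernel's sysctl file.
# LAPTOP_MODE_FILE is a hypothetical override so the function can be tried
# against a scratch file; on a real system it is /proc/sys/vm/laptop_mode.
LAPTOP_MODE_FILE="${LAPTOP_MODE_FILE:-/proc/sys/vm/laptop_mode}"

# enable_laptop_mode [value]
# Writing 0 disables laptop mode; any nonzero value enables it.
# The default of 5 is an arbitrary choice for this example.
enable_laptop_mode() {
    value="${1:-5}"
    if [ ! -w "$LAPTOP_MODE_FILE" ]; then
        echo "cannot write $LAPTOP_MODE_FILE (not root, or no laptop-mode support?)" >&2
        return 1
    fi
    echo "$value" > "$LAPTOP_MODE_FILE"
}
```

Calling enable_laptop_mode from an early boot script (as root) would be one way to apply the setting on every startup.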

Second, adjust your filesystem mounting options to delay writes as well. For ext3, set the commit interval to 60 seconds with the commit=60 mount option. Also mount the filesystem with the noatime option so that reading a file doesn't trigger a write to update its access time.
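The mount options can be tried on a running system before they are made permanent in /etc/fstab. The following is a minimal sketch; the function name and the MOUNT_CMD hook are made up for this example, and the device name in the fstab comment is hypothetical.

```shell
#!/bin/sh
# Sketch only: remount an ext3 filesystem with write-delaying options.
# MOUNT_CMD is a hypothetical hook: set it to "echo" to preview the command
# instead of running it.
MOUNT_CMD="${MOUNT_CMD:-mount}"

# remount_low_write <mountpoint> [commit_seconds]
remount_low_write() {
    mountpoint="$1"
    commit="${2:-60}"   # 60 seconds, the interval suggested in the text
    $MOUNT_CMD -o "remount,noatime,commit=${commit}" "$mountpoint"
}

# The permanent equivalent is an /etc/fstab entry along these lines
# (the device /dev/hda1 is a placeholder, not the article's setup):
#   /dev/hda1  /  ext3  defaults,noatime,commit=60  1 1
```

Running remount_low_write / as root applies the options to the root filesystem until the next reboot.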

Third, move all log files off the drive and into a RAM-based filesystem, tmpfs. Listing 2 shows how to restructure your filesystem to move all log files out of /var and into a tmpfs called /var/impermanent. Because the contents of a tmpfs disappear when the system goes down, you also need a script, such as the one in Listing 3, that saves all the log files in a tarball on system shutdown and restores them on boot [Listings 2 and 3 also are available on the LJ FTP site]. This script should be called as early as possible in /etc/rc.d/rc.sysinit on startup and as late as possible during shutdown in /etc/init.d/halt.
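The save-and-restore idea can be sketched in a few lines. This is not Listing 3; it is an illustration under stated assumptions: the archive location /var/preserve/impermanent.tar.gz and both function names are invented for the example, and the tmpfs is assumed to be mounted already (for instance with mount -t tmpfs tmpfs /var/impermanent).

```shell
#!/bin/sh
# Sketch only, not the article's Listing 3.
# LOG_DIR is the tmpfs named in the text; ARCHIVE is a hypothetical location
# on the Flash drive for the saved tarball.
LOG_DIR="${LOG_DIR:-/var/impermanent}"
ARCHIVE="${ARCHIVE:-/var/preserve/impermanent.tar.gz}"

# save_logs: pack the tmpfs contents into a tarball (run late in shutdown).
save_logs() {
    tmp="${ARCHIVE}.new"
    # Write under a temporary name first so a failed run cannot
    # clobber the previous good archive.
    tar -czf "$tmp" -C "$LOG_DIR" . && mv -f "$tmp" "$ARCHIVE"
}

# restore_logs: unpack the tarball into the tmpfs (run early in boot).
restore_logs() {
    [ -f "$ARCHIVE" ] || return 0   # first boot: nothing to restore yet
    tar -xzf "$ARCHIVE" -C "$LOG_DIR"
}
```

With this layout the Flash drive sees one write per shutdown rather than one per log message.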

Configuring the WAN Links

WAN links are confusing! For example, the T3 and T1 drivers use different versions of the kernel HDLC stack. This means we have to keep two different versions of the sethdlc program used to set the protocol on the WAN circuit, one built against each HDLC stack.

There are many configuration parameters to set on a T3 or T1 circuit: external or internal timing, CRC size, HDLC mode and so on. Fortunately, SBE's tech support was helpful and supplied many configuration and troubleshooting tips.

We decided to bond the four T1s into a single virtual circuit using teql. This worked, but performance became terrible when one of the T1s was disconnected and stayed terrible even after it was reconnected. My coworker, Bill Rugolsky, tracked the problem down to a lack of link status reporting. The SBE card could report whether the link was up or down, but this message was not propagated up the stack. Thus, teql continued to try to send out packets using down interfaces. Bill resolved this by patching the SBE driver and installing patches others had created to fix teql and linkwatch notification. The driver patches were provided to SBE, and we hope they will be included in the next revision of their driver.
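For readers who have not used teql, the bonding itself amounts to attaching the teql0 queueing discipline to each member interface. The sketch below shows the general pattern with the standard modprobe, tc and ip commands; it is not our production script. The run wrapper, the DRY_RUN switch, the SYSFS_NET override and the function names are all assumptions made for the example, and the carrier check is meaningful only if the driver actually reports link state, which is exactly what the unpatched SBE driver failed to do.

```shell
#!/bin/sh
# Sketch only: bond interfaces with the teql equalizer.
# DRY_RUN=1 prints the commands instead of running them (hypothetical hook).
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# bond_t1s <interface>... : attach each interface to teql0, then bring it up.
bond_t1s() {
    run modprobe sch_teql || return 1
    for dev in "$@"; do
        run tc qdisc add dev "$dev" root teql0 || return 1
        run ip link set dev "$dev" up || return 1
    done
    run ip link set dev teql0 up
}

# link_has_carrier <interface>: succeed if sysfs reports carrier on the link.
# SYSFS_NET is a hypothetical override for trying this outside a router.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"
link_has_carrier() {
    [ "$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null)" = "1" ]
}
```

On our routers the call would look like bond_t1s hdlc0 hdlc1 hdlc2 hdlc3, followed by assigning an address to teql0.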

Our boss, Andy Schorr, did the work to set up OSPF to handle routing over the WAN links. The open-source package Quagga, a successor to Zebra, provides the necessary routing daemons. If one of the links goes down (remember, there are two links, the T3 and the virtual link over the bonded T1s), Quagga detects this and starts routing packets over the other interface. Traditionally, point-to-point links are configured to borrow the address of another interface, typically eth0. However, we decided to use a dedicated subnet for each point-to-point link. Andy had to modify the source code to make Quagga work properly in this setup.
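To give a feel for what the OSPF side looks like, here is a small generator for an ospfd.conf fragment with two point-to-point links and different costs. It is a hedged sketch, not Andy's configuration: the helper names, the interface name used for the T3, the subnets and the costs in the usage note are all placeholders, and as noted above, stock Quagga needed source changes before it worked with our dedicated subnets.

```shell
#!/bin/sh
# Sketch only: emit an ospfd.conf fragment for point-to-point links.

# ospf_interface <interface> <cost>
ospf_interface() {
    cat <<EOF
interface $1
 ip ospf network point-to-point
 ip ospf cost $2
EOF
}

# ospf_router <subnet>... : advertise each subnet in the backbone area.
ospf_router() {
    echo "router ospf"
    for net in "$@"; do
        echo " network $net area 0.0.0.0"
    done
}
```

A lower cost on the T3 makes it the preferred path, with the bonded T1s as the fallback, for example: { ospf_interface hdlc4 10; ospf_interface teql0 100; ospf_router 192.168.100.0/30 192.168.101.0/30; } > ospfd.conf, where hdlc4 and both subnets are made-up values.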

We also had to figure out some iptables rules to make Quagga work correctly with the bonded T1s. The teql device is send-only, so incoming packets never appear on it; they arrive on the underlying T1 interfaces instead. This causes Quagga to drop the packets, because they come in on the wrong interface. The fix is a couple of iptables rules to make packets arriving on all the T1 interfaces (hdlc0 through hdlc3) appear on teql0:

iptables -t mangle -A PREROUTING -i hdlc\+ -j TTL --ttl-inc 1
iptables -t mangle -A PREROUTING -i hdlc\+ -j ROUTE --iif teql0

The bottom line is that setting up WAN links is tricky work and requires much study and tweaking. Don't expect things to work simply because you connect the cables.

Obstacles along the Way

We had to resolve a number of problems while configuring these WAN routers. Some of the earlier ones were with the WAN drivers, as mentioned above. As I was writing this article, we discovered that our T1 performance had deteriorated badly, with highly variable ping times of up to 1 second instead of the usual 10 ms. We traced this to one of the WAN cards not generating interrupts; it had come loose in the PCI slot. The card's packets were still being handled only because the other device sharing the same interrupt line (eth0) was sending interrupts. Each of those, in turn, caused the SBE driver to wake up and process its own pending work, so the packet delay depended on how busy eth0 happened to be. This type of non-obvious failure highlights the importance of link-quality monitoring.
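A very simple form of link-quality monitoring is to ping the far end of each link at regular intervals and complain when the average round-trip time drifts far from normal. The sketch below shows one way to do that with the summary line printed by the standard Linux ping; the function names and the 50 ms default threshold are assumptions for illustration, not values from our setup.

```shell
#!/bin/sh
# Sketch only: flag a link whose average ping time is abnormally high.

# avg_rtt_ms: read ping output on stdin and print the average RTT in ms.
# It looks for the summary line, which has the form
#   rtt min/avg/max/mdev = <min>/<avg>/<max>/<mdev> ms
avg_rtt_ms() {
    sed -n 's|^.*min/avg/max[^=]*= *[0-9.]*/\([0-9.]*\)/.*$|\1|p'
}

# rtt_too_high <rtt_ms> <limit_ms>: succeed if rtt exceeds the limit.
rtt_too_high() {
    awk -v rtt="$1" -v limit="$2" 'BEGIN { exit !(rtt + 0 > limit + 0) }'
}

# check_link <host> [limit_ms]: ping the far end and report trouble.
check_link() {
    host="$1"
    limit="${2:-50}"   # hypothetical threshold; our normal RTT was about 10 ms
    rtt=$(ping -c 5 -q "$host" 2>/dev/null | avg_rtt_ms)
    if [ -z "$rtt" ]; then
        echo "WARNING: no reply from $host"
        return 1
    fi
    if rtt_too_high "$rtt" "$limit"; then
        echo "WARNING: $host average RTT ${rtt} ms exceeds ${limit} ms"
        return 1
    fi
}
```

Run from cron against the address at the far end of each point-to-point link, a check like this would have caught the loose card well before anyone noticed slow transfers.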