Linux and the Next Generation Internet
The ability to implement advanced routing behaviors using Linux, including those proposed by Diffserv, is provided by the rich set of traffic-control features present in the Linux kernel. Alexey Kuznetsov is the author of these kernel features and the user-space programs used to control them. The architecture of the Linux traffic-control features is described nicely by Almesberger (see Resources 7), and the motivation for and control of these features are summarized in an excellent LJ article by Hadi-Salim (see Resources 8). For clarity, we include a brief review of the Linux traffic-control capabilities used in our implementation and our approach to configuring them. In general, to enable “differentiated services” on Linux, the Linux box must first be able to route IP packets correctly; several rules for traffic control must then be put in place.
In preparation for use as a Diffserv router, the kernel of the Linux router must be configured to allow the use of advanced routing features. To implement Diffserv-type behaviors effectively, several “subsystems” of the kernel must be available. These subsystems include the routing capabilities of the kernel, the packet scheduling functions, and the netlink functionality to configure the traffic-control modules. The traffic-control functions can be compiled into a monolithic kernel or loaded as modules.
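As a quick sanity check before going further, the short Python sketch below scans the text of a kernel .config file and reports which traffic-control options are neither built in nor modular. The option list is an assumption pieced together from the subsystems named above; it is not a copy of Listings 1 and 2, and the helper name missing_options is hypothetical.

```python
# Options assumed to matter for a Diffserv router. This list is illustrative;
# adjust it to match the options your patched kernel actually offers.
REQUIRED_OPTIONS = [
    "CONFIG_IP_ADVANCED_ROUTER",  # advanced routing features
    "CONFIG_RTNETLINK",           # netlink interface used by tc
    "CONFIG_NET_SCHED",           # packet scheduling framework
    "CONFIG_NET_SCH_CBQ",         # class-based queuing
    "CONFIG_NET_SCH_TBF",         # token bucket filter
    "CONFIG_NET_SCH_DSMARK",      # Diffserv field marker (from the patch)
    "CONFIG_NET_CLS_TCINDEX",     # tcindex classifier (from the patch)
    "CONFIG_NET_CLS_FW",          # firewall-mark classifier
]


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' or 'm'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    sample = (
        "CONFIG_NET_SCHED=y\n"
        "CONFIG_NET_SCH_CBQ=m\n"
        "# CONFIG_NET_CLS_FW is not set\n"
    )
    for opt in missing_options(sample):
        print("missing:", opt)
```

In practice you would pass in the contents of the .config file from your kernel source tree rather than the sample string used here.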
A summary of the pertinent features compiled into our Diffserv routers is shown in Listings 1 and 2. All locations given are representative of the option list shown by make menuconfig. You may be checking your kernel configuration menu now, and saying to yourself, “Hmm... I don't see those choices!” That's because you haven't acquired the necessary kernel patch. The web site for “Differentiated Services on Linux” is maintained by Werner Almesberger at the Swiss Federal Institute of Technology (see Resources 9). Here you will find the “Diffserv for Linux” distribution (as of this writing, the current version is ds-6). The distribution comes with a set of patches for both the kernel and the user-space application that configures the kernel's traffic-control features (called “tc”). Also included in the distribution are a set of example scripts and some documentation. It is a good idea to acquire a copy of the package iproute2+tc at this point (see Resources 10). The patch from the Linux Diffserv distribution is sensitive to the version of iproute2+tc, and since our project took place mainly in the summer of 1999, we used version ss990630 of iproute2+tc.
Once your Linux router has been configured properly (depending on your router's job), you are ready to configure your machine for traffic control.
To enable differentiated services on a Linux router, the traffic-control features must be configured. This configuration is achieved through a user-level program, appropriately named tc (traffic control). The command-line syntax for tc is quite long and complex, so scripts are generally used for configuration. An example tc configuration script is shown in Listing 3. In the listing, tc is being used to configure kernel traffic control for a core router in our Diffserv application. This entails attaching a parent queuing discipline to the applicable interface, then creating the queues for the varying classes of traffic. Finally, filters are created to classify packets into the appropriate classes.
As can be seen in Listing 3, the structure of the tc configuration scripts for a Diffserv-enabled Linux router can be broken down into four parts:
Creation of the root queuing discipline. This uses the syntax tc qdisc add followed by several parameters that describe the attributes of the queuing discipline: the network interface it is attached to (dev eth3), an identifier for the qdisc (handle 1:0), where in the qdisc hierarchy to insert it (root) and which queuing discipline to use (dsmark). The remaining parameters are specific to the particular queuing discipline. Diffserv maps naturally into a class-based queuing scheme. Therefore, each Diffserv router (regardless of job) will employ class-based queuing (CONFIG_NET_SCH_CBQ) to house its various per-hop behaviors.
Creation of classes for each type of per-hop behavior. This uses the syntax tc class add followed by several parameters, much like tc qdisc add. Some parameters identify the queuing discipline the class belongs to; the others define the behavior of the class. Our demonstration made extensive use of two per-hop behaviors (PHBs): best effort (BE) and expedited forwarding (EF). The configuration in Listing 3 shows the two sections defining the BE and EF PHBs.
Creation of queuing disciplines for each class. Each class must have a queuing discipline to determine how packets are enqueued and dequeued. This step uses the same tc qdisc add syntax as the creation of the root queuing discipline. The EF PHB class uses a simple FIFO (first-in, first-out) for its queuing discipline, since we wanted the traffic to get in and out of the class as quickly as possible. The BE PHB class uses a token bucket filter in an attempt to throttle the traffic-generation machines during times of extreme congestion.
Creation of filters (classifiers) to assign marked traffic to the appropriate class. This uses the syntax tc filter add followed by several parameters that describe which packets are bound for which classes. Our sample script is from a core router, so packets arriving at this interface have already been marked by edge routers. Classifying packets at this step requires matching the TOS (type of service) bits from the IP header to values suggested by the IETF (Internet Engineering Task Force) Differentiated Services workgroup for various per-hop behaviors (denoted by the value following the “mask” parameter). The filter creation varies, based on which job the router fulfills. Core routers solely use the tcindex packet classifier (CONFIG_NET_CLS_TCINDEX) included with the Diffserv distribution. Edge routers use the firewall packet classifier (CONFIG_NET_CLS_FW) along with ipchains.
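The four parts above can be sketched as a small Python program that prints the tc commands instead of running them, so it can be tried on any machine. This is an illustration under stated assumptions, not a copy of Listing 3: it follows the pattern of the example scripts shipped with the Diffserv distribution (a dsmark qdisc at the root, the tcindex classifier and CBQ beneath it), and every rate, limit and burst size is a placeholder. The function names are hypothetical.

```python
def tcindex_key(tos_byte, mask=0xFC, shift=2):
    """Mimic the tcindex lookup: mask the TOS byte, then shift out the low bits."""
    return (tos_byte & mask) >> shift


def core_router_commands(dev="eth3", link="10Mbit",
                         ef_rate="1500Kbit", be_rate="5Mbit"):
    """Return tc command lines for a core router with EF and BE classes."""
    cbq = f"cbq bandwidth {link}"
    return [
        # Part 1: root qdisc. dsmark copies the TOS byte into tc_index and
        # the tcindex filter reduces it to the 6-bit codepoint.
        f"tc qdisc add dev {dev} handle 1:0 root dsmark indices 64 set_tc_index",
        f"tc filter add dev {dev} parent 1:0 protocol ip prio 1 "
        "tcindex mask 0xfc shift 2",
        # CBQ beneath the root houses the per-hop behaviors.
        f"tc qdisc add dev {dev} parent 1:0 handle 2:0 {cbq} "
        "cell 8 avpkt 1000 mpu 64",
        # Part 2: one class per PHB (2:1 is EF, 2:2 is BE).
        f"tc class add dev {dev} parent 2:0 classid 2:1 {cbq} rate {ef_rate} "
        "avpkt 1000 prio 1 bounded isolated allot 1514 weight 1 maxburst 10",
        f"tc class add dev {dev} parent 2:0 classid 2:2 {cbq} rate {be_rate} "
        "avpkt 1000 prio 7 allot 1514 weight 1 maxburst 21 borrow",
        # Part 3: a short FIFO for EF, a token bucket filter for BE.
        f"tc qdisc add dev {dev} parent 2:1 pfifo limit 5",
        f"tc qdisc add dev {dev} parent 2:2 tbf rate {be_rate} "
        "burst 10kb limit 30kb",
        # Part 4: codepoint 0x2e (EF) goes to 2:1, everything else to 2:2.
        f"tc filter add dev {dev} parent 2:0 protocol ip prio 1 "
        "handle 0x2e tcindex classid 2:1 pass_on",
        f"tc filter add dev {dev} parent 2:0 protocol ip prio 2 "
        "handle 0 tcindex mask 0 classid 2:2 pass_on",
    ]


if __name__ == "__main__":
    for command in core_router_commands():
        print(command)
    # A packet marked with TOS byte 0xb8 yields the EF lookup key.
    print(hex(tcindex_key(0xB8)))
```

Printing the commands first makes it easy to review them, or to redirect them into a shell script, before anything touches a live interface.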
Complete Diffserv functionality assumes two different types of routers: “core” and “edge”. With a Linux-based Diffserv implementation, “edge” routers use ipchains to handle their tasks. Replacing the application ipfwadm from earlier kernels, ipchains is a user-space program that configures the firewalling functionality of Linux kernels 2.1.x and higher. Configuring ipchains has been well documented in this magazine (see Resources 11) and elsewhere, and is beyond the scope of this article. Our Linux Diffserv testbed uses ipchains to assign handles to incoming traffic based on IP address rules. These handles are then used by a filter (classifier) installed with tc (the user-space application) to replace the current IP TOS byte setting with the appropriate Diffserv codepoint (DSCP) marking. This method proved to be very effective. Dynamic configuration was easily attainable, and the speed of ipchains held up under very high demand. Even though ipchains will be superseded by iptables in future versions of the Linux kernel (see Resources 12), the functionality will be very similar, so the approach we've used will still be applicable.
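To make the edge-router marking path concrete, here is a minimal Python sketch that prints one ipchains rule and the matching tc commands. It is not taken from the testbed scripts. The interface name, source address, mark value and class ID are hypothetical, and the command layout assumes the dsmark qdisc and fw classifier are used as in the example scripts of the Diffserv distribution. The arithmetic shows why the EF codepoint 0x2e appears as 0xb8 in the TOS byte: the six DSCP bits occupy the upper part of the byte.

```python
EF_DSCP = 0x2E  # expedited forwarding codepoint (binary 101110)


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP into the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def edge_router_commands(dev, source, mark, dscp, classid="1:1"):
    """Return commands that mark traffic from `source` with `dscp`."""
    tos = dscp_to_tos(dscp)
    return [
        # ipchains tags matching packets with a firewall mark (the handle).
        f"ipchains -A input -s {source} -m {mark}",
        # dsmark at the root rewrites the DS field on the way out.
        f"tc qdisc add dev {dev} handle 1:0 root dsmark indices 64",
        # The fw classifier maps the firewall mark to a dsmark class.
        f"tc filter add dev {dev} parent 1:0 protocol ip prio 4 "
        f"handle {mark} fw classid {classid}",
        # mask 0x3 keeps the two low bits; value sets the six DSCP bits.
        f"tc class change dev {dev} classid {classid} "
        f"dsmark mask 0x3 value {tos:#x}",
    ]


if __name__ == "__main__":
    # Hypothetical host and interface, chosen only for illustration.
    for command in edge_router_commands("eth0", "10.2.0.24/32", 1, EF_DSCP):
        print(command)
```

Because the rule set is generated from a few parameters, adding or changing a marked flow only means calling the function again with a different address, mark and codepoint.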
The specific scripts used to provide Diffserv capability in our testbed environment are available at ftp://cter.eng.uab.edu/Diffserv/.