The Linux Socket Filter: Sniffing Bytes over the Network
The PF_PACKET family allows an application to retrieve data packets as they are received at the network card level, but still does not allow it to read packets that are not addressed to its host. As we have seen before, this is due to the network card discarding all the packets that do not contain its own MAC address—an operation mode called nonpromiscuous, which basically means that each network card is minding its own business and reading only the frames directed to it. There are three exceptions to this rule: a frame whose destination MAC address is the special broadcast address (FF:FF:FF:FF:FF:FF) will be picked up by any card; a frame whose destination MAC address is a multicast address will be picked up by cards that have multicast reception enabled and a card that has been set in promiscuous mode will pick up all the packets it sees.
The last case is, of course, the most interesting one for our purposes. To set a network card to promiscuous mode, all we have to do is issue a particular ioctl() call to an open socket on that card. Since this is a potentially security-threatening operation, the call is only allowed for the root user. Supposing that “sock” contains an already open socket, the following instructions will do the trick:
strncpy(ethreq.ifr_name,"eth0",IFNAMSIZ); ioctl(sock, SIOCGIFFLAGS, ðreq); ethreq.ifr_flags |= IFF_PROMISC; ioctl(sock, SIOCSIFFLAGS, ðreq);
(where ethreq is an ifreq structure, defined in /usr/include/net/if.h). The first ioctl reads the current value of the Ethernet card flags; the flags are then ORed with IFF_PROMISC, which enables promiscuous mode and are written back to the card with the second ioctl.
Let's see it in a more complete example (see Listing 2 at ftp://ftp.linuxjournal.com/pub/lj/listings/issue86/). If you compile and run it as root on a machine connected to a LAN, you will be able to see all the packets flowing on the cable, even if they are not for your host. This is because your network card is now working in promiscuous mode. You can easily check it out by giving the ifconfig command and observing the third line in the output.
Note that if your LAN uses Ethernet switches instead of hubs, you will see only packets flowing in the switch's branch you belong to. This is due to the way switches work, and there is very little you can do about it (except for deceiving the switch with MAC address-spoofing, which is outside the scope of this article). For more information on hubs and switches, please have a look at the articles cited in the Resources section.
All our sniffing problems seem to be solved right now, but there is still one important thing to consider: if you actually tried the example in Listing 2, and if your LAN serves even a modest amount of traffic (a couple of Windows hosts will be enough to waste some bandwidth with a good number of NETBIOS packets), you will have noticed our sniffer prints out too much data. As network traffic increases, the sniffer will start losing packets since the PC will not be able to process them quickly enough.
The solution to this problem is to filter the packets you receive, and print out information only on those you are interested in. One idea would be to insert an “if statement” in the sniffer's source; this would help polish the output of the sniffer, but it would not be very efficient in terms of performance. The kernel would still pull up all the packets flowing on the network, thus wasting processing time, and the sniffer would still examine each packet header to decide whether to print out the related data or not.
The optimal solution to this problem is to put the filter as early as possible in the packet-processing chain (it starts at the network driver level and ends at the application level, see Figure 3). The Linux kernel allows us to put a filter, called an LPF, directly inside the PF_PACKET protocol-processing routines, which are run shortly after the network card reception interrupt has been served. The filter decides which packets shall be relayed to the application and which ones should be discarded.
In order to be as flexible as possible, and not to limit the programmer to a set of predefined conditions, the packet-filtering engine is actually implemented as a state machine running a user-defined program. The program is written in a specific pseudo-machine code language called BPF (for Berkeley packet filter), inspired by an old paper written by Steve McCanne and Van Jacobson (see Resources). BPF actually looks like a real assembly language with a couple of registers and a few instructions to load and store values, perform arithmetic operations and conditionally branch.
The filter code is run on each packet to be examined, and the memory space into which the BPF processor operates are the bytes containing the packet data. The result of the filter is an integer number that specifies how many bytes of the packet (if any) the socket should pass to the application level. This is a further advantage, since often you are interested in just the first few bytes of a packet, and you can spare processing time by avoiding copying the excess ones.