Network Buffers and Memory Management
The high level protocols must prepend low level headers to each frame before queuing it for transmission. It is clearly undesirable for each protocol to know in advance how to build the low level header for every possible frame type. Thus, the protocol layer calls down to the device with a buffer that has at least dev->hard_header_len bytes free at the start. It is then up to the network device to call skb_push() and place the header on the packet in its dev->hard_header() method. Devices with no link layer header, such as SLIP, may set this method to NULL.
The method is invoked with the buffer concerned, the device pointer, the protocol identity, pointers to the source and destination hardware addresses, and the length of the packet to be sent. Because the routine can be called before the protocol layers have fully assembled the packet, it is vital that the method use the length parameter, not the buffer length.
The source address can be NULL to mean “use the default address of this device”, and the destination can be NULL to mean “unknown”. If, as a result of an unknown destination, the header cannot be completed, the space should still be allocated and any bytes that can be filled in should be filled in. The function must then return the negative of the number of header bytes added. This facility is currently used only by IP, when ARP processing must take place. If the header is completely built, the function must return the number of header bytes added to the beginning of the buffer.
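The return convention is easier to see in code. What follows is a minimal user-space sketch, not kernel code: struct mini_buf, mini_init(), mini_push() and eth_build_header() are hypothetical stand-ins for the network buffer, skb_push() and an Ethernet dev->hard_header() method, and the length argument is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in an Ethernet hardware address */
#define ETH_HLEN 14  /* destination + source + 2-byte type field */

/* Hypothetical stand-in for a network buffer. */
struct mini_buf {
    unsigned char storage[64];
    unsigned char *data;   /* current start of the packet data */
};

/* Leave 'headroom' bytes free at the front, as the protocol layer would. */
static void mini_init(struct mini_buf *b, size_t headroom)
{
    assert(headroom <= sizeof b->storage);
    memset(b->storage, 0, sizeof b->storage);
    b->data = b->storage + headroom;
}

/* Same idea as skb_push(): claim 'len' bytes in front of the data. */
static unsigned char *mini_push(struct mini_buf *b, size_t len)
{
    assert((size_t)(b->data - b->storage) >= len);
    b->data -= len;
    return b->data;
}

/*
 * Build an Ethernet header using the return convention from the text:
 * +bytes when complete, -bytes when the destination is still unknown.
 */
static int eth_build_header(struct mini_buf *b, unsigned short type,
                            const unsigned char *daddr,
                            const unsigned char *saddr,
                            const unsigned char *dev_addr)
{
    unsigned char *h = mini_push(b, ETH_HLEN);

    /* A NULL source means "use the default address of this device". */
    memcpy(h + ETH_ALEN, saddr ? saddr : dev_addr, ETH_ALEN);
    h[12] = (unsigned char)(type >> 8);    /* type field, network byte order */
    h[13] = (unsigned char)(type & 0xff);

    if (daddr == NULL)
        return -ETH_HLEN;   /* space claimed, destination left for later */

    memcpy(h, daddr, ETH_ALEN);
    return ETH_HLEN;
}
```

A caller that gets a negative value back knows the header space is in place but that the destination address must be resolved before the frame can be sent.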
When a header cannot be completed, the protocol layers attempt to resolve the necessary address. The dev->rebuild_header() method is then called with the address at which the header is located, the device in question, the destination IP address and the network buffer pointer. If the device is able to resolve the address by whatever means available (normally ARP), it fills in the physical address and returns 1. If the address cannot be resolved, it returns 0 and the buffer will be retried the next time the protocol layer has reason to believe resolution will be possible.
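As a rough illustration of the 1-or-0 contract, here is a user-space sketch in which a small static table stands in for ARP. The names neigh_entry, neigh_table and sketch_rebuild_header() are hypothetical, the addresses are made up, and the real method takes more arguments than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in an Ethernet hardware address */

/* Hypothetical cache entry: an IP address and its hardware address. */
struct neigh_entry {
    unsigned long ip;
    unsigned char hw[ETH_ALEN];
};

/* Made-up contents standing in for whatever ARP has already learned. */
static const struct neigh_entry neigh_table[] = {
    { 0x0a000001UL, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 } },
    { 0x0a000002UL, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 } },
};

/*
 * Try to finish a partially built Ethernet header.
 * Returns 1 if the destination was filled in, 0 if it is still unresolved.
 */
static int sketch_rebuild_header(unsigned char *header, unsigned long dest_ip)
{
    size_t i;

    assert(header != NULL);
    for (i = 0; i < sizeof neigh_table / sizeof neigh_table[0]; i++) {
        if (neigh_table[i].ip == dest_ip) {
            /* The destination address is the first field of the header. */
            memcpy(header, neigh_table[i].hw, ETH_ALEN);
            return 1;
        }
    }
    return 0;   /* unresolved: the caller retries later */
}
```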
There is no receive method in a network device, because it is the device that invokes processing of such events. With a typical device, an interrupt notifies the handler that a completed packet is ready for reception. The device allocates a buffer of suitable size with dev_alloc_skb(), and places the bytes from the hardware into the buffer. Next, the device driver analyses the frame to decide the packet type. The driver sets skb->dev to the device that received the frame. It sets skb->protocol to the protocol the frame represents, so that the frame can be given to the correct protocol layer. The link layer header pointer is stored in skb->mac.raw, and the link layer header is removed with skb_pull() so that the protocols need not be aware of it. To keep the link and protocol layers isolated, the device driver must also set skb->pkt_type to one of the following:
PACKET_BROADCAST    Link layer broadcast
PACKET_MULTICAST    Link layer multicast
PACKET_HOST         Frame to us
PACKET_OTHERHOST    Frame to another single host
This last type is normally reported as a result of an interface running in promiscuous mode.
Finally, the device driver invokes netif_rx() to pass the buffer up to the protocol layer. The buffer is queued for processing by the networking protocols after the interrupt handler returns. Deferring the processing in this fashion dramatically reduces the time interrupts are disabled and improves overall responsiveness. Once netif_rx() is called, the buffer ceases to be the property of the device driver and cannot be altered or referred to again.
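The frame analysis step can be sketched for Ethernet as follows. This is a user-space illustration only: the enum values are stand-ins for the packet types listed above, and classify_frame() and frame_protocol() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in an Ethernet hardware address */

/* Stand-ins for the four packet types a driver reports. */
enum sketch_pkt_type {
    PKT_TO_US,
    PKT_BROADCAST,
    PKT_MULTICAST,
    PKT_OTHERHOST
};

/* Decide the packet type from the destination address of the frame. */
static enum sketch_pkt_type classify_frame(const unsigned char *frame,
                                           const unsigned char *dev_addr)
{
    static const unsigned char bcast[ETH_ALEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    assert(frame != NULL && dev_addr != NULL);

    if (frame[0] & 0x01) {   /* group bit set: broadcast or multicast */
        if (memcmp(frame, bcast, ETH_ALEN) == 0)
            return PKT_BROADCAST;
        return PKT_MULTICAST;
    }
    if (memcmp(frame, dev_addr, ETH_ALEN) != 0)
        return PKT_OTHERHOST;   /* normally seen only in promiscuous mode */
    return PKT_TO_US;
}

/* Read the 16-bit type field that follows the two addresses. */
static unsigned int frame_protocol(const unsigned char *frame)
{
    return ((unsigned int)frame[12] << 8) | frame[13];
}
```

A real driver would store these two results in skb->pkt_type and skb->protocol before handing the buffer to netif_rx().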
Flow control on received packets is applied at two levels by the protocols. First, a maximum amount of data can be outstanding for netif_rx() to process. Second, each socket on the system has a queue which limits the amount of pending data. Thus, all flow control is applied by the protocol layers. On the transmit side, a per-device variable, dev->tx_queue_len, is used as a queue length limiter. The size of the queue is normally 100 frames, which is large enough that the queue will be kept well filled when sending a lot of data over fast links. On a slow link such as a SLIP link, the queue is normally set to about 10 frames, as even 10 frames represents several seconds of queued data.
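To get a feel for these queue lengths, the drain time of a full queue can be estimated with a little arithmetic. The helper below, queue_drain_ms(), is a hypothetical name, and the frame sizes and link speeds in the note that follows are assumptions chosen for illustration, not figures from this article.

```c
#include <assert.h>

/*
 * Milliseconds needed to drain a full transmit queue, ignoring
 * per-byte framing overhead such as start and stop bits.
 */
static unsigned long long queue_drain_ms(unsigned long long frames,
                                         unsigned long long frame_bytes,
                                         unsigned long long bits_per_sec)
{
    assert(bits_per_sec != 0);
    return frames * frame_bytes * 8ULL * 1000ULL / bits_per_sec;
}
```

Assuming 296-byte frames on a 9600 bit/s serial line, queue_drain_ms(10, 296, 9600) gives 2466, or roughly two and a half seconds for just 10 frames. Assuming 1500-byte frames on 10 Mbit/s Ethernet, queue_drain_ms(100, 1500, 10000000) gives 120, so 100 frames drain in about an eighth of a second.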
One piece of magic that is done for reception with most existing devices, and one that you should implement if possible, is to reserve the necessary bytes at the head of the buffer to land the IP header on a long word boundary. The existing Ethernet drivers thus do:
    skb = dev_alloc_skb(length + 2);
    if (skb == NULL)
        return;
    skb_reserve(skb, 2);
    /* then 14 bytes of ethernet hardware header */
to align IP headers on a 16-byte boundary, which is also the start of a cache line and so improves performance. On the SPARC or DEC Alpha these improvements are very noticeable.
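The figure of 2 bytes follows from the 14-byte Ethernet header: 2 + 14 = 16. For other link layers the same calculation can be written generally. The helper below is a sketch with a hypothetical name, align_reserve(), and it assumes the start of the buffer data is itself aligned on the chosen boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes to reserve at the head of the buffer so that the header
 * following a link layer header of 'link_hdr_len' bytes starts on a
 * multiple of 'boundary'.
 */
static size_t align_reserve(size_t link_hdr_len, size_t boundary)
{
    assert(boundary != 0);
    return (boundary - link_hdr_len % boundary) % boundary;
}
```

For Ethernet, align_reserve(14, 16) is 2, matching the skb_reserve(skb, 2) call used by the existing drivers. A link layer whose header is already a multiple of the boundary needs no reserve at all.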