Inside the Linux Packet Filter
Let's take a closer look at the netif_rx() function. As mentioned before, this function has the task of receiving a packet from a network driver and queuing it for upper-layer processing. It acts as a single gathering point for all the packets collected by the different network card drivers, providing input to the upper protocols' processing.
Since this function runs in interrupt context (that is, its execution flow follows the interrupt service path) with other interrupts disabled, it has to be quick and short. It cannot perform lengthy checks or other complex tasks, since the system may be losing packets while netif_rx() runs. So the function simply selects the packet queue from an array called softnet_data, indexed by the CPU on which it is currently running. It then checks the status of the queue, identifying one of five possible congestion levels: NET_RX_SUCCESS (no congestion), NET_RX_CN_LOW, NET_RX_CN_MOD, NET_RX_CN_HIGH (low, moderate and high congestion, respectively) or NET_RX_DROP (packet dropped due to critical congestion).
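The queue selection and congestion check can be pictured with a small user-space model. This is only a sketch, not kernel code: the model_ names, the CPU count, the backlog limit and the percentage cutoffs are all assumptions made for illustration, and the way the kernel actually derives the level is not shown here.

```c
/* The five return codes named in the text. The numeric values are
 * whatever this enum assigns, not necessarily the kernel's. */
enum model_rx_status {
    NET_RX_SUCCESS,
    NET_RX_CN_LOW,
    NET_RX_CN_MOD,
    NET_RX_CN_HIGH,
    NET_RX_DROP
};

#define MODEL_NR_CPUS     4    /* hypothetical CPU count */
#define MODEL_MAX_BACKLOG 300  /* hypothetical per-CPU queue limit */

/* Stand-in for one entry of the per-CPU softnet_data array. */
struct model_softnet {
    unsigned int qlen;  /* packets currently waiting in this CPU's queue */
};

struct model_softnet model_softnet_data[MODEL_NR_CPUS];

/* Pick the queue that belongs to the CPU handling the interrupt.
 * The modulo only keeps this model inside the array bounds. */
struct model_softnet *model_select_queue(unsigned int cpu)
{
    return &model_softnet_data[cpu % MODEL_NR_CPUS];
}

/* Map queue occupancy to a congestion level, using assumed cutoffs
 * at 25%, 50%, 75% and 100% of the backlog limit. */
enum model_rx_status model_congestion_level(const struct model_softnet *q)
{
    if (q->qlen >= MODEL_MAX_BACKLOG)
        return NET_RX_DROP;
    if (q->qlen >= MODEL_MAX_BACKLOG * 3 / 4)
        return NET_RX_CN_HIGH;
    if (q->qlen >= MODEL_MAX_BACKLOG / 2)
        return NET_RX_CN_MOD;
    if (q->qlen >= MODEL_MAX_BACKLOG / 4)
        return NET_RX_CN_LOW;
    return NET_RX_SUCCESS;
}
```

Both functions are a handful of comparisons with no loops and no allocation, which is the point of the design: the work done with interrupts disabled stays constant-time.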
Should the critical congestion level be reached, netif_rx() engages a throttling policy that allows the queue to go back to a noncongested status, avoiding service disruption due to kernel overload. Among other benefits, this helps avert possible denial-of-service (DoS) attacks.
Under normal conditions, the packet is finally queued (__skb_queue_tail()), and __cpu_raise_softirq(cpuid, NET_RX_SOFTIRQ) is called. The latter function has the effect of scheduling a softirq for execution.
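As a rough illustration of these two steps, the following user-space sketch appends a packet to the tail of a queue and then sets a bit in a pending mask. The pktq_model_ names, the structure layouts and the softirq number are assumptions for this example; they stand in for the kernel's sk_buff queue and per-CPU softirq state rather than reproducing them.

```c
#include <stddef.h>

/* Hypothetical softirq number for the receive path in this model. */
enum { PKTQ_MODEL_NET_RX_SOFTIRQ = 2 };

/* Minimal stand-ins for a packet buffer and its queue. */
struct pkt_model {
    struct pkt_model *next;
    int id;
};

struct pktq_model {
    struct pkt_model *head;
    struct pkt_model *tail;
    unsigned int qlen;
};

/* Append a packet at the tail, as a tail-queueing helper would. */
void pktq_model_queue_tail(struct pktq_model *q, struct pkt_model *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->qlen++;
}

/* "Raising" a softirq only sets a bit; no handler runs at this point. */
void pktq_model_raise_softirq(unsigned int *pending, unsigned int nr)
{
    *pending |= 1u << nr;
}

/* The tail end of the receive path: queue the packet, then request
 * deferred processing by marking the receive softirq as pending. */
void pktq_model_receive(struct pktq_model *q, unsigned int *pending,
                        struct pkt_model *p)
{
    pktq_model_queue_tail(q, p);
    pktq_model_raise_softirq(pending, PKTQ_MODEL_NET_RX_SOFTIRQ);
}
```

Raising the same softirq twice leaves a single bit set, so a burst of packets results in one deferred pass over the queue rather than one pass per packet.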
The netif_rx() function terminates, returning a value indicating the current congestion level to the caller. At this point, interrupt context processing is done, and the packet is ready to be taken care of by upper-layer protocols. This processing is deferred to a later time, when interrupts have been re-enabled and execution timing is less critical. The deferred execution mechanism changed radically between kernel version 2.2 (where it was based on bottom halves) and version 2.4 (where it is based on softirqs).
A detailed explanation of bottom halves (BHs) and their evolution is beyond the scope of this article, but some points are worth recalling briefly.
First off, their design was based on the principle that the kernel should perform as few computations as possible while in interrupt context. Thus, when long operations were to be done in response to an interrupt, the corresponding driver would mark the appropriate BH for execution, without actually doing anything complex. Then, at a later time, the kernel would check the BH mask to determine whether any BHs were marked for execution and run them before any application-level task.
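The mark-now, run-later pattern can be sketched in a few lines of user-space C. Everything prefixed bh_model_ is a hypothetical name invented for this example, and the table size is arbitrary; the sketch shows only the principle, not the 2.2 kernel's actual data structures.

```c
#include <stdint.h>

#define BH_MODEL_COUNT 32  /* arbitrary table size for this model */

typedef void (*bh_model_fn)(void);

static bh_model_fn bh_model_table[BH_MODEL_COUNT];  /* one handler per BH */
static uint32_t bh_model_pending;                   /* one bit per marked BH */

/* Demo handler: counts how many times it has been run. */
int bh_model_demo_runs;

void bh_model_demo_handler(void)
{
    bh_model_demo_runs++;
}

/* Install the handler for bottom half number nr. */
void bh_model_register(unsigned int nr, bh_model_fn fn)
{
    bh_model_table[nr] = fn;
}

/* Called from the interrupt path: cheap, just sets a bit. */
void bh_model_mark(unsigned int nr)
{
    bh_model_pending |= (uint32_t)1 << nr;
}

/* Called later, outside the time-critical path: run every marked BH
 * once and return how many handlers were executed. */
int bh_model_run_pending(void)
{
    uint32_t active = bh_model_pending;
    int ran = 0;

    bh_model_pending = 0;
    for (unsigned int nr = 0; nr < BH_MODEL_COUNT; nr++) {
        if ((active & ((uint32_t)1 << nr)) && bh_model_table[nr]) {
            bh_model_table[nr]();
            ran++;
        }
    }
    return ran;
}
```

Marking the same BH several times before the deferred pass still results in a single handler run, because the mask records only whether work is pending, not how much.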
BHs worked quite well, with one important drawback: due to their structure, their execution was strictly serialized across CPUs. That is, the same BH could not be executed by more than one CPU at the same time. This obviously prevented any kind of kernel parallelism on SMP machines and seriously affected performance. Softirqs represent the 2.4-era evolution of BHs and, together with tasklets, belong to the family of kernel software interrupts: pieces of code that the kernel can execute when requested, without strict response-time guarantees.
The major difference with respect to BHs is that the same softirq may be run on more than one CPU at a time. Serialization, if required, now must be obtained explicitly by using kernel spinlocks.
The core of softirq processing is the do_softirq() routine, located in kernel/softirq.c. This function checks a bit mask, and if the bit corresponding to a given softirq is set, it calls the appropriate handling routine. In the case of NET_RX_SOFTIRQ, the one we are interested in at this time, the relevant function is net_rx_action(), located in net/core/dev.c. The do_softirq() function may get called from three distinct places inside the kernel: do_IRQ(), in kernel/irq.c, which is the generic interrupt handler; the system call exit point, in kernel/entry.S; and schedule(), in kernel/sched.c, which is the main process scheduling function.
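The bit-mask dispatch described above might be modeled as follows. This is a user-space sketch under stated assumptions: the sirq_model_ names, the handler signature, the softirq number and the backlog structure are all invented for the example, and the handler merely counts packets where net_rx_action() would hand them to the protocol layers.

```c
#include <stdint.h>

enum {
    SIRQ_MODEL_NET_RX = 2,   /* hypothetical number for the receive softirq */
    SIRQ_MODEL_COUNT  = 32   /* arbitrary vector size for this model */
};

/* One slot per softirq: a handler plus the data passed to it. */
struct sirq_model_action {
    void (*action)(void *data);
    void *data;
};

struct sirq_model_action sirq_model_vec[SIRQ_MODEL_COUNT];
uint32_t sirq_model_pending;  /* one bit per raised softirq */

/* Register the handler for softirq number nr. */
void sirq_model_open(unsigned int nr, void (*action)(void *), void *data)
{
    sirq_model_vec[nr].action = action;
    sirq_model_vec[nr].data = data;
}

/* Request deferred execution of softirq number nr. */
void sirq_model_raise(unsigned int nr)
{
    sirq_model_pending |= (uint32_t)1 << nr;
}

/* Check the mask and call the handler for every bit that is set. */
void sirq_model_do_softirq(void)
{
    uint32_t pending = sirq_model_pending;

    sirq_model_pending = 0;
    for (unsigned int nr = 0; pending != 0; nr++, pending >>= 1) {
        if ((pending & 1u) && sirq_model_vec[nr].action)
            sirq_model_vec[nr].action(sirq_model_vec[nr].data);
    }
}

/* Stand-in for the receive handler: drain everything that was queued. */
struct sirq_model_backlog {
    unsigned int queued;
    unsigned int delivered;
};

void sirq_model_net_rx_action(void *data)
{
    struct sirq_model_backlog *b = data;

    while (b->queued > 0) {
        b->queued--;
        b->delivered++;
    }
}
```

A caller would register the handler once with sirq_model_open(), raise SIRQ_MODEL_NET_RX from the receive path, and call sirq_model_do_softirq() from each of the points where the kernel would check for pending work. If no bit is set, the dispatch loop exits immediately, so calling it often is cheap.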
In other words, execution of a softirq may happen either when a hardware interrupt has been processed, when an application-level process invokes a system call or when a new process is scheduled for execution. This way, softirqs are drained frequently enough that none of them will lie waiting for their turn for too long.
The old-style bottom halves were triggered in exactly the same way.