Dissecting Interrupts and Browsing DMA

This is the fourth in a series of articles about writing character device drivers as loadable kernel modules. This month, we further investigate the field of interrupt handling. Though it is conceptually simple, practical limitations and constraints make this an “interesting” part of device friver writing, and several different facilities have been provided for different situations. We also invest

Though last month's installment appeared to cover everything about interrupts, that is not true. One month later, you are ready to understand the ultimate technology for interrupt handling. We also begin to investigate the fascinating world of memory management by explaining the tasks of a DMA-capable driver.

Changes in Current Linux Versions

Before we start, you should note two changes in recent Linux versions. In Linux 1.3.70, the first steps were taken to support shared interrupts. The idea is that several devices and device drivers share the same interrupt line. This is also part of the PCI specification, where every device has its own vendor- and product-dependent device ID. When reading this ID, Linux gets quite verbose about plugged-in PCI devices when booting with PCI enabled. For this reason, the declaration of request_irq and free_irq in linux/sched.h now reads:

extern int request_irq(unsigned int irq,
   void (*handler)(int, void *, struct pt_regs *),
   unsigned long flags,
   const char *device,
   void *dev_id);
extern void free_irq(unsigned int irq, void *dev_id);

Thus, when registering an interrupt with request_irq(), a fourth parameter must be handed to Linux: the device ID. Currently, most devices will pass a NULL for dev_id when requesting and freeing interrupts—this results in the same behaviour as before. If you really want to share interrupts, you also have to set SA_SHIRQ in flags, in addition to passing dev_id. Sharing of interrupts only works if all device drivers on one interrupt line agree to share their interrupt.

The second “change” is not a real change, but rather, a stylistic upgrade: get_user_byte() and put_user_byte() are considered obsolete and should not be used in new code. They are replaced by the more flexible get_user and put_user calls.

Linus Torvalds explains that these functions use compiler “magic” to look at the pointer they are passed and read or write the correct size item. This means you can't use void * or unsigned long as a pointer; you always have to use a pointer to the right type. Also, if you give it a char *, you get a char back, not an unsigned char, unlike the old get_fs_byte() function. If you need an unsigned value, use a pointer to an unsigned type. Never cast the return value to get the access size you want—if you think you need to do that, you are doing something wrong. Your argument pointer should always be of the right type.

Essentially, you should think of get_user() as a simple pointer dereference (kind of like *(xxx) in plain C, except it fetches the pointer data from user space instead). In fact, on some architectures, that is exactly what it is.

While we're fixing previous oversights, it is worth noting that the kernel offers a facility to autodetect interrupt lines. That's slightly different from what was shown in our article a couple of months ago. Those who are interested can look at <linux/interrupt.h> for documentation about it.

We now return you to your regularly scheduled programming.

A Split View of Interrupts

As you may remember from last time, interrupt handling is managed by a single driver function. Our implementation dealt with both low-level (acknowledging the interrupt) and high-level (such as awakening tasks) issues. This way of doing things may work for simple drivers, but it is prone to failure if the handling is too slow.

If you look at the code, it is clear that acknowledging the interrupt and retrieving or sending the relevant data is a minimal part of the workings. With common devices where you are moving only one or a few bytes per interrupt, most of the time is spent in managing device-specific queues, chains, and whatever other strange data structures are used in your implementation. Don't take our skel_interrupt() as an example here, since it is the most simplified interrupt handler possible; real devices may have a lot of status information and many modes of operation. If you spend too much time processing your data structures, it is possible to miss the next interrupt or interrupts and either hang or lose data, because when an interrupt handler is running, at least that interrupt is blocked, and with fast interrupt handlers, all interrupts are blocked.

The solution devised for this problem is to split the task of interrupt handling into two halves:

  • First, a fast “top half” handles hardware issues and must terminate before a new interrupt is issued. Normally, very little is done here besides moving data between the device and some memory buffer (and not even that, if your device driver uses DMA), and making sure the hardware is in a sane state.

  • Then a slower “bottom half” of the handler runs with interrupts enabled and can take as long as is needed to accomplish everything. It is executed as soon as possible after the interrupt is serviced.

Fortunately, the kernel offers a special way to schedule execution of the bottom half code, which isn't necessarily related to a particular process; this means both the request to run the function and the execution of the function itself are done outside of the context of any process. A special mechanism is needed, because the other kernel functions all operate in the context of a process—an orderly thread of execution, normally associated with a running instance of a user-level program—while interrupt handling is asynchronous and not related to a particular process.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix