Block Device Drivers: Interrupts

Last month, we gave an introduction to block device drivers. This month, we look at some tricks that are useful when writing them. We start with the most basic “trick”, using hardware interrupts where available, and then describe some neat infrastructure that block device drivers can take advantage of by adding five lines of code and one function.
Automatic Timeouts

blk.h provides a mechanism for timing out when hardware does not respond. If the foo device has not responded to a request within 5 seconds, something is clearly wrong. We will update blk.h again:

#elif (MAJOR_NR == FOO_MAJOR)
#define DEVICE_NAME "foobar"
#define DEVICE_REQUEST do_foo_request
#define DEVICE_INTR do_foo
#define DEVICE_TIMEOUT FOO_TIMER
#define TIMEOUT_VALUE 500
/* 500 == 5 seconds */
#define DEVICE_NR(device) (MINOR(device) >> 6)
#define DEVICE_ON(device)
#define DEVICE_OFF(device)
#endif
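
The “500 == 5 seconds” comment depends on the clock tick rate. As a quick check of the arithmetic, here is a small sketch that assumes the timer ticks 100 times per second (HZ == 100 is our assumption here; blk.h itself does not say so). The helper names are hypothetical and are not part of blk.h:

```c
#include <assert.h>

/* Assumed tick rate: 100 timer ticks per second. */
#define HZ 100

/* Hypothetical helper: convert whole seconds to timer ticks. */
static unsigned long seconds_to_ticks(unsigned long seconds)
{
    return seconds * HZ;
}

/* Hypothetical helper: convert a TIMEOUT_VALUE in ticks to whole seconds. */
static unsigned long ticks_to_seconds(unsigned long ticks)
{
    assert(HZ > 0);
    return ticks / HZ;
}
```

Under that assumption, seconds_to_ticks(5) gives the 500 used for TIMEOUT_VALUE above. A different tick rate would need a different value.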

This is where using SET_INTR() and CLEAR_INTR() becomes helpful. Once DEVICE_TIMEOUT is defined, SET_INTR() automatically sets a “watchdog timer” that goes off if the foo device has not responded after 5 seconds. A SET_TIMER macro is provided to set the watchdog timer manually, and a CLEAR_TIMER macro to turn it off. Only three other things need to be done:

  1. Add a timer, FOO_TIMER, to linux/timer.h. This must be a #define'd value that is not already used and must be less than 32 (there are only 32 static timers).

  2. In the foo_init() function called at boot time to detect and initialize the hardware, a line must be added:

    timer_table[FOO_TIMER].fn = foo_times_out;
    
  3. And (as you may have guessed from step 2) a function foo_times_out() must be written to try restarting requests, or otherwise handle the timeout condition.
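
To make the moving parts concrete, here is a user-space model of the watchdog mechanism. It is only a sketch and not kernel code: the table layout, the timer_active bitmask, the simulated jiffies counter, the slot number 20 and the helper functions set_timer(), clear_timer(), set_intr(), tick() and run_ticks() are all our assumptions about the general shape of the mechanism, written as plain functions rather than as the blk.h macros.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TIMERS     32   /* only 32 static timer slots */
#define FOO_TIMER     20   /* hypothetical unused slot number */
#define TIMEOUT_VALUE 500  /* ticks before the watchdog fires */

struct timer_slot {
    unsigned long expires;   /* tick count at which the slot fires */
    void (*fn)(void);        /* handler called when it fires */
};

static struct timer_slot timer_table[NR_TIMERS];
static unsigned long timer_active;   /* bit n set: slot n is armed */
static unsigned long jiffies;        /* simulated tick counter */
static void (*do_foo)(void);         /* current interrupt handler or NULL */
static int timeouts_seen;            /* how often the watchdog fired */

/* Arm the watchdog by hand (models SET_TIMER). */
static void set_timer(void)
{
    timer_table[FOO_TIMER].expires = jiffies + TIMEOUT_VALUE;
    timer_active |= 1UL << FOO_TIMER;
}

/* Disarm the watchdog by hand (models CLEAR_TIMER). */
static void clear_timer(void)
{
    timer_active &= ~(1UL << FOO_TIMER);
}

/* Models SET_INTR: installing a handler arms the watchdog,
   installing NULL disarms it. */
static void set_intr(void (*handler)(void))
{
    do_foo = handler;
    if (handler != NULL)
        set_timer();
    else
        clear_timer();
}

/* Advance simulated time by one tick and fire any expired slots. */
static void tick(void)
{
    int slot;

    jiffies++;
    for (slot = 0; slot < NR_TIMERS; slot++) {
        unsigned long mask = 1UL << slot;

        if ((timer_active & mask) && timer_table[slot].expires <= jiffies) {
            timer_active &= ~mask;   /* one-shot: disarm before calling */
            assert(timer_table[slot].fn != NULL);
            timer_table[slot].fn();
        }
    }
}

static void foo_times_out(void) { timeouts_seen++; }
static void foo_interrupt(void) { /* would service the device here */ }

/* Step 2 of the list: register the timeout handler at init time. */
static void foo_init(void)
{
    assert(FOO_TIMER < NR_TIMERS);
    timer_table[FOO_TIMER].fn = foo_times_out;
}

/* Run n ticks and report how many timeouts have fired so far. */
static int run_ticks(int n)
{
    while (n-- > 0)
        tick();
    return timeouts_seen;
}
```

In this model, calling foo_init() and then set_intr(foo_interrupt) arms the watchdog. If nothing calls set_intr(NULL) within TIMEOUT_VALUE ticks, foo_times_out() runs once, and the slot stays disarmed until it is armed again.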

The foo_times_out() function should probably reset the device, try to restart the request if appropriate, and use the CURRENT->errors variable to keep track of how many errors have occurred on that request. It should also check whether too many errors have occurred and, if so, call end_request(0) and go on to the next request.

Exactly what steps are required depends on how the hardware device behaves, but both the hd and the floppy drivers provide this functionality, and by comparing and contrasting them, you should be able to determine how to write such a function for your device. Here is a sample, loosely based on the hd_times_out() function in hd.c:

static void foo_times_out(void)
{
   unsigned int dev;
   SET_INTR(NULL);
   if (!CURRENT)
      /* completely spurious interrupt-
         pretend it didn't happen. */
      return;
   dev = DEVICE_NR(CURRENT->dev);
#ifdef DEBUG
   printk("foo%c: timeout\n", dev+'a');
#endif
   if (++CURRENT->errors >= FOO_MAX_ERRORS) {
#ifdef DEBUG
      printk("foo%c: too many errors\n", dev+'a');
#endif
      /* Tell buffer cache: couldn't fulfill request */
      end_request(0);
      INIT_REQUEST;
   }
   /* Now try the request again */
   foo_initialize_io();
}

SET_INTR(NULL) keeps this function from being called recursively. The next two lines ignore interrupts that occur when no requests have been issued. Then we check for excessive errors, and if there have been too many errors on this request, we abort it and go on to the next request, if any; if there are no requests, we return. (Remember that the INIT_REQUEST macro causes a return if there are no requests left.)

At the end, we are either retrying the current request or have given up and gone on to the next request. In either case, we need to restart the request.

We could reset the foo device right before calling foo_initialize_io(), if the device maintains some state and needs a reset. Again, this depends on the details of the device for which you are writing the driver.
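
The retry policy in the sample handler can also be tried out away from the kernel. The following user-space sketch mirrors only the decision structure of the handler. The struct request layout, the limit of 3 for FOO_MAX_ERRORS and the helpers end_request_failed(), start_io(), on_timeout() and deliver_timeouts() are stand-ins we made up for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define FOO_MAX_ERRORS 3   /* hypothetical retry limit */

struct request {
    int errors;             /* timeouts seen on this request */
    int failed;             /* set when the request is abandoned */
    struct request *next;   /* next request in the queue */
};

static struct request *current_req;   /* queue head, plays the role of CURRENT */
static int io_starts;                  /* how often I/O was started or restarted */

/* Stand-in for end_request(0): record the failure, advance the queue. */
static void end_request_failed(void)
{
    assert(current_req != NULL);
    current_req->failed = 1;
    current_req = current_req->next;
}

/* Stand-in for foo_initialize_io(). */
static void start_io(void)
{
    io_starts++;
}

/* Same decision structure as the sample timeout handler. */
static void on_timeout(void)
{
    if (current_req == NULL)
        return;                    /* spurious: nothing is pending */
    if (++current_req->errors >= FOO_MAX_ERRORS) {
        end_request_failed();      /* give up on this request */
        if (current_req == NULL)
            return;                /* queue is empty, as with INIT_REQUEST */
    }
    start_io();                    /* retry, or start the next request */
}

/* Deliver n consecutive timeouts; return how many I/O starts resulted. */
static int deliver_timeouts(int n)
{
    io_starts = 0;
    while (n-- > 0)
        on_timeout();
    return io_starts;
}
```

With two queued requests and a limit of 3, the third timeout abandons the first request and immediately starts the second. Once the last request has been abandoned, further timeouts are ignored.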

Stay Tuned...

Next month, we will discuss optimizing block device drivers.

Michael K. Johnson is the editor of Linux Journal, and is also the author of the Linux Kernel Hackers' Guide (the KHG). He is using this column to develop and expand on the KHG.
