The Devil's in the Details

This article, the third of five on writing character device drivers, introduces concepts of reading, writing, and using ioctl-calls.

Starting from the clean code environment of the two previous articles, we now turn to all the nasty interrupt stuff. Astonishingly, Linux hides most of this from us, so we do not need a single line of assembler...

Reading and writing

Right now, our magic skel-machine driver can load and even unload (painlessly, unlike in DOS), but it has neither read nor written a single character. So we will start fleshing out the skel_read() and skel_write() functions introduced in the previous article (under fops and filp). Both functions take four arguments:

static int skel_read (struct inode *inode,
                      struct file *file,
                      char *buf, int count);
static int skel_write (struct inode *inode,
                       struct file *file,
                       const char *buf,
                       int count);

The inode structure supplies the functions with information that was already used during the skel_open() call. For example, we determined from inode->i_rdev which board the user wants to open, and transferred this data, along with the board's base address and interrupt, to the private_data entry of the file descriptor. We might ignore this information now, but if we had not used this hack, inode would be our only chance to find out which board we are talking to.

The file structure contains data that is more valuable. You can explore all the elements in its definition in <linux/fs.h>. If you use the private_data entry, you find it here, and you should also make use of the f_flags entry—revealing to you, for instance, if the user wants blocking or non-blocking mode. (We explain this topic in more detail later on.)

The buf argument tells us where to put the bytes read (or where to find the bytes written) and count specifies how many bytes there are. But you must remember that every process has its own private address space. In kernel code, there is an address space common to all processes. When system calls execute on behalf of a specific process, they run in kernel address space, but are still able to access the user space. Historically, this was done through assembler code using the fs register; current Linux kernels hide the specific code within functions called get_user_byte() for reading a byte from user address space, put_user_byte() for writing one, and so on. They were formerly known as get_fs_byte, and only memcpy_tofs() and memcpy_fromfs() reveal these old days even on a DEC Alpha. If you want to explore, look in <asm/segment.h>.

Let us imagine ideal hardware that is always hungry to receive data, reads and writes quickly, and is accessed through a simple 8-bit data-port at the base address of our interface. Although this example is unrealistic, if you are impatient you might try the following code:

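static int skel_read (struct inode *inode,
                      struct file *file,
                      char *buf, int count) {
    int n = count;
    /* assumption: the cast, cut off in the original listing, applies
       to file->private_data, where skel_open() stored the board data */
    char *port = PORT0 ((struct Skel_Hw*)
                        (file->private_data));
    while (n--) {
        /* Wait till device is ready */
        put_user_byte (inb_p (port), buf);
        buf++;   /* advance to the next user-space byte */
    }
    return count;
}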

Notice the inb_p() function call, which is the actual I/O read from the hardware. You might decide to use its fast equivalent, inb(), which omits a minimal delay some slow hardware might need, but I prefer the safe way.

The equivalent skel_write() function is nearly the same. Just replace the put_user_byte() line by the following:

        outb_p (get_user_byte (buf), port);

However, these lines have a lot of disadvantages. What if using them causes Linux to loop infinitely while waiting for a device that never becomes ready? Our driver should dedicate the time in the waiting loop to other processes, making use of all the resources in our expensive CPU, and it should have an input and output buffer for bytes arriving while we are not in the skel_read() and corresponding skel_write() calls. It should also contain a time-out test in case of errors, and it should support blocking and non-blocking modes.
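To make the time-out idea concrete, here is a minimal sketch written as ordinary user-space C so it can be compiled and tried outside the kernel. It only bounds the number of polls; a real driver would give up the CPU between polls rather than spin. The names skel_wait_ready(), skel_ready_fn and fake_ready_after() are hypothetical and do not come from the skel driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status callback: returns nonzero once the device is ready. */
typedef int (*skel_ready_fn)(void *ctx);

/*
 * Poll the device at most max_polls times instead of looping forever.
 * Returns 0 when the device became ready, -ETIMEDOUT otherwise.
 */
static int skel_wait_ready(skel_ready_fn ready, void *ctx,
                           unsigned long max_polls)
{
    assert(ready != NULL);
    while (max_polls--) {
        if (ready(ctx))
            return 0;
        /* A real driver would sleep or reschedule here. */
    }
    return -ETIMEDOUT;
}

/*
 * Stand-in for real hardware: reports "not ready" for the number of
 * polls stored in *ctx, and "ready" from then on.
 */
static int fake_ready_after(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

With a counter of 3 and a limit of 10 polls the call succeeds; with a device that stays busy longer than the limit, the caller gets -ETIMEDOUT back and can report the error instead of hanging.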

Blocking and Non-Blocking Modes

Imagine a process that reads 256 bytes at a time. Unfortunately, our input buffer is empty when skel_read() is called. So what should it do—return and say that there is no data yet, or wait until at least some bytes have arrived?

The answer is both. Blocking mode means the user wants the driver to wait until some bytes are read. Non-blocking mode means to return as soon as possible, reading only the bytes that are already available. Similar rules apply to writing: blocking mode means “Don't return till you can accept some data,” while non-blocking mode means “Return even if nothing is accepted.” The read() and write() calls usually return the number of data bytes successfully read or written. If, however, the device is non-blocking and no bytes can be transferred, -EAGAIN is typically returned (meaning: “Play it again, Sam”). Occasionally, old code may return -EWOULDBLOCK, which is the same as -EAGAIN under Linux.
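The decision described above can be sketched in user-space C. This is a simulation under stated assumptions, not the driver's real read routine: struct skel_buf, skel_buf_put() and skel_try_read() are hypothetical names, the input buffer is a simple ring buffer, and the f_flags argument stands in for the f_flags entry of the file structure. Where a real driver would put the process to sleep in blocking mode, the sketch just returns 0 so the caller can retry.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>    /* O_NONBLOCK */
#include <stddef.h>

#define SKEL_BUFSIZE 64   /* hypothetical input buffer size */

/* Hypothetical ring buffer filled while nobody is inside read(). */
struct skel_buf {
    unsigned char data[SKEL_BUFSIZE];
    size_t head;    /* index of the next byte to hand out */
    size_t count;   /* number of bytes currently stored */
};

/* Store one byte; returns 0 on success, -1 if the buffer is full. */
static int skel_buf_put(struct skel_buf *b, unsigned char c)
{
    assert(b != NULL);
    if (b->count == SKEL_BUFSIZE)
        return -1;
    b->data[(b->head + b->count) % SKEL_BUFSIZE] = c;
    b->count++;
    return 0;
}

/*
 * Hand out up to count buffered bytes.
 * Empty buffer, non-blocking mode: return -EAGAIN at once.
 * Empty buffer, blocking mode: a real driver would sleep here;
 * this sketch returns 0 and leaves the retry to the caller.
 */
static int skel_try_read(struct skel_buf *b, unsigned int f_flags,
                         unsigned char *dst, int count)
{
    int n = 0;

    assert(b != NULL && dst != NULL);
    if (b->count == 0)
        return (f_flags & O_NONBLOCK) ? -EAGAIN : 0;

    while (n < count && b->count > 0) {
        dst[n++] = b->data[b->head];
        b->head = (b->head + 1) % SKEL_BUFSIZE;
        b->count--;
    }
    return n;   /* may be fewer bytes than requested */
}
```

Note that both modes return a short count when some, but not all, of the requested bytes are available; they differ only in what happens when the buffer is completely empty.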

Maybe now you are smiling as happily as I did when I first heard about these two modes. If these concepts are new for you, you might find the following hints helpful. Every device is opened by default in blocking mode, but you may choose non-blocking mode by setting the O_NONBLOCK flag in the open() call. You can even change the behaviour of your files later on with the fcntl() call. The fcntl() call is an easy one, and the man page will be sufficient for any programmer.
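From the user's side, switching modes looks roughly like the sketch below. Since the skel device will not exist on most machines, an empty pipe stands in for it; with a real device node you would pass O_NONBLOCK to open() directly. The helper names set_nonblocking() and errno_of_empty_nonblocking_read() are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Switch an already open descriptor between blocking (enable == 0)
 * and non-blocking (enable != 0) mode. Returns 0 on success, -1 on error.
 */
static int set_nonblocking(int fd, int enable)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (enable)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/*
 * Read from an empty pipe in non-blocking mode and return the errno
 * value that read() reported (0 if read() unexpectedly succeeded).
 */
static int errno_of_empty_nonblocking_read(void)
{
    int fds[2];
    char c;
    int err = 0;

    if (pipe(fds) == -1)
        return errno;
    if (set_nonblocking(fds[0], 1) == -1)
        err = errno;
    else if (read(fds[0], &c, 1) == -1)
        err = errno;   /* save before close() can overwrite it */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling set_nonblocking(fd, 0) afterwards restores the default blocking behaviour, in which the same read() would wait for data instead of failing with EAGAIN.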




