Kernel Korner - The New Work Queue Interface in the 2.6 Kernel

Interrupt latency is a key factor in the performance of a system. Work queues are one of several tools available to the driver writer to avoid doing time-consuming work when interrupts are disabled.
Creating New Worker Threads

In rare cases, the default worker threads may be insufficient. Thankfully, the work queue interface allows you to create your own worker threads and use those to schedule your bottom-half work. To create new worker threads, invoke the function:

struct workqueue_struct *
create_workqueue(const char *name)

For example, on system initialization, the kernel creates the default queues with:

keventd_wq = create_workqueue("events");

This function creates all of the per-processor worker threads. It returns a pointer to a struct workqueue_struct, which distinguishes this work queue from other work queues (such as the default one). Once you create the worker threads, you can queue work in a fashion similar to how work is queued with the default worker threads:

int queue_work(struct workqueue_struct *wq,
               struct work_struct *work)

Here, wq is a pointer to the specific work queue that you created using the call to create_workqueue(), and work is a pointer to your work structure (a struct work_struct). Alternatively, you can schedule work with a delay:

int queue_delayed_work(struct workqueue_struct *wq,
                       struct work_struct *work,
                       unsigned long delay)

This function works the same as queue_work(), except it delays the queuing of the work for delay jiffies. These two functions are analogous to schedule_work() and schedule_delayed_work(), except they queue the given work into the given work queue instead of the default one. Both functions return nonzero on success and zero on failure. Both functions may be called from both interrupt and process context.
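
Because delay is expressed in jiffies (timer ticks), its wall-clock length depends on the kernel's tick rate, HZ. The arithmetic can be sketched in user-space C as follows. This is only an illustration: model_ms_to_jiffies is a hypothetical name, the hz parameter stands in for HZ, and code running inside current kernels would normally call the kernel's own msecs_to_jiffies() helper instead.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel API): convert milliseconds to timer
 * ticks for a given tick rate, rounding up so the delay is never shorter
 * than requested. `hz` stands in for the kernel's HZ constant.
 */
unsigned long model_ms_to_jiffies(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999) / 1000;
}
```

With a tick rate of 1000, a 250 ms delay comes out as 250 jiffies; with a tick rate of 100, the same delay is 25 jiffies, and a 1 ms request is rounded up to a single jiffy.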

Finally, you may flush a specific work queue with the function:

void flush_workqueue(struct workqueue_struct *wq)

This function waits until all queued work on the wq work queue has completed before returning.
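
The queue-then-flush pattern can be illustrated with a small user-space toy model. It is not kernel code, and every model_* name is hypothetical. It assumes that a "failed" queue attempt means the work item was already pending, which is my understanding of the 2.6-era behavior. The model has no threads, so model_flush_workqueue() runs the handlers itself rather than waiting for a worker thread to do so.

```c
#include <assert.h>
#include <stddef.h>

/* User-space toy model only; none of these model_* names are kernel APIs. */
struct model_work {
    void (*func)(void *data);   /* handler to run later */
    void *data;                 /* argument passed to the handler */
    int pending;                /* nonzero while sitting on a queue */
    struct model_work *next;
};

struct model_workqueue {
    struct model_work *head, *tail;
};

/* Queue work for later; returns 0 if it was already pending, 1 otherwise. */
int model_queue_work(struct model_workqueue *wq, struct model_work *work)
{
    assert(wq != NULL && work != NULL);
    if (work->pending)
        return 0;
    work->pending = 1;
    work->next = NULL;
    if (wq->tail)
        wq->tail->next = work;
    else
        wq->head = work;
    wq->tail = work;
    return 1;
}

/* Run every queued handler in FIFO order, as a worker thread would. */
void model_flush_workqueue(struct model_workqueue *wq)
{
    assert(wq != NULL);
    while (wq->head) {
        struct model_work *work = wq->head;

        wq->head = work->next;
        if (!wq->head)
            wq->tail = NULL;
        work->pending = 0;      /* cleared first so the handler may requeue */
        work->func(work->data);
    }
}

/* Example handler: increments the int that data points to. */
void model_count(void *data)
{
    ++*(int *)data;
}
```

Queuing the same item twice before a flush runs its handler only once. After the flush, the item can be queued again.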


The work queue interface has been a part of the kernel since 2.5.41. In that time, a large number of drivers and subsystems have made it their method of deferring work. But is it the right bottom half for you? If you need to run your bottom half in process context, a work queue is your only option. Furthermore, if you are considering creating a kernel thread, a work queue may be a better choice. But what if you do not need a bottom half that can sleep? In that case, you may find tasklets are a better choice. They also are easy to use, but they do not run in a kernel thread. Because they are not run in process context, no context switch overhead is associated with their execution; therefore, they may offer you less overhead.

Robert Love is a kernel hacker involved in various projects. He is a mathematics and computer science student at the University of Florida and a kernel engineer at MontaVista Software. He can be reached at




A Simple Introduction to Device Drivers under Linux

Lokesh Venkateshiah's picture

Since the misty days of yore, the first step in learning a new programming language has been writing a program that prints "Hello, world!" (See the Hello World Collection for a list of more than 300 "Hello, world!" examples.) In this article, we will use the same approach to learn how to write simple Linux kernel modules and device drivers. We will learn how to print "Hello, world!" from a kernel module three different ways: printk(), a /proc file, and a device in /dev.

Preparation: Installing Kernel Module Compilation Requirements

For the purposes of this article, a kernel module is a piece of kernel code that can be dynamically loaded and unloaded from the running kernel. Because it runs as part of the kernel and needs to interact closely with it, a kernel module cannot be compiled in a vacuum. It needs, at minimum, the kernel headers and configuration for the kernel it will be loaded into. Compiling a module also requires a set of development tools, such as a compiler. For simplicity, we will briefly describe how to install the requirements to build a kernel module using Debian, Fedora, and the "vanilla" Linux kernel in tarball form. In all cases, you must compile your module against the source for the running kernel (the kernel executing on your system when you load the module into your kernel).

A note on kernel source location, permissions, and privileges: the kernel source customarily used to be located in /usr/src/linux and owned by root. Nowadays, it is recommended that the kernel source be located in a home directory and owned by a non-root user. The commands in this article are all run as a non-root user, using sudo to temporarily gain root privileges only when necessary. To set up sudo, see the sudo(8), visudo(8), and sudoers(5) man pages. Alternatively, become root, and run all the commands as root if desired. Either way, you will need root access to follow the instructions in this article.

Preparation for Compiling Kernel Modules Under Debian

The module-assistant package for Debian installs packages and configures the system to build out-of-kernel modules. Install it with:

$ sudo apt-get install module-assistant
That's it; you can now compile kernel modules. For further reading, the Debian Linux Kernel Handbook has an in-depth discussion on kernel-related tasks in Debian.

Fedora Kernel Source and Configuration

The kernel-devel package for Fedora includes all the necessary kernel headers and tools to build an out-of-kernel module for a Fedora-shipped kernel. Install it with:

$ sudo yum install kernel-devel
Again, that's all it takes; you can now compile kernel modules. Related documentation can be found in the Fedora release notes.

Vanilla Kernel Source and Configuration

If you choose to use the vanilla Linux kernel source, you must configure, compile, install, and reboot into your new vanilla kernel. This is definitely not the easy route and this article will only cover the very basics of working with vanilla kernel source.

The canonical Linux source code is hosted at The most recent stable release is linked to from the front page. Download the full source release, not the patch. For example, the current stable release is located at For faster download, find the closest mirror from the list at, and download from there. The easiest way to get the source is using wget in continue mode. HTTP is rarely blocked, and if your download is interrupted, it will continue where it left off.

$ wget -c ""
Unpack the kernel source:

$ tar xjvf linux-.tar.bz2
Now your kernel is located in linux-/. Change directory into your kernel and configure it:

$ cd linux-
$ make menuconfig
A number of really nice make targets exist to automatically build and install a kernel in many forms: Debian package, RPM package, gzipped tar, etc. Ask the make system for help to list them all:

$ make help
A target that will work on almost every distro is:

$ make tar-pkg
When finished building, install your new kernel with:

$ sudo tar -C / -xvf linux-.tar
Then create a symbolic link to the source tree in the standard location:

$ sudo ln -s /path/to/kernel/source /lib/modules/$(uname -r)/build
Now the kernel source is ready for compiling external modules. Reboot into your new kernel before loading modules compiled against this source tree.

"Hello, World!" Using printk()

For our first module, we'll start with a module that uses the kernel message facility, printk(), to print "Hello, world!". printk() is basically printf() for the kernel. The output of printk() is printed to the kernel message buffer and copied to /var/log/messages (with minor variations depending on how syslogd is configured).

Download the hello_printk module tarball and extract it:

$ tar xzvf hello_printk.tar.gz
This contains two files: Makefile, which contains instructions for building the module, and hello_printk.c, the module source file. First, we'll briefly review the Makefile.

obj-m := hello_printk.o
obj-m is a list of what kernel modules to build. The .o and other objects will be automatically built from the corresponding .c file (no need to list the source files explicitly).

KDIR := /lib/modules/$(shell uname -r)/build
KDIR is the location of the kernel source. The current standard is to link to the associated source tree from the directory containing the compiled modules.

PWD := $(shell pwd)
PWD is the current working directory and the location of our module source files.

default:
	$(MAKE) -C $(KDIR) M=$(PWD) modules
default is the default make target; that is, make will execute the rules for this target unless it is told to build another target instead. The rule here says to run make with a working directory of the directory containing the kernel source and compile only the modules in the $(PWD) (local) directory. This allows us to use all the rules for compiling modules defined in the main kernel source tree.

Now, let's run through the code in hello_printk.c.

The file begins by including the header files provided by the kernel that are required for all modules. They include things like the definition of the module_init() macro, which we will see later on.

static int __init hello_init(void)
{
        printk("Hello, world!\n");
        return 0;
}
This is the module initialization function, which is run when the module is first loaded. The __init keyword tells the kernel that this code will only be run once, when the module is loaded. The printk() line writes the string "Hello, world!" to the kernel message buffer. The format of printk() arguments is, in most cases, identical to that of printf(3).

module_init(hello_init);
The module_init() macro tells the kernel which function to run when the module first starts up. Everything else that happens inside a kernel module is a consequence of what is set up in the module initialization function.

static void __exit hello_exit(void)
{
        printk("Goodbye, world!\n");
}
module_exit(hello_exit);

Similarly, the exit function is run once, upon module unloading, and the module_exit() macro identifies the exit function. The __exit keyword tells the kernel that this code will only be executed once, on module unloading.

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Valerie Henson ");
MODULE_DESCRIPTION("\"Hello, world!\" minimal module");
MODULE_LICENSE() informs the kernel what license the module source code is under, which affects which symbols (functions, variables, etc.) it may access in the main kernel. A GPLv2 licensed module (like this one) can access all the symbols. Certain module licenses will taint the kernel, indicating that non-open or untrusted code has been loaded. Modules without a MODULE_LICENSE() tag are assumed to be non-GPLv2 and will result in tainting the kernel. Most kernel developers will ignore bug reports from tainted kernels because they do not have access to all the source code, which makes debugging much more difficult. The rest of the MODULE_*() macros provide useful identifying information about the module in a standard format.

Now, to compile and run the code. Change into the directory and build the module:

$ cd hello_printk
$ make
Then, load the module, using insmod, and check that it printed its message, using dmesg, a program that prints out the kernel message buffer:

$ sudo insmod ./hello_printk.ko
$ dmesg | tail
You should see "Hello, world!" in the output from dmesg. Now unload the module, using rmmod, and check for the exit message:

$ sudo rmmod hello_printk
$ dmesg | tail
You have successfully compiled and installed a kernel module!

Hello, World! Using /proc

One of the easiest and most popular ways to communicate between the kernel and user programs is via a file in the /proc file system. /proc is a pseudo-file system, where reads from files return data manufactured by the kernel, and data written to files is read and handled by the kernel. Before /proc, all user-kernel communication had to happen through a system call. Using a system call meant choosing between finding a system call that already behaved the way you needed (often not possible), creating a new system call (requiring global changes to the kernel, using up a system call number, and generally frowned upon), or using the catch-all ioctl() system call, which requires the creation of a special file that the ioctl() operates on (complex and frequently buggy, and very much frowned upon). /proc provides a simple, predefined way to pass data between the kernel and userspace with just enough framework to be useful, but still enough freedom that kernel modules can do what they need.

For our purposes, we want a file in /proc that will return "Hello, world!" when read. We'll use /proc/hello_world. Download and extract the hello_proc module tarball. We'll run through the code in hello_proc.c.

This time, we also include the header file for procfs, linux/proc_fs.h, which provides support for registering with the /proc file system.

The next function will be called when a process calls read() on the /proc file we will create. It is simpler than a completely generic read() system call implementation because we only allow the "Hello, world!" string to be read all at once.

static int
hello_read_proc(char *buffer, char **start, off_t offset, int size, int *eof,
void *data)
The arguments to this function deserve an explicit explanation. buffer is a pointer to a kernel buffer where we write the output of the read(). start is used for more complex /proc files; we ignore it here. offset tells us where to begin reading inside the "file"; we only allow an offset of 0 for simplicity. size is the size of the buffer in bytes; we must check that we don't write past the end of the buffer accidentally. eof is a short cut for indicating EOF (end of file) rather than the usual method of calling read() again and getting 0 bytes back. data is again for more complex /proc files and ignored here.

Now, for the body of the function:

char *hello_str = "Hello, world!\n";
int len = strlen(hello_str); /* Don't include the null byte. */
/* We only support reading the whole string at once. */
MODULE_DESCRIPTION("\"Hello, world!\" minimal module");
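
The read contract described above can be modeled in user space to show how the size, offset, and eof arguments interact. This is a sketch, not the module's code. model_read_proc is a hypothetical name, the start and data arguments are left out because the text ignores them, and returning 0 for a non-zero offset is an assumption of this sketch (the text says only that offset 0 is the supported case).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * User-space model of the read contract described above; model_read_proc
 * is a hypothetical name, and start/data are omitted because the text
 * ignores them.
 */
int model_read_proc(char *buffer, long offset, int size, int *eof)
{
    const char *hello_str = "Hello, world!\n";
    int len = (int)strlen(hello_str);   /* 14 bytes, excluding the null byte */

    assert(buffer != NULL && eof != NULL);

    if (size < len)         /* only whole-string reads are supported */
        return -EINVAL;
    if (offset != 0)        /* assumption: past the start reads as empty */
        return 0;

    memcpy(buffer, hello_str, (size_t)len);
    *eof = 1;               /* tell the caller there is nothing more */
    return len;
}
```

A call with a 32-byte buffer at offset 0 returns 14 and sets *eof. A call with a 4-byte buffer is rejected with -EINVAL.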
Then, we're ready to compile and load the module:

$ cd hello_proc
$ make
$ sudo insmod ./hello_proc.ko
Now, there is a file named /proc/hello_world that will produce "Hello, world!" when read:

$ cat /proc/hello_world
Hello, world!
You can create many more /proc files from the same driver, add routines to allow writing to /proc files, create directories full of /proc files, and more. For anything more complicated than this driver, it is easier and safer to use the seq_file helper routines when writing /proc interface routines. For further reading, see Driver porting: The seq_file interface.

Hello, World! Using /dev/hello_world

Now we will implement "Hello, world!" using a device file in /dev, /dev/hello_world. Back in the old days, a device file was a special file created by running a crufty old shell script named MAKEDEV which called the mknod command to create every possible file in /dev, regardless of whether the associated device driver would ever run on that system. The next iteration, devfs, created /dev files when they were first accessed, which led to many interesting locking problems and wasteful attempts to open device files to see if the associated device existed. The current version of /dev support is called udev, since it creates /dev links with a userspace program. When kernel modules register devices, they appear in the sysfs file system, mounted on /sys. A userspace program, udev, notices changes in /sys and dynamically creates /dev entries according to a set of rules usually located in /etc/udev/.

Download the hello world module tarball. We'll go through hello_dev.c.

#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <asm/uaccess.h>

As we can see from looking at the necessary header files, creating a device requires quite a bit more kernel support than our previous methods. fs.h includes the definitions for a file operations structure, which we must fill out and attach to our /dev file. miscdevice.h includes support for registering a miscellaneous device file. asm/uaccess.h includes functions for testing whether we can read or write to userspace memory without violating permissions.

hello_read() is the function called when a process calls read() on /dev/hello. It writes "Hello, world!" to the buffer passed in the read() call.

static ssize_t hello_read(struct file * file, char * buf,
                          size_t count, loff_t *ppos)
{
        char *hello_str = "Hello, world!\n";
        int len = strlen(hello_str); /* Don't include the null byte. */

        /* We only support reading the whole string at once. */
        if (count < len)
                return -EINVAL;

        /*
         * If file position is non-zero, then assume the string has
         * been read and indicate there is no more data to be read.
         */
        if (*ppos != 0)
                return 0;

        /*
         * Besides copying the string to the user provided buffer,
         * this function also checks that the user has permission to
         * write to the buffer, that it is mapped, etc.
         */
        if (copy_to_user(buf, hello_str, len))
                return -EINVAL;

        /* Tell the user how much data we wrote. */
        *ppos = len;

        return len;
}

Next, we create the file operations struct defining what actions to take when the file is accessed. The only file operation we care about is read.

static const struct file_operations hello_fops = {
        .owner = THIS_MODULE,
        .read = hello_read,
};
Now, create the structure containing the information needed to register a miscellaneous device with the kernel.

static struct miscdevice hello_dev = {
        /* We don't care what minor number we end up with, so tell the
         * kernel to just pick one. */
        .minor = MISC_DYNAMIC_MINOR,
        /* Name ourselves /dev/hello. */
        .name = "hello",
        /* What functions to call when a program performs file
         * operations on the device. */
        .fops = &hello_fops,
};
As usual, we register the device in the module's initialization function.

static int __init hello_init(void)
{
        int ret;

        /*
         * Create the "hello" device in the /sys/class/misc directory.
         * Udev will automatically create the /dev/hello device using
         * the default rules.
         */
        ret = misc_register(&hello_dev);
        if (ret)
                printk(KERN_ERR
                       "Unable to register \"Hello, world!\" misc device\n");

        return ret;
}
module_init(hello_init);

And remember to unregister the device in the exit function.

static void __exit hello_exit(void)
{
        misc_deregister(&hello_dev);
}
module_exit(hello_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Valerie Henson ");
MODULE_DESCRIPTION("\"Hello, world!\" minimal module");
Compile and load the module:

$ cd hello_dev
$ make
$ sudo insmod ./hello_dev.ko
Now there is a device named /dev/hello that will produce "Hello, world!" when read by root:

$ sudo cat /dev/hello
Hello, world!
But we can't read it as a regular user:

$ cat /dev/hello
cat: /dev/hello: Permission denied
$ ls -l /dev/hello
crw-rw---- 1 root root 10, 61 2007-06-20 14:31 /dev/hello
This is what happens with the default udev rule, which says that when a miscellaneous device appears, create a file in /dev named after the device and give it permissions 0660 (owner and group have read-write access, everyone else has no access). We would really like instead for the device to be readable by regular users and have a link to it named /dev/hello_world. In order to do this, we'll write a udev rule.

The udev rule has to do two things: create a symbolic link and change the permissions on the device to make it world-readable. The rule that accomplishes this is:

KERNEL=="hello", SYMLINK+="hello_world", MODE="0444"
We'll break the rule down into parts and explain each part.

KERNEL=="hello" says to execute the rest of the rule when a device with a name the same as this string (the == operator means "comparison") appears in /sys. The hello device appeared when we called misc_register() with a structure containing the device name "hello". See the result for yourself in /sys:

$ ls -d /sys/class/misc/hello/
SYMLINK+="hello_world" says to add (the += operator means append) hello_world to the list of symbolic links that should be created when the device appears. In our case, we know this is the only symbolic link in the list, but other devices may have multiple udev rules that create multiple different symbolic links, so it is good practice to add to the list instead of assigning to it.

MODE="0444" says to set the permissions of the original device file to the 0444 mode, which allows owner, group, and world all to read the file.
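
To see how octal modes such as 0660 and 0444 map onto the permission strings that ls -l prints, here is a short user-space sketch. model_mode_string is a hypothetical helper written for illustration; it is not part of udev or the kernel. It decodes only the low nine permission bits, not the leading file-type character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: render the low nine permission bits of a mode
 * (for example 0660 or 0444) in the rwx notation that ls -l prints.
 * `out` must have room for at least 10 bytes.
 */
void model_mode_string(unsigned int mode, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    assert(out != NULL);
    for (i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is other-execute. */
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    }
    out[9] = '\0';
}
```

Mode 0660 decodes to rw-rw---- and mode 0444 to r--r--r--, matching the ls -l listings for /dev/hello under the default rule and under the custom rule.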

In general, it is very important to use the correct operator (==, +=, or =), or unexpected things will happen.

Now that we understand what the rule does, let's install it in the /etc/udev directory. Udev rules files are arranged in much the same manner as the System V init scripts in /etc/init.d/. Udev executes every script in the udev rules directory, /etc/udev/rules.d, in alphabetical/numerical order. Like System V init scripts, the files in the /etc/udev/rules.d directory are usually symbolic links to the real rules files, with the symbolic links named so that the rules will be executed in the correct order.

Copy the hello.rules file from the hello_dev directory into the /etc/udev/ directory and create a link to it that will be executed before any other rules file:

$ sudo cp hello.rules /etc/udev/
$ sudo ln -s ../hello.rules /etc/udev/rules.d/010_hello.rules
Now, reload the hello world driver and look at the new /dev entries:

$ sudo rmmod hello_dev
$ sudo insmod ./hello_dev.ko
$ ls -l /dev/hello*
cr--r--r-- 1 root root 10, 61 2007-06-19 21:21 /dev/hello
lrwxrwxrwx 1 root root 5 2007-06-19 21:21 /dev/hello_world -> hello
Now we have /dev/hello_world! Finally, check that you can read the "Hello, world!" devices as a normal user:

$ cat /dev/hello_world
Hello, world!
$ cat /dev/hello
Hello, world!
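
The way cat reads a file explains why hello_read() returns 0 once the file position is non-zero: a reader keeps calling read() until it gets 0 bytes back. A minimal user-space sketch of that loop follows. model_read_all is a hypothetical helper, not taken from cat's source, and it works on any file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd until read() returns 0 (end of file),
 * the way cat does. Returns the number of bytes stored in out, or -1 on
 * error. Reading stops early if out fills up.
 */
long model_read_all(int fd, char *out, size_t cap)
{
    size_t total = 0;

    assert(out != NULL);
    while (total < cap) {
        ssize_t n = read(fd, out + total, cap - total);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal; retry */
            return -1;
        }
        if (n == 0)
            break;              /* EOF: the driver returned 0 bytes */
        total += (size_t)n;
    }
    return (long)total;
}
```

Against /dev/hello, the first read() returns the 14-byte string and the second returns 0, which ends the loop. Because hello_read() rejects requests shorter than the string, a comfortably large buffer (64 bytes, say) is the safe choice here.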

Worker Thread Execution in SMP System

Anil123's picture


On an SMP system, if we create a work queue, a worker thread is created on every processor. Can these threads execute in parallel, or will at most one of the worker threads be scheduled at a time?


Could this be updated for the 2.6.20 kernel?

chuck's picture

This is a nice introduction. It would be nice if it was updated for the 2.6.20 kernel.

Thanks for the wonderful information

kasiviswanathan's picture

Love has done a great job here by explaining almost the whole workqueue concept.

Thanks very much! I've been

Anonymous's picture

Thanks very much! I've been looking for a way to read the registers of an I2C device from an ISR or from some bottom half.
At first I was surprised to see my ISR/tasklet corrupt the kernel, but then I saw that the I2C functions sleep! :)
So I figured I needed a bottom half that is able to perform an I2C bus read.
Finally, with the help of this article, I've solved my problems.


Priority of custom threads?

Anonymous's picture

When we create work queues with custom threads, how can we make them run at a higher priority? Doesn't the work queue interface have a way to increase the priority of the worker thread(s)?