Kbuild: the Linux Kernel Build System
One amazing thing about Linux is that the same code base is used for a different range of computing systems, from supercomputers to very tiny embedded devices. If you stop for a second and think about it, Linux is probably the only OS that has a unified code base. For example, Microsoft and Apple use different kernels for their desktop and mobile OS versions (Windows NT/Windows CE and OS X/iOS). Two of the reasons this is possible on Linux are that the kernel has many abstraction layers and levels of indirection and because its build system allows for creating highly customized kernel binary images.
The Linux kernel has a monolithic architecture, which means that the whole kernel code runs in kernel space and shares the same address space. Because of this architecture, you have to choose the features your kernel will include at compile time. Technically, Linux is not a pure monolithic kernel, because it can be extended at runtime using loadable kernel modules. To load a module, the kernel must contain all the kernel symbols used in the module. If those symbols were not included in the kernel at compile time, the module will not be loaded due to missing dependencies. Modules are only a way to defer compilation (or execution) of a specific kernel feature. Once a kernel module is loaded, it is part of the monolithic kernel and shares the same address space of the code that was included at kernel compile time. Even when Linux supports modules, you still need to choose at kernel compile time most of the features that will be built in the kernel image and the ones that will allow you to load specific kernel modules once the kernel is executing.
For this reason, it is very important to be able to choose what code you want to compile (or not) in a Linux kernel. The approach for achieving this is using conditional compilation. There are tons of configuration options for choosing whether a specific feature will be included. This is translated to deciding whether a specific C file, code segment or data structure will be included in the kernel image and its modules.
So, an easy and efficient way to manage all these compilation options is needed. The infrastructure to manage this—building the kernel image and its modules—is known as the Kernel Build System (kbuild).
I don't explain the kbuild infrastructure in too much detail here, because the Linux kernel documentation provides a good explanation (Documentation/kbuild). Instead, I discuss the kbuild basics and show how to use it to include your own code in a Linux kernel tree, such as a device driver.
The Linux Kernel Build System has four main components:
Config symbols: compilation options that can be used to compile code conditionally in source files and to decide which objects to include in a kernel image or its modules.
Kconfig files: define each config symbol and its attributes, such as its type, description and dependencies. Programs that generate an option menu tree (for example,
make menuconfig) read the menu entries from these files.
.config file: stores each config symbol's selected value. You can edit this file manually or use one of the many
makeconfiguration targets, such as menuconfig and xconfig, that call specialized programs to build a tree-like menu and automatically update (and create) the .config file for you.
Makefiles: normal GNU makefiles that describe the relationship between source files and the commands needed to generate each make target, such as kernel images and modules.
Now, let's look at each of these components in more detail.
Compilation Options: Configuration Symbols
Configuration symbols are the ones used to decide which features will be included in the final Linux kernel image. Two kinds of symbols are used for conditional compilation: boolean and tristate. They differ only in the number of values that each one can take. But, this difference is more important than it seems. Boolean symbols (not surprisingly) can take one of two values: true or false. Tristate symbols, on the other hand, can take three different values: yes, no or module.
Not everything in the kernel can be compiled as a module. Many features are so intrusive that you have to decide at compilation time whether the kernel will support them. For example, you can't add Symmetric Multi-Processing (SMP) or kernel preemption support to a running kernel. So, using a boolean config symbol makes sense for those kinds of features. Most features that can be compiled as modules also can be added to a kernel at compile time. That's the reason tristate symbols exist—to decide whether you want to compile a feature built-in (y), as a module (m) or not at all (n).
There are other config symbol types besides these two symbols, such as strings and hex. But, because they are not used for conditional compilation, I don't cover those here. Read the Linux kernel documentation for a complete discussion of config symbols, types and uses.
Defining Configuration Symbols: Kconfig Files
Configuration symbols are defined in files known as Kconfig files. Each
Kconfig file can describe an arbitrary number of symbols and can
also include (source) other Kconfig files. Compilation targets that
construct configuration menus of kernel compile options, such as
menuconfig, read these files to build the tree-like structure. Every
directory in the kernel has one Kconfig that includes the Kconfig files
of its subdirectories. On top of the kernel source code directory, there is
a Kconfig file that is the root of the options tree. The menuconfig
(scripts/kconfig/mconf), gconfig (scripts/kconfig/gconf) and other
compile targets invoke programs that start at this root Kconfig and
recursively read the Kconfig files located in each subdirectory to build their
menus. Which subdirectory to visit also is defined in each Kconfig file and
also depends on the config symbol values chosen by the user.
Storing Symbol Values: .config File
All config symbol values are saved in a special file called .config. Every time you want to change a kernel compile configuration, you execute a make target, such as menuconfig or xconfig. These read the Kconfig files to create the menus and update the config symbols' values using the values defined in the .config file. Additionally, these tools update the .config file with the new options you chose and also can generate one if it didn't exist before.
Because the .config file is plain text, you also can change it without needing any specialized tool. It is very convenient for saving and restoring previous kernel compilation configurations as well.
Compiling the Kernel: Makefiles
The last component of the kbuild system is the Makefiles. These are used to build the kernel image and modules. Like the Kconfig files, each subdirectory has a Makefile that compiles only the files in its directory. The whole build is done recursively—a top Makefile descends into its subdirectories and executes each subdirectory's Makefile to generate the binary objects for the files in that directory. Then, these objects are used to generate the modules and the Linux kernel image.
Putting It All Together: Adding the Coin Driver
Now that you know more about kbuild system basics, let's consider a practical example—adding a device driver to a Linux kernel tree. The example driver is for a very simple character device called coin. The driver's function is to mimic a coin flipping and returning on each read one of two values: head or tail. The driver has an optional feature that exposes previous flip statistics using a special debugfs virtual file. Listing 1 shows an example interaction with the coin device.
Listing 1. Coin Character Device Semantics
root@localhost:~# cat /dev/coin tail root@localhost:~# cat /dev/coin head root@sauron:/# cat /sys/kernel/debug/coin/stats head=6 tail=4
To add a feature to a Linux kernel (such as the coin driver), you need to do three things:
Put the source file(s) in a place that makes sense, such as drivers/net/wireless for Wi-Fi devices or fs for a new filesystem.
Update the Kconfig for the subdirectory (or subdirectories) where you put the files with config symbols that allow you to choose to include the feature.
Update the Makefile for the subdirectory where you put the files, so the build system can compile your code conditionally.
Because this driver is for a character device, put the coin.c source file in drivers/char.
The next step is to give the user the option to compile the coin driver. To do this, you need to add two configuration symbols to the drivers/char/Kconfig file: one to choose to add the driver to the kernel and a second to decide whether the driver statistics will be available.
Like most drivers, coin can be built in the kernel, included as a module
or not included at all. So, the first config symbol, called
COIN, is of
type tristate (y/n/m). The second symbol,
COIN_STAT, is used to decide
whether you want to expose the statistics. Clearly this is a binary
decision, so the symbol type is bool (y/n). Also, it doesn't make sense
to add the coin statistics to the kernel if you choose not to include
the coin driver itself. This behavior is very common in the kernel—for
example, you can't add a block-based filesystem, such as ext3 or fat32, if
you didn't enable the block layer first. Obviously, there is some kind of
dependency between symbols, and you should model this. Fortunately, you can
describe config symbols' relationships in Kconfig files using the depends
on keyword. When, for example, the
menuconfig target generates the
compilation options menu tree, it hides all the options whose symbol
dependencies are not met. This is just one of many keywords available
for describing symbols in a Kconfig file. For a complete description of the
Kconfig language, refer to kbuild/kconfig-language.txt in the Linux kernel
Javier Martinez Canillas is a longtime Linux user, administrator and open-source advocate developer. He has an MS from the Universitat Autònoma de Barcelona and works as a Linux kernel engineer.