Core Knowledge That a Modern Linux Kernel Developer Should Have
The Linux Kernel is written in the C programming language, so C is the most important language for a Linux Kernel developer. The kernel is written in GNU C, which extends standard C with additional keywords and attributes, so initially it could only be built with GCC (now it is also possible to build it using LLVM). I would recommend learning a modern C standard such as C11 and, in addition, the GNU extensions, to be able to read kernel code effectively. Small, architecture-specific parts of the kernel and some highly optimized parts of several drivers are written in assembly language, which makes it the second language of choice. There are three main architectures nowadays: x86, ARM, and RISC-V. Which assembly language to learn depends on your hardware platform.
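To give a feel for what the GNU extensions look like, here is a minimal userspace sketch, assuming GCC or Clang. The `likely`/`unlikely` macros mirror the way the kernel's hints are usually written; `max_once` and `sum_positive` are hypothetical names invented for this illustration, not kernel APIs.

```c
#include <stddef.h>

/* Branch-prediction hints built on the GCC/Clang builtin __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Statement expression plus __typeof__: each argument is evaluated exactly
 * once, so max_once(i++, j) increments i only one time.
 * max_once is a hypothetical name used only in this sketch. */
#define max_once(a, b) __extension__ ({   \
        __typeof__(a) _a = (a);           \
        __typeof__(b) _b = (b);           \
        _a > _b ? _a : _b;                \
})

/* Sum the non-negative entries of an array. The attribute asks the compiler
 * to warn when a caller ignores the return value. */
__attribute__((warn_unused_result))
static long sum_positive(const int *values, size_t count)
{
        long total = 0;

        for (size_t i = 0; i < count; i++) {
                if (unlikely(values[i] < 0))
                        continue;       /* negatives are treated as the rare case */
                total += values[i];
        }
        return total;
}
```

None of this is standard C11: the statement expression, `__typeof__`, the attribute, and the builtin are all compiler extensions, which is exactly why reading kernel code requires knowing them.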
You should definitely also look at Rust, which is gaining popularity in the Linux Kernel community as a safer and more reliable alternative to C.
Linux is a highly configurable system, and its configurability is based on the kernel build system, Kbuild. Every developer should know the basics of Kbuild and Make to be able to extend or modify the kernel code successfully. Last but not least is shell scripting. It is hard to imagine kernel development without the command line, and a developer inevitably has to write shell scripts to automate repetitive tasks.
Software environment
Linux Kernel development is inextricably linked to the Git source control system. It is impossible to imagine today's kernel development workflow without it, so Git knowledge is a requirement.
Unless kernel developers run their kernel on specific or customized hardware, emulation is the developer's best friend. The most popular platform for this is QEMU/KVM. A typical workflow looks like this: a developer makes some changes to the kernel or a driver, builds it, copies it into a virtual environment, and tests it there. If all is OK, the developer then tests the changes on real hardware. If something goes wrong, only the kernel inside the virtual machine crashes, and it is easy to shut down the VM, fix the error, and repeat the development/debug cycle. Without virtualization, we would have to restart the real machine on each kernel crash, and development time would increase by an order of magnitude.
Unlike userspace, the kernel has limited debugging capabilities. The most popular method of kernel debugging was (and sometimes still is) inserting printk function calls, which write their output to the kernel's circular log buffer, into the code in question and analyzing that output in userspace with the dmesg -kw command. The in-kernel ftrace framework was introduced during the 2.6 kernel series. It has been developed ever since and is now comprehensive and robust. It offers many ways of debugging and many output formats. Its most popular feature is tracing function calls for the whole kernel, a part of it, or specific modules, and exposing the output in a special file. It saves developers hours and hours of debugging. Besides, it has close to zero overhead while inactive. Every modern kernel developer should be aware of ftrace. There are many cases when a developer finds that their kernel module is simply slow. This is where perf comes in: perf is a pair of an in-kernel profiling framework and a userspace tool that helps analyze in-kernel performance. The most sophisticated and flexible tool for gathering kernel runtime information is eBPF. The eBPF framework enables running user-supplied programs inside the kernel and passing information to userspace. By enabling user-defined kernel telemetry, this framework has revolutionized kernel observability.
One of the main domains of kernel development is embedded development. Indeed, most embedded devices, from IoT devices for smart homes to Android-based smartphones, carry a flavor of the Linux Kernel on board. There are two main build systems in the embedded world: Buildroot and Yocto. The former is simpler and more straightforward; the latter is more flexible and sophisticated. Both are intended to build a highly customized Linux distribution, that is, the Linux Kernel plus a set of userspace software, tailored to a particular hardware board. It is worth mentioning that an embedded developer must be able to create and update the dts files that describe the set of hardware components on the board. The main bootloader in the embedded world is u-boot, and knowledge of it is also a requirement. As for userspace, one of the simplest and best-known minimalistic frameworks is busybox, which contains only a minimal set of necessary utilities. It is very small and therefore convenient for both embedded and emulated (QEMU/KVM) development.
Last but not least, the dev environment. Most Linux Kernel developers use the vim (or emacs) text editor in the terminal, tmux as a modern and convenient terminal multiplexer, and cscope for building a cross-reference of the kernel source code.
Linux Kernel Core Concepts
Linux Kernel development technical skills fall into two categories: general and domain-specific. General skills should be known by every kernel developer, while domain-specific ones are needed by developers working in a particular domain, for example networking, storage, virtualization, cryptography, or embedded. The Linux Kernel is huge, and it is impossible to know every part at the same level of detail.
Let's start with general skills:
- Kernel coding style - The Linux Kernel has its own coding style, which can vary slightly from one subsystem to another. It is always a good idea to periodically check your code with the dedicated script shipped in the kernel source tree (scripts/checkpatch.pl).
- The Kernel coding patterns - The kernel has a set of recommended coding patterns. The best known of these is acquiring and releasing resources during multi-step initialization using goto-based error unwinding.
- The Kernel internal data structures - A few data structures are used throughout the kernel, and every developer must be aware of them: singly and doubly linked lists, queues, hash tables, binary trees, red-black trees, maple trees, and so on.
- Synchronization primitives - Back at the beginning of the 2000s, the first commodity SMP CPUs were introduced. Since then, every kernel developer must write code with concurrency in mind. The Linux Kernel has a lot of synchronization primitives, each for a different purpose: atomic operations, spin locks, semaphores, mutexes, RCU (a class of lockless algorithms), etc.
- Interrupt handling: top and bottom halves - The Linux Kernel splits interrupt handling into top and bottom halves. The top half is intended to handle an interrupt as quickly as possible and return, while the bottom half is deferred work that further processes the results delivered by the top half. For example, the top half copies a new packet from a network card to main memory as fast as possible and returns to await the next one. The bottom half, which runs later as deferred work, examines the received packet and handles it: it populates some fields, creates the appropriate data structures, and passes the packet to the kernel's networking stack. Every developer must be aware of this interrupt handling scheme and design their interrupt handlers accordingly.
- Deferred work - A common situation in Linux Kernel development is postponing part of a job until some moment in the future. Interrupts, mentioned in the previous point, are a good example. There are several deferred work mechanisms in the kernel for different situations: task queues, softirqs, tasklets, workqueues, etc.
- Memory management - Kernel developers should be aware of two layers of memory management: the general-purpose kmalloc/kfree functions, and the slab layer, which keeps objects of the same size in dedicated caches to reduce memory fragmentation. Note that in current kernels kmalloc is itself served from slab caches, which in turn sit on top of the page allocator.
- Virtual File System - Regardless of the type of the underlying filesystem (ext3, ext4, zfs, lustrefs, xfs, etc.), the kernel maintains a universal interface on top of it. It is worth obtaining general knowledge of VFS, as filesystem interaction is one of the most popular communication methods between kernel and userspace.
- Scheduler - The scheduler manages all the processes in the operating system: kernel and userspace. The developer must know its basics.
- System call interface - The main way of communication between the kernel and userspace is the system call interface. The libc library in userspace encapsulates it and provides more comprehensive and convenient functions for a developer, but sometimes it is necessary to invoke a system call directly, so both userspace and kernel programmers should know how to do this. In very rare cases, a kernel developer may want to add a new system call or trace its arguments using ftrace/strace. This is useful knowledge.
- /sys and /proc directories - The second most popular way of interaction between the kernel and userspace is a filesystem. In particular, loads of information and settings are contained in the /sys and /proc directories. It is worth learning their structure.
- Loadable Kernel Modules - One of the main occupations of a Linux Kernel developer is developing drivers for new hardware devices. Linux drivers are made in the form of Loadable Kernel Modules, binaries in a special format built from source code that follows a particular structure. These modules can be loaded and unloaded without a system reboot. The developer should know the structure of kernel modules in general, and the additional rules for character, block, and network devices. They should also be aware of the ways of communicating with userspace: sysfs attributes, MMIO, kernel module parameters, and so on.
- Udev - Driver developers should be aware of the Udev subsystem that implements infrastructure supporting running user scripts when a device is hotplugged.
- Fault injection framework - Allows testing unusual code paths by injecting error results into functions that normally always succeed, such as memory allocation functions like kmalloc.
- Kernel Sanitizers - KASAN, KMSAN, and so on. These dynamic tools catch bad situations such as memory corruption. It is always worth loading a new kernel module under a sanitizer-enabled kernel, running a workload, and trying to catch subtle runtime bugs.
- Locking correctness validator - Parts of the kernel/module code implement sophisticated locking schemes. This often leads to deadlocks and livelocks, and such code is hard to debug. The Lockdep runtime validator catches these situations and saves hours of debugging effort.
- Kdump/Kexec - There are situations when it is almost impossible to debug code, especially if it relates to early system boot code. This is where Kdump/Kexec comes to help. It loads a second, crash kernel, which takes over from the crashed kernel and makes a dump of it for further analysis.
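The multi-step resource initialization pattern from the coding patterns item is commonly written with goto labels that unwind in reverse order. Here is a minimal userspace sketch of that idea, with malloc/free standing in for kernel allocators. All `demo_` names and the `fail_step` parameter are hypothetical, introduced only so each error path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context with three resources acquired in order. */
struct demo_ctx {
        char *rx_buf;
        char *tx_buf;
        int  *stats;
};

/* Stand-in allocator: returns NULL when `step` equals `fail_step`.
 * In kernel code this would be kmalloc() or a similar call. */
static void *demo_alloc(size_t size, int step, int fail_step)
{
        return step == fail_step ? NULL : malloc(size);
}

/* Returns 0 on success or -ENOMEM on failure. On failure, everything
 * acquired so far is released in reverse order of acquisition. */
static int demo_ctx_init(struct demo_ctx *ctx, int fail_step)
{
        ctx->rx_buf = demo_alloc(256, 1, fail_step);
        if (!ctx->rx_buf)
                goto err_out;

        ctx->tx_buf = demo_alloc(256, 2, fail_step);
        if (!ctx->tx_buf)
                goto err_free_rx;

        ctx->stats = demo_alloc(sizeof(*ctx->stats), 3, fail_step);
        if (!ctx->stats)
                goto err_free_tx;

        *ctx->stats = 0;
        return 0;

err_free_tx:
        free(ctx->tx_buf);
        ctx->tx_buf = NULL;
err_free_rx:
        free(ctx->rx_buf);
        ctx->rx_buf = NULL;
err_out:
        return -ENOMEM;
}

/* Release a fully initialized context. */
static void demo_ctx_destroy(struct demo_ctx *ctx)
{
        free(ctx->stats);
        free(ctx->tx_buf);
        free(ctx->rx_buf);
}
```

The labels are named after what they undo, so adding a fourth resource means adding one allocation, one check, and one label, without touching the existing error paths.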
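The linked lists mentioned in the data structures item are intrusive in the kernel: the list node lives inside the payload structure, and a macro recovers the payload from the node. Below is a simplified userspace sketch of that idea. It is modeled on the concept behind the kernel's list_head and container_of, but the `demo_` names are hypothetical and this is not the kernel's implementation.

```c
#include <stddef.h>

/* Minimal circular, doubly linked list node. */
struct demo_list {
        struct demo_list *next, *prev;
};

/* Recover a pointer to the enclosing structure from a pointer to a member. */
#define demo_container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points at itself. */
static void demo_list_init(struct demo_list *head)
{
        head->next = head;
        head->prev = head;
}

/* Insert `node` just before `head`, that is, at the tail of the list. */
static void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
        node->prev = head->prev;
        node->next = head;
        head->prev->next = node;
        head->prev = node;
}

/* The payload embeds the list node rather than the list pointing at it. */
struct demo_packet {
        int length;
        struct demo_list link;
};

/* Walk the list and add up the payload lengths. */
static int demo_total_length(struct demo_list *head)
{
        int total = 0;

        for (struct demo_list *pos = head->next; pos != head; pos = pos->next)
                total += demo_container_of(pos, struct demo_packet, link)->length;
        return total;
}
```

Because the node is embedded, one structure can sit on several lists at once by carrying several node members, and no separate allocation is needed per list entry.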
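The split between top and bottom halves from the interrupt handling item can be illustrated with a single-threaded userspace simulation. This is only a sketch of the division of labor: there is no real interrupt context, no locking, and no kernel API here, and every `demo_`/`DEMO_` name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_RING_SLOTS 4
#define DEMO_FRAME_MAX  64

/* Hypothetical per-device state shared by both halves. */
struct demo_nic {
        unsigned char ring[DEMO_RING_SLOTS][DEMO_FRAME_MAX];
        size_t        len[DEMO_RING_SLOTS];
        unsigned int  head;     /* next slot the top half writes */
        unsigned int  tail;     /* next slot the bottom half reads */
        unsigned long rx_bytes; /* updated by the bottom half */
        unsigned long dropped;  /* frames lost because the ring was full */
};

/* "Top half": do the minimum (copy the frame out of the device) and return.
 * Returns 1 if work was queued for the bottom half, 0 if the frame was dropped. */
static int demo_top_half(struct demo_nic *nic, const void *frame, size_t len)
{
        unsigned int slot;

        if (len > DEMO_FRAME_MAX || nic->head - nic->tail == DEMO_RING_SLOTS) {
                nic->dropped++;
                return 0;
        }
        slot = nic->head % DEMO_RING_SLOTS;
        memcpy(nic->ring[slot], frame, len);
        nic->len[slot] = len;
        nic->head++;
        return 1;
}

/* "Bottom half": runs later and does the slower processing for every queued
 * frame. Returns the number of frames handled. */
static unsigned int demo_bottom_half(struct demo_nic *nic)
{
        unsigned int handled = 0;

        while (nic->tail != nic->head) {
                unsigned int slot = nic->tail % DEMO_RING_SLOTS;

                nic->rx_bytes += nic->len[slot]; /* stand-in for protocol work */
                nic->tail++;
                handled++;
        }
        return handled;
}
```

In a real driver the top half would be the registered interrupt handler and the bottom half would be scheduled through one of the deferred work mechanisms listed above; the sketch only shows why the fast path stays short while the expensive work is batched for later.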
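Calling a system call directly, as the system call interface item suggests, can be sketched in userspace with the generic syscall() function that glibc provides on Linux. The example assumes a Linux system; `raw_getpid` is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for the process ID through the raw system call interface,
 * bypassing the getpid() convenience wrapper from libc. */
static pid_t raw_getpid(void)
{
        return (pid_t)syscall(SYS_getpid);
}
```

The value returned by `raw_getpid()` should match what `getpid()` reports. The same approach is handy for system calls that a given libc version does not wrap yet.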
When it comes to specific, domain-related skills, the only right answer is: it depends. The Linux Kernel contains a lot of specialized frameworks. Just one example: it is worth learning the I2C (SMBus), SPI, and GPIO frameworks for embedded development.
Userspace tools
Kernel developers should have some userspace knowledge and use common tools such as:
- bash (or alternative shell) - Building the kernel, scripting routine actions, etc.
- ssh (secure network shell) - Used to log in and work on the target machine, be it a remote network machine, a VM, or an embedded device.
- tmux (terminal multiplexer) - Supports multi-terminal configurations. A convenient tool for kernel development: one window can show the logs of the kernel build process, a second can hold a remote ssh shell, and a third can have the vim editor open.
- minicom - This is the main tool for working with embedded boards that are not equipped with Ethernet/Wi-Fi/Bluetooth modules. The Linux Kernel can be configured to create a UART console during early boot. A developer with a 2-wire UART-to-USB adapter can then connect the RX and TX wires to the appropriate GPIO pins, plug the USB adapter into their laptop, run minicom, and obtain a connection.
- vim - The editor of choice for most kernel developers. Besides, many servers have no GUI, and vim is often the only choice there.
- gdb - Can be useful for debugging kernel Oops errors. Given the instruction address from the Oops and an uncompressed kernel image with debugging symbols (vmlinux), it is possible to walk the stack trace of the error and analyze the code.
- Passion - Kernel development is hard and thorough work. It is impossible to succeed in it without being inspired.
- Patience - This type of work requires patience in all senses. First, it is hard work that implies careful code design and debugging, sometimes hours of debugging effort to detect and fix a small bug. Second, the ever-changing nature of the kernel requires keeping yourself updated on the latest changes and being ready to update your code accordingly, especially code that has been accepted into the mainline kernel. Third, working with the community is a hard and sometimes controversial process. Getting your idea across is not so easy.
- Persistence - Kernel developers should be persistent in constant learning and in communicating with the community if they want to get their code accepted into the mainline kernel.