Debugging Kernel Modules with User Mode Linux
When you write programs in user space, the worst thing that can happen to your program is a core dump. Your program did something very wrong, so the operating system decided to give you all of its memory and state information back to you in the form of a core file. Core files can then be used to debug your program and fix the problem.
When you program in the kernel, there is no operating system to step in and safely stop your code from running and tell you that you have a problem. The Linux kernel is pretty nice to its own code. Sometimes it can survive a panic, if you are doing something wrong that is relatively benign (these panics are typically called oopses). But, there is nothing to stop your code from overwriting or accessing memory locations from anywhere in the kernel's address space. Also, if your module hangs, the kernel hangs (technically, your current kernel thread hangs, but the result is usually the same).
These problems may sound benign to the naïve, but they are serious issues. If the kernel panics, you rarely know exactly what caused the panic. The typical solution is to put printks everywhere and hope that you stumble across the problem before the messages are lost to the reboot. All of this is assuming that you do not corrupt your filesystem. I have lost an entire filesystem before due to a poorly timed panic (and due to the fact that a badly initialized pointer was overwriting some of ext2's internal structures).
The first thing you learn when kernel programming is to keep all your code on NFS. Files remain safe on another machine. But, that does not save you the time of having e2fsck run every time you panic. Plus, you still can lose your filesystem, even if your source code is safe on another machine.
So, with all of these issues, it is not surprising how few have entered the realm of kernel programming. Now, all that can change.
Back in the mainframe days, when timesharing machines were the norm, the idea of a virtual machine was born. A virtual machine is an encapsulated computer completely at your disposal. A program on a virtual machine has no real access to the physical hardware. All hardware access is controlled by the machine or emulator.
VMware (www.vmware.com) has a very powerful virtual machine that allows you to run any x86-based operating system under Windows NT, 2000, XP or Linux. SoftPC (an 8086 emulator allowing you to run Windows and DOS programs) has been available on Motorola 68k-based computers (i.e., the Macintosh) since 1988.
True virtual machines are sometimes too expensive for the learner's budget. (VMware Workstation for Linux costs $299 US from their web site.) Thankfully, there is now a free alternative for those only wanting to run Linux: User-Mode Linux (UML).
User-Mode Linux (user-mode-linux.sourceforge.net) is not a complete virtual machine. It does not emulate different hardware or give you the ability to run other operating systems. But, it does allow you to run a kernel in user space. This gives you several benefits when it comes to development: the host filesystem is safe from corruption, the virtual filesystem is undoable (which makes it safe from corruption), you can run multiple machines on one machine (this is useful for testing intermachine communication, i.e., network messages, without having to use multiple machines) and it is very easy to run the kernel in a debugger.
Running UML is easy. You can download one of the binary packages (kernel binaries, plus a couple of tools), or you can download the kernel patch. You also need to download a filesystem. I'd recommend playing with the binaries first, then building a custom kernel to suit your needs. The HOWTO covers all of these topics and more.
One useful benefit of UML is Copy-on-Write files. These files allow you to modify a virtual filesystem, without modifying the base filesystem. All writes or modifications to a filesystem are stored in these files, typically ending with the extension .cow.
So, when you are working, and you panic the filesystem, all you do is remove the .cow file (which will be recreated), and your corrupted filesystem is restored to its pristine version. (There are also tools to incorporate the changes in a .cow file back into the original filesystem, if you want to keep your changes.)
Once you have UML up and running, it's time to play. I've written a very simple kernel module for testing. It uses four devices, /dev/gentest[0-3]. The module treats each device a little differently. Device 1 is a sink (just like /dev/null). Device 2 stores a string for later retrieval. You can read the status of the module from device 3, and device 0 could be any of the other three devices, depending on how it is configured. (You can change the configuration with ioctl calls.) The kernel module is available from www.frascone.com/kHacking/gentest-0.1.tar.gz.
Webinar: 8 Signs You’re Beyond Cron
11am CDT, April 29th
Join Linux Journal and Pat Cameron, Director of Automation Technology at HelpSystems, as they discuss the eight primary advantages of moving beyond cron job scheduling. In this webinar, you’ll learn about integrating cron with an enterprise scheduler.Join us!
- March 2015 Issue of Linux Journal: High-Performance Computing
- Not So Dynamic Updates
- Users, Permissions and Multitenant Sites
- April 2015 Video Preview
- New Products
- Security in Three Ds: Detect, Decide and Deny
- Flexible Access Control with Squid Proxy
- DevOps: Everything You Need to Know
- Tighten Up SSH
- Non-Linux FOSS: MenuMeters