Debugging Kernel Modules with User Mode Linux
So, let's make a bug, and a nasty one. Say that when someone opens device 4 (cat /dev/gentest4), the module hangs in an endless loop: for(;;) i++; (see Listing 1). Deadlocks and hangs are common programming errors, and they can be hard to find. Typically, programmers just use printks to locate them: printk("Got here!\n");. This type of debugging works, but you still hang the system several times before you find the problem, and with an fsck after every hard reset, it can get ugly. With UML, you just add the printks and reboot to a fresh filesystem each time you test.
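The printk approach can be sketched in user space as follows. This is only an illustration, not the article's Listing 1: TRACE stands in for printk, gen_open_sketch and its max_spins parameter are hypothetical, and the loop is bounded so that the example terminates, whereas the module's for(;;) never does.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for printk(); a real module would call printk() here instead. */
#define TRACE(msg) fprintf(stderr, "%s:%d: %s\n", __func__, __LINE__, (msg))

/*
 * Hypothetical open handler. Minor 4 enters the suspect loop. The article's
 * bug is an unbounded for(;;) i++; max_spins exists only so that this
 * sketch returns. Returns the number of loop iterations executed.
 */
unsigned long gen_open_sketch(int minor, unsigned long max_spins)
{
    volatile unsigned long i = 0;

    assert(minor >= 0);            /* device minors are non-negative */

    TRACE("entered open");
    if (minor == 4) {
        TRACE("before loop");      /* last message seen when the module hangs */
        while (i < max_spins)
            i++;
        TRACE("after loop");       /* never reached with the real for(;;) */
    }
    TRACE("leaving open");
    return i;
}
```

Calling gen_open_sketch(4, 1000) prints all four messages on stderr. In the buggy module the output would stop at the message just before the loop, which is how a few checkpoints narrow down where the hang is.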
UML helps us find that bug with printks, but it would never have cost us more than a few reboots anyway. Now let's make our first really nasty bug. Say that when someone reads from device 5 (i.e., cat /dev/gentest5), the module starts to overwrite all of memory: memset(0, 0, 0xffffffff); (see Listing 2). Overwriting memory is a common error in C programs. In the kernel it is especially nasty and can sometimes cause an instant reboot, keeping you from seeing any printks that were generated. These bugs can still be isolated with printk, but it is a very time-consuming process.
From what I've covered so far, UML is a great debugging tool. You can use it to keep your filesystem safe when debugging modules. But there's something more: GDB.
As most experienced kernel programmers know, there is already a way to debug a kernel using GDB and the serial line. But, in my experience, it really doesn't work very well. The GDB shim in the kernel sometimes hangs, and you need two machines to make it work. I have successfully debugged kernels running in VMware on one machine by redirecting the virtual machine's serial port to a file, but it was slow going, since the kernel portion of the GDB code could still sometimes hang.
UML makes all that a thing of the past. With UML, you can run the entire virtual machine under GDB, attach to a kernel while it's running, or even attach after a panic. The easiest way to run UML under GDB is to add the debug flag to the UML command line. UML then spawns GDB in an xterm for you and stops the kernel. For most purposes, just type c to let the kernel continue booting (see Figure 1).
To debug the module, you first have to load the module, then tell GDB where the symbol file is, then set any breakpoints you need.
So, first things first, load the module. Included in the source code is a simple shell script called loadModule that loads the module and creates the devices if they do not already exist.
Once the module is loaded, press Ctrl-C inside the GDB window to pause the kernel, and look at the module_list pointer. The last module loaded should be at the head of the list. You can use GDB's printf command to get the address of the module; you'll need it when loading the symbol file (see Figure 2).
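Why the newest module sits at the head can be shown with a simplified model. This is a sketch under the assumption that the kernel pushes each newly loaded module onto the front of a singly linked list; struct module_sketch and the function names are hypothetical and carry only the fields needed here, not the kernel's real struct module.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct module. */
struct module_sketch {
    const char *name;
    struct module_sketch *next;
};

/* Plays the role of the kernel's module_list pointer. */
static struct module_sketch *module_list_sketch = NULL;

/* Push a newly loaded module onto the front of the list. */
void load_module_sketch(struct module_sketch *mod)
{
    assert(mod != NULL);
    mod->next = module_list_sketch;
    module_list_sketch = mod;
}

/* The head of the list is always the most recently loaded module. */
struct module_sketch *newest_module_sketch(void)
{
    return module_list_sketch;
}
```

After two loads, newest_module_sketch() returns the second module, and its next pointer leads to the first. That is the reason examining module_list right after running loadModule gives you the module you just inserted.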
Now, load the symbol file with the command add-symbol-file MODULE_PATH ADDRESS. The filename used is the filename on the host system, not on the virtual machine. After you answer “y” to an “Are you sure you know what you're doing?” question, the symbol file is loaded. You can check that it has been loaded correctly by re-examining the module_list pointer. Notice that the init and cleanup pointers now have the appropriate function names associated with their addresses (see Figure 3).
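To make the shape of that command concrete, here is a small helper that assembles it. It is only a sketch: format_add_symbol_file is a hypothetical name, and the path and address in the usage note are made-up placeholders, not values from the article's figures.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Writes "add-symbol-file <host_path> 0x<addr>" into buf.
 * host_path must be the module's object file on the HOST filesystem.
 * Returns the snprintf() result: the length the full command needs,
 * which is >= len if buf was too small.
 */
int format_add_symbol_file(char *buf, size_t len,
                           const char *host_path, unsigned long addr)
{
    assert(buf != NULL && host_path != NULL);
    return snprintf(buf, len, "add-symbol-file %s 0x%lx", host_path, addr);
}
```

For example, passing the placeholder path /home/user/gentest.o and the placeholder address 0xa0123000 yields the line add-symbol-file /home/user/gentest.o 0xa0123000, which is what you would type at the GDB prompt with your own path and the address printed in the previous step.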
Now that the module's symbols are loaded, you can set any breakpoints you want. I'll set a breakpoint at open and then try to cat one of the devices (see Figure 4).
Now, let's run our two tests and see how hard the bugs are to find when using GDB. On the first test, the system still hangs. But, now we can press Ctrl-C in the debugger and see where it is hung.
In the hang test (see Figure 5) it is obvious that the current stopping point is inside the for loop. If we really want to have fun, we can print out the value of i to see what it contains.
The memory overwrite is a bit more difficult to track down, not because it causes a panic, but because I used memset. Kernel code does not link against GNU libc; memset here comes from the kernel's own string.h, which expands it to inline assembly inside your code, so it looks like the bug is in string.h instead of your module. Still, GDB shows you which function the error occurred in, and you know it is inside a memset (see Figure 6).
Also, you can still examine any local variables in the current function (gRead) or any global variables to help you find the problem.
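The reason the fault appears to be in a header can be illustrated with a plain C inline function. This is a sketch, not the kernel's implementation: memset_sketch is a hypothetical byte loop standing in for the inline assembly, and gRead_sketch is a stand-in whose signature is an assumption, since the module's real gRead is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical header-style inline fill routine. Because it is inlined, its
 * instructions are emitted inside the caller, but their debug line
 * information still refers to the place where this body is written.
 */
static inline void *memset_sketch(void *s, int c, size_t n)
{
    unsigned char *p = s;

    while (n--)
        *p++ = (unsigned char)c;   /* a fault here is reported at this line */
    return s;
}

/* Hypothetical stand-in for the module's gRead; the real signature differs. */
int gRead_sketch(char *buf, size_t len)
{
    assert(buf != NULL);
    /* The buggy module instead calls memset(0, 0, 0xffffffff). */
    memset_sketch(buf, 0, len);
    return 0;
}
```

With a valid buffer the call simply zeroes it. With the article's arguments, the store inside the inline body would fault, so the debugger would point at the line in the header while the enclosing function is still the read handler.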