Debugging Kernel Modules with User Mode Linux
So, let's make a bug—a nasty one. Let's say when someone opens device 4 (cat /dev/gentest4), the module hangs in a nasty loop: for(;;) i++; (see Listing 1). Deadlocks or hangs are common errors when writing programs. They are sometimes hard to find. Typically programmers just use printks to locate the errors: printk("Got here!\n");. This type of debugging works, but you still hang the system several times before you find the problem. With constant fscks, it can get ugly. But, with UML, you just add in the printks and reboot to a fresh filesystem every time to test it.
UML will help us find that bug with printks, but it is nothing that would have caused us more than a few reboots. Now let's make our first really nasty bug. Let's say that when someone reads from device 5 (i.e., cat /dev/gentest5); the module starts to overwrite all memory: memset(0, 0, 0xffffffff); (see Listing 2). Overwriting memory is a common error in C programs. In the kernel it is especially nasty and can sometimes cause an instant reboot, keeping you from seeing any printks that are generated. These bugs can still be isolated with printk, but it is a very time-consuming process.
From what I've covered so far, UML is a great debugging tool. You can use it to keep your filesystem safe when debugging modules. But there's something more: GDB.
As most experienced kernel programmers know, there is already a way to debug a kernel using GDB and the serial line. But, in my experience, it really doesn't work very well. The GDB shim in the kernel sometimes hangs, and you need two machines to make it work. I have successfully debugged kernels running in VMware on one machine by redirecting the virtual machine's serial port to a file, but it was slow going, since the kernel portion of the GDB code could still sometimes hang.
UML makes all that a thing of the past. With UML, you can run the entire virtual machine under GDB, attach to a kernel while it's running, or even after a panic. The easiest way to run UML under GDB is to add the command-line flag debug to your runline. UML will then spawn GDB in an xterm for you and stop the kernel. For most purposes, just type c to allow the kernel to continue booting up (see Figure 1).
To debug the module, you first have to load the module, then tell GDB where the symbol file is, then set any breakpoints you need.
So, first things first, load the module. Included in the source code is a simple shell script called loadModule that loads the module and creates the devices if they do not already exist.
Once the module is loaded, press Ctrl-C inside the GDB window to pause the kernel, and look at the module_list pointer. The last module loaded should be at the head of the list. You can use a simple printf command to get the address of the module. You'll need it when loading the symbol file (see Figure 2).
Now, load the symbols file with the command add-symbol-file MODULE_PATH ADDRESS. The filename used is the filename on the host system, not on the virtual machine. After answering “y” to an “Are you sure you know what you're doing?” question, the symbol file is loaded. You can check that it has been loaded correctly by re-examining the module_list pointer again. Notice that now the init and cleanup pointers have the appropriate function names associated with their addresses (see Figure 3).
Now that the module is loaded, you can set any breakpoints you want. I'll set a breakpoint at open and then try to cat one of the devices (see Figure 4).
Now, let's run our two tests and see how hard the bugs are to find when using GDB. On the first test, the system still hangs. But, now we can press Ctrl-C in the debugger and see where it is hung.
In the hang test (see Figure 5) it is obvious that the current stopping point is inside the for loop. If we really want to have fun, we can print out the value of i to see what it contains.
Now, the memory overwrite is a bit more difficult. Not because it is a panic, but because I used memset. memset, in the GNU libc, ends up inserting inline assembly into your code, so it looks like your bug is in string.h, instead of your module. But, it still lets you know which function the error occurred in, and you still know it is inside of a memset (see Figure 6).
Also, you still can examine any local variables in the current function (gRead) or any global variables to help you find the problem.
- Transitioning to Python 3
- Red Hat OpenStack Platform
- Stepping into Science
- Tech Tip: Really Simple HTTP Server with Python
- Linux Journal December 2016
- CORSAIR's Carbide Air 740
- The Tiny Internet Project, Part II
- Radio Free Linux
- A Better Raspberry Pi Streaming Solution
- FutureVault Inc.'s FutureVault
Pick up any e-commerce web or mobile app today, and you’ll be holding a mashup of interconnected applications and services from a variety of different providers. For instance, when you connect to Amazon’s e-commerce app, cookies, tags and pixels that are monitored by solutions like Exact Target, BazaarVoice, Bing, Shopzilla, Liveramp and Google Tag Manager track every action you take. You’re presented with special offers and coupons based on your viewing and buying patterns. If you find something you want for your birthday, a third party manages your wish list, which you can share through multiple social- media outlets or email to a friend. When you select something to buy, you find yourself presented with similar items as kind suggestions. And when you finally check out, you’re offered the ability to pay with promo codes, gifts cards, PayPal or a variety of credit cards.Get the Guide