Remote Debugging of Loadable Kernel Modules with kgdb: a Knowledge-based Article for Getting Started

Lamphere describes a straightforward technique that allows kernel debugging in the safety of user space.

As many kernel developers and hackers have known for years, loadable/unloadable kernel modules (like user-space applications) are almost never bug-free. With the continuing use and development of loadable modules growing, due in fact to the obvious benefits of the mechanism (lean kernels, reduction of kernel recompiles/reboots, etc.), developers are in an increasing need for robust debugging tools capable of aiding in the identification of problem code. Traditionally, module developers have used various debugging techniques to help identify problematic code. These techniques have included:

  • printk statements around suspected areas of failure (probably the most useful)

  • Oops analysis (also quite useful)

  • Magic Key combinations (for recovery of system hangs, displaying register contents, etc.)

While these methods are relatively useful, they may not be dynamic enough for pinpointing module failure/problems in all situations (consider tricky device driver resource allocation/deallocation, file operations, etc.). In fact, there may be many instances where coders benefit from the ability to perform ordinary application-style debugging on kernel modules. But unfortunately it is not inherently possible to single-step kernel code as in an ordinary application. However, there does exist a tool to help developers in obtaining this functionality. That tool is kgdb.

What Is kgdb?

kgdb is a kernel patch that, once applied, allows for the use of the familiar gdb interface for source-level debugging of a running Linux kernel. The process requires the use of two machines. One machine runs the kernel being debugged while the other runs the gdb session. Communication between the running kernel and gdb transpires via a serial cable connecting the two machines.

The kgdb patch supplies the kernel with a debugging stub. This stub uses the gdb remote serial protocol to communicate with gdb through a serial driver interface (also supplied by the patch). This patch is applied to the kernel on the machine that will run the gdb session (the development machine) where it is recompiled. The newly compiled boot image is then copied to the other machine (the target machine) where it is configured as the bootable kernel. When a reboot into the transferred kernel is complete, the target machine can then be configured to halt and await a remote connection from a gdb session on the development machine. When this connection is established, the target machine's kernel can then be debugged (single-stepping, issuing of breakpoints, data examination, etc.) through gdb on the development machine as if it were a user-space application.

Configuring kgdb

The first step is to download the kgdb patch for your kernel version. A patch can be obtained at http://kgdb.sourceforge.net/. As of this writing, patches only exist for the following kernels:

  • 2.4.0-test9

  • 2.4.0-test4 (kernel used for this article)

  • 2.4.0-test1

  • 2.3.99-pre6

  • 2.2.17

  • 2.2.12

Once you have obtained the patch, copy the patch to the kernel source directory on the development machine, and apply.

patch  -p1 < patchfile
(remember, this is the kernel that will eventually turn on the target machine)

The patched kernel must now be recompiled. It is assumed here that /usr/src/linux/.config exists and accurately reflects your current kernel configuration. Navigate to the source directory (if you aren't there already), and do a make menuconfig. From the main menu navigate to and select kernel hacking. You should now see an option for Remote (serial) debugging with gdb. Make sure this option is selected and then exit, saving your configuration. Next, do a make clean followed by a make bzImage (or whatever image you usually make).

The recompile adds a documentation file called gdb-serial.txt to your system. This file can be found in /usr/src/linux/Documentation/i386 and includes a step-by-step description of what needs to transpire next. Basically, here are the highlights.

The newly compiled kernel image (e.g., bzImage) is copied to the target machine where it is configured for boot. For example, the image may be copied to /boot/vmlinuz-target (or whatever you want to call it) followed by an added entry in lilo.conf:

image= /boot/vmlinuz-target
     label=target_kernel
     read-only
     root=/dev/hda1

Next, run LILO at the command line, and reboot into the new kernel.

  • On the development machine, navigate to /usr/src/linux/arch/i386/kernel. Here you will find an executable called gdbstart. Copy this program to the target machine. gdbstart is responsible for configuring the target machine's serial port (from user space) for communication with gdb on the development machine. The program then calls a process ioctl that activates the serial driver interface to the debugging stub. This driver effectively halts the target system until gdb on the development machine issues a continuance to resume execution.

  • Next, decide which serial port (i.e., ttyS0 or ttyS1) is to be used as well as a baud rate for communication (e.g., /dev/ttyS0 with a baud rate of 38,400).

  • Connect the two machines with a null modem serial cable. Be sure to connect the cable to the serial ports you have designated in the above step.

  • Run the gdbstart program on the target machine with the following parameters (or whatever port and data rate you decide upon):

gdbstart  - s 38400 - t /dev/ttyS0
The program will execute and pause, awaiting a remote connection from the development machine.

Alternatively, the documentation suggests creating a script on the target machine to deliberately call gdbstart with user-defined parameters.

  • The documentation next instructs you to create a .gdbinit file in /usr/src/linux on the development machine. Included in this file is a macro (called rmt) that is used to supply gdb with the information it needs to initiate the remote protocol. Edit this information to reflect the com port and data rate you have designated for communication between gdb and the target machine.

  • Now, navigate to /usr/src/linux on the development machine, and run gdb vmlinux. Once you receive a gdb prompt enter rmt, which informs gdb that it is connecting to a remote target (via the serial port and data rate specified in the .gdbinit file).

  • You should now see something that resembles Listing 1.

Listing 1. gdb Connecting to a Remote Target

You can now issue step commands, set breakpoints, etc. Issuing a continue to gdb will return the target kernel to a running state. The kernel will continue to run until it encounters a defined breakpoint, an interrupt, a signal, a segment violation, etc., at which point control is returned to gdb on the development machine.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState