Remote Debugging of Loadable Kernel Modules with kgdb: a Knowledge-based Article for Getting Started

Lamphere describes a straightforward technique that allows kernel debugging in the safety of user space.

As many kernel developers and hackers have known for years, loadable kernel modules (like user-space applications) are almost never bug-free. As the use and development of loadable modules grow, thanks to the obvious benefits of the mechanism (lean kernels, fewer kernel recompiles and reboots, etc.), developers increasingly need robust debugging tools that help identify problem code. Traditionally, module developers have relied on a handful of techniques, including:

  • printk statements around suspected areas of failure (probably the most useful)

  • Oops analysis (also quite useful)

  • Magic Key combinations (for recovery of system hangs, displaying register contents, etc.)

While these methods are useful, they may not be dynamic enough to pinpoint module failures in all situations (consider tricky device driver resource allocation/deallocation, file operations, etc.). In many instances, coders would benefit from the ability to perform ordinary application-style debugging on kernel modules. Unfortunately, kernel code cannot ordinarily be single-stepped the way an application can. There is, however, a tool that gives developers this capability: kgdb.

What Is kgdb?

kgdb is a kernel patch that, once applied, allows for the use of the familiar gdb interface for source-level debugging of a running Linux kernel. The process requires the use of two machines. One machine runs the kernel being debugged while the other runs the gdb session. Communication between the running kernel and gdb transpires via a serial cable connecting the two machines.

The kgdb patch supplies the kernel with a debugging stub. This stub uses the gdb remote serial protocol to communicate with gdb through a serial driver interface (also supplied by the patch). The patch is applied to the kernel source on the machine that will run the gdb session (the development machine), where the kernel is then recompiled. The newly compiled boot image is copied to the other machine (the target machine), where it is configured as the bootable kernel. Once the target machine has rebooted into the transferred kernel, it can be told to halt and await a remote connection from a gdb session on the development machine. When this connection is established, the target machine's kernel can be debugged (single-stepping, setting breakpoints, examining data, etc.) through gdb on the development machine as if it were a user-space application.

Configuring kgdb

The first step is to download the kgdb patch for your kernel version. A patch can be obtained at http://kgdb.sourceforge.net/. As of this writing, patches only exist for the following kernels:

  • 2.4.0-test9

  • 2.4.0-test4 (kernel used for this article)

  • 2.4.0-test1

  • 2.3.99-pre6

  • 2.2.17

  • 2.2.12

Once you have obtained the patch, copy the patch to the kernel source directory on the development machine, and apply.

patch -p1 < patchfile
(remember, this is the kernel that will eventually run on the target machine)
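Before modifying the tree, it can help to confirm that the patch matches your kernel version. The following is a minimal sketch, assuming GNU patch (which provides --dry-run); the helper name apply_patch_checked is hypothetical and not part of kgdb.

```shell
#!/bin/sh
# Sketch only: apply_patch_checked is a hypothetical helper, not part of kgdb.
# Assumes GNU patch, which supports --dry-run.
#
# Example call (paths are placeholders):
#   apply_patch_checked /usr/src/linux /tmp/patchfile
apply_patch_checked() {
    src_dir=$1      # top of the kernel source tree
    patch_file=$2   # absolute path to the downloaded patch

    # Work in a subshell so the caller's working directory is unchanged.
    (
        cd "$src_dir" || exit 1
        # First pass: check that every hunk applies, without touching files.
        if ! patch -p1 --dry-run < "$patch_file" > /dev/null; then
            echo "patch does not apply cleanly to $src_dir" >&2
            exit 1
        fi
        # Second pass: actually modify the tree.
        patch -p1 < "$patch_file"
    )
}
```

If the dry run fails, the source tree is left untouched, which usually means the patch was built for a different kernel version.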

The patched kernel must now be recompiled. It is assumed here that /usr/src/linux/.config exists and accurately reflects your current kernel configuration. Navigate to the source directory (if you aren't there already), and do a make menuconfig. From the main menu navigate to and select kernel hacking. You should now see an option for Remote (serial) debugging with gdb. Make sure this option is selected and then exit, saving your configuration. Next, do a make clean followed by a make bzImage (or whatever image you usually make).
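Since a forgotten option means a wasted rebuild, you may want to check the saved configuration before compiling. This is a minimal sketch; config_option_enabled is a hypothetical helper, and CONFIG_EXAMPLE below is a placeholder, because the symbol name behind the "Remote (serial) debugging with gdb" entry depends on the patch and is not given here.

```shell
#!/bin/sh
# Sketch only: config_option_enabled is a hypothetical helper.
# Succeeds if the given symbol is built in (SYMBOL=y) in a kernel .config.
#
# Example call (CONFIG_EXAMPLE is a placeholder; look up the real symbol
# name in your patched tree):
#   config_option_enabled /usr/src/linux/.config CONFIG_EXAMPLE \
#       && make clean && make bzImage
config_option_enabled() {
    config_file=$1   # path to the kernel .config
    option=$2        # configuration symbol to look for

    # An enabled built-in option appears on its own line as "SYMBOL=y".
    grep -q "^${option}=y\$" "$config_file"
}
```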

The patch adds a documentation file called gdb-serial.txt to your system. This file can be found in /usr/src/linux/Documentation/i386 and includes a step-by-step description of what needs to happen next. Here are the highlights.

The newly compiled kernel image (e.g., bzImage) is copied to the target machine where it is configured for boot. For example, the image may be copied to /boot/vmlinuz-target (or whatever you want to call it) followed by an added entry in lilo.conf:

image=/boot/vmlinuz-target
     label=target_kernel
     read-only
     root=/dev/hda1

Next, run LILO at the command line, and reboot into the new kernel.

  • On the development machine, navigate to /usr/src/linux/arch/i386/kernel. Here you will find an executable called gdbstart. Copy this program to the target machine. gdbstart is responsible for configuring the target machine's serial port (from user space) for communication with gdb on the development machine. The program then makes an ioctl call that activates the serial driver interface to the debugging stub. This driver effectively halts the target system until gdb on the development machine issues a continue command to resume execution.

  • Next, decide which serial port (i.e., ttyS0 or ttyS1) is to be used as well as a baud rate for communication (e.g., /dev/ttyS0 with a baud rate of 38,400).

  • Connect the two machines with a null modem serial cable. Be sure to connect the cable to the serial ports you have designated in the above step.

  • Run the gdbstart program on the target machine with the following parameters (or whatever port and data rate you decide upon):

gdbstart -s 38400 -t /dev/ttyS0
The program will execute and pause, awaiting a remote connection from the development machine.

Alternatively, the documentation suggests creating a script on the target machine that calls gdbstart with user-defined parameters.
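Such a script might look like the following minimal sketch. It is not the script from the documentation: the function name start_kgdb, the environment variables and the default location of gdbstart are all assumptions, and only the -s and -t options shown above are used.

```shell
#!/bin/sh
# Sketch only: start_kgdb and the variables below are hypothetical names.
# Defaults mirror the example in the text; override them in the environment.
KGDB_PORT=${KGDB_PORT:-/dev/ttyS0}   # serial port wired to the development machine
KGDB_BAUD=${KGDB_BAUD:-38400}        # must match the rate used on the gdb side
GDBSTART=${GDBSTART:-./gdbstart}     # assumed location of the copied program

# Example call, run as root on the target machine:
#   KGDB_PORT=/dev/ttyS1 start_kgdb
start_kgdb() {
    # Refuse to continue if the port is not a character device.
    if [ ! -c "$KGDB_PORT" ]; then
        echo "$KGDB_PORT is not a character device" >&2
        return 1
    fi
    echo "Halting for gdb on $KGDB_PORT at $KGDB_BAUD baud" >&2
    # Same invocation as in the text: -s <data rate> -t <serial port>.
    "$GDBSTART" -s "$KGDB_BAUD" -t "$KGDB_PORT"
}
```

Keeping the port and data rate in one place makes it harder for the target and the development machine to drift out of sync.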

  • The documentation next instructs you to create a .gdbinit file in /usr/src/linux on the development machine. Included in this file is a macro (called rmt) that supplies gdb with the information it needs to initiate the remote protocol. Edit this information to reflect the serial port and data rate you have designated for communication between gdb and the target machine.

  • Now, navigate to /usr/src/linux on the development machine, and run gdb vmlinux. Once you receive a gdb prompt enter rmt, which informs gdb that it is connecting to a remote target (via the serial port and data rate specified in the .gdbinit file).

  • You should now see something that resembles Listing 1.

Listing 1. gdb Connecting to a Remote Target

You can now issue step commands, set breakpoints, etc. Issuing a continue to gdb will return the target kernel to a running state. The kernel will continue to run until it encounters a defined breakpoint, an interrupt, a signal, a segment violation, etc., at which point control is returned to gdb on the development machine.
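The rmt macro described in the steps above could be generated along these lines. This is a sketch, not the file shipped with the patch: write_gdbinit is a hypothetical helper, and the macro body is an assumption built from standard gdb commands. Note that "set remotebaud" is the spelling used by older gdb releases; recent ones use "set serial baud" instead.

```shell
#!/bin/sh
# Sketch only: write_gdbinit is a hypothetical helper, and the macro it
# writes may differ from the one described in gdb-serial.txt.
#
# Example call on the development machine:
#   write_gdbinit /usr/src/linux/.gdbinit /dev/ttyS0 38400
write_gdbinit() {
    out_file=$1   # where to write the gdb init file
    port=$2       # serial port on the development machine
    baud=$3       # data rate, matching the one given to gdbstart

    cat > "$out_file" <<EOF
# rmt: connect gdb to the debugging stub on the target over a serial line
define rmt
set remotebaud $baud
target remote $port
end
EOF
}
```

With the file in place, running gdb vmlinux from the same directory and entering rmt at the prompt should start the connection.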
