Remote Debugging of Loadable Kernel Modules with kgdb: a Knowledge-based Article for Getting Started

Lamphere describes a straightforward technique that allows kernel debugging in the safety of user space.

gdb Considerations

We are almost ready to begin debugging a module. But before we proceed, we have to address one more potential issue. In gdb we will use the add-symbol-file command to inform the debugger of the memory location of our module's object code on the target machine. However, gdb 5.0 and earlier versions have problems correctly calculating addresses with the add-symbol-file command (the problem centers on a module's global variables). The problem has been corrected in developmental versions of gdb. It is therefore recommended that you use a developmental version of gdb to debug modules with kgdb. The gdb version used for the remainder of this article is a developmental gdb built for Red Hat 6.2. For more information regarding this issue, visit http://kgdb.sourceforge.net/.
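If you must work with an affected gdb, one possible workaround (depending on your gdb build) is to hand add-symbol-file the module's data and BSS section addresses explicitly with its -s option. The section addresses below are placeholders only; you would read the real ones from the module's load map:

add-symbol-file /root/simple.o 0xc480004c -s .data 0xc4800400 -s .bss 0xc4800600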

Final Preparations

The last step we must account for is configuring our module to communicate with the remote gdb session. This is a relatively simple task. We now know that when gdb on the development machine makes initial contact with the target machine, the gdbserial interface issues a call to the debugging stub's set_debug_traps() function. This function, as you recall, hands all exception handling for the target kernel over to gdb. The serial interface then issues a call to the stub's breakpoint() function, which turns control over to gdb. At this point we can tell the remote kernel to resume normal operations by issuing a continue command from gdb.
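To make that handshake concrete, the sketch below shows conceptually what the serial interface does when the debugger attaches. The wrapper function name is made up for illustration; set_debug_traps() and breakpoint() are the gdb stub entry points described above:

/* Conceptual sketch only -- not verbatim kernel source. */
extern void set_debug_traps(void);  /* stub takes over kernel exception handling */
extern void breakpoint(void);       /* hard-coded trap that stops in gdb */

static void gdb_serial_attach(void) /* hypothetical name for the gdbserial hook */
{
        set_debug_traps();          /* exceptions now report to the remote gdb session */
        breakpoint();               /* halt here until gdb issues a continue command */
}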

With the target kernel now configured to return control to the remote gdb session whenever an exception is triggered, we can modify our module code to guarantee that such an event will arise. Consider the following modifications made to the simple module in Listing 6.

Listing 6. Modified Sample Module to Include Further Remote Control
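(The original listing is not reproduced here; the following is a minimal sketch of what such a module might look like, assuming a BREAKPOINT macro that expands to an x86 int3 trap. The function bodies and message strings are illustrative.)

/* simple.c -- illustrative sketch, not the original listing */
#define MODULE
#define __KERNEL__
#include <linux/module.h>
#include <linux/kernel.h>

/* Hand control to the remote gdb session by raising a breakpoint trap (x86). */
#define BREAKPOINT() asm("   int $3")

int init_module(void)
{
        BREAKPOINT();               /* stop here when the module is loaded */
        printk("<1>simple: module loaded\n");
        return 0;
}

void cleanup_module(void)
{
        BREAKPOINT();               /* stop here when the module is unloaded */
        printk("<1>simple: module unloaded\n");
}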

As you can see by the code in Listing 6, we have added a breakpoint interrupt that is triggered (in this example) both when the module is loaded and when it is unloaded from the kernel. These interrupts return control to the remote gdb session, halting execution of the module at those points. Let's try it out.

  • On the development machine, recompile the module after adding the BREAKPOINT code:

gcc -c -O2 -g simple.c
  • Copy the newly compiled object code to the target machine.

  • Initiate a remote debug session on the target machine by running the gdbstart program.

  • On the development machine, navigate to /usr/src/linux and run gdb vmlinux. Remember to use a developmental version of the debugger as described in the previous section.

  • Once prompted in gdb, type rmt to initiate contact with the remote machine.

  • With the previously recorded hex address, use add-symbol-file to tell gdb the module's location in memory:

add-symbol-file /root/simple.o 0xc480004c

You may be wondering at this point how we can assign this address space when the module is not currently loaded. A better question may be, "How do we know that the kernel will load the module into the exact same location in memory?" We can make this assumption because the kernel very often loads the object code into the same memory segment as before. While this is not written in stone, it happens frequently enough to make this method quite reliable.

  • Return control to the remote kernel by issuing a continue command from gdb.

  • On the target machine, install the module:

insmod simple.o
When insmod invokes the init_module function, our breakpoint interrupt is called and control returns to gdb. This allows us to step through the remainder of init_module as if it were a user-space application, as shown in Listing 7.

Listing 7. User Space gdb Dialogue Running Modified Simple Module
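(Listing 7 is likewise not reproduced here; an exchange along these lines is what you can expect to see. The address, line numbers, and source text are illustrative only.)

(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0xc4800054 in init_module () at simple.c:13
13              printk("<1>simple: module loaded\n");
(gdb) next
14              return 0;
(gdb) c
Continuing.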

Note that this sequence shows the module's load process being stopped. If functionality other than initialization were to be debugged (e.g., the file operation functions open, read, write, close, and ioctl, or driver resource allocation and deallocation), the module could be installed with the insmod -m parameter again to verify the memory placement of the object code.

  • Return control to the remote kernel from gdb (i.e., continue).

  • On the target machine, remove the simple module:

rmmod simple

This, of course, invokes the cleanup_module function, which in turn hits another of our breakpoints, returning control to gdb on the development machine:

(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0xc4800068 in cleanup_module () at simple.c:38
38      } // end function init_module
(gdb)
  • Return control to the remote kernel and stop the debug session.

Although this simple example does not actually accomplish much (except stepping around the printk function), it does illustrate how we can halt execution of the module for debugging via kgdb. One can only imagine the benefits of using such a method in lieu of traditional debugging methods (printks, Oops analysis, etc.), especially where very problematic module code is concerned. Of course, no debugging method is perfect for all situations, nor is it a replacement for writing good code. For instance, single-stepping through a device driver on a real-time system that depends on precise timing may not make this the best method for the job, but it does have useful applications and seems to have been well received by the kernel developer community.
