Modifying a Dynamic Library Without Changing the Source Code

by Greg Kroah-Hartman

on November 2, 2004

Sometimes, you might want to determine what is happening in a shared library without modifying the library (have you tried to build glibc lately?). Other times, you might want to override only a few functions within a library and get them to do something else--force a process to a specific CPU, prevent a specific USB message from being sent and so on. All of these tasks are possible if you use the LD_PRELOAD environment variable and a small shim program placed between the application and the library.

As an example, say you create a shared library object called shim.so and want it to be loaded before any other shared library. Say you also want to run the program "test". These things can be done on the command line with LD_PRELOAD:


	LD_PRELOAD=/home/greg/test/shim.so ./test

This command tells the dynamic linker, which ships as part of glibc, to load shim.so from the directory /home/greg/test before any other shared library that the test program needs.
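One way to confirm that a preloaded library really has been mapped into the process is to have it announce itself when it loads. The following is only a sketch and is not part of the shim developed in this article; it assumes GCC's constructor function attribute, and the names shim_loaded and shim_init are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag: set once the constructor below has run. */
int shim_loaded = 0;

/* GCC extension: the loader runs this function when the object
 * containing it is loaded, before main() starts. */
__attribute__((constructor))
static void shim_init(void)
{
  const char *preload = getenv("LD_PRELOAD");

  shim_loaded = 1;
  fprintf(stderr, "shim loaded (LD_PRELOAD=%s)\n",
          preload ? preload : "(unset)");
}
```

If this function is compiled into a preloaded library, the message should appear on stderr before the program prints any output of its own.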

It is quite easy to create a shim library, thanks to some help from an old e-mail sent by the kernel developer Anton Blanchard. His e-mail provides an example of how to do exactly this.

As an example, let us log some information about how the libusb library is being called by another program. libusb is a library used by programs to access USB devices from userspace, eliminating the need to create a kernel driver for all USB devices.

Starting with the usb_open() function provided by libusb, the following small bit of code is used to log all accesses to this function:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <usb.h>

usb_dev_handle *usb_open(struct usb_device *dev)
{
  static usb_dev_handle *(*libusb_open)
             (struct usb_device *dev) = NULL;
  void *handle;
  usb_dev_handle *usb_handle;
  char *error;

  /* Look up the real usb_open() only on the first call. */
  if (!libusb_open) {
    handle = dlopen("/usr/lib/libusb.so",
                    RTLD_LAZY);
    if (!handle) {
      fprintf(stderr, "%s\n", dlerror());
      exit(1);
    }
    dlerror();  /* clear any earlier error before calling dlsym() */
    libusb_open = dlsym(handle, "usb_open");
    if ((error = dlerror()) != NULL) {
      fprintf(stderr, "%s\n", error);
      exit(1);
    }
  }

  /* Log the call, forward it to libusb and log the result. */
  printf("calling usb_open(%s)\n", dev->filename);
  usb_handle = libusb_open(dev);
  printf("usb_open() returned %p\n", usb_handle);
  return usb_handle;
}

To compile this code, run GCC with the following options:


 gcc -Wall -O2 -fpic -shared -o shim.so shim.c -ldl

This command creates a shared library called shim.so from the shim.c source code file. Then, if the test program is run with the previous LD_PRELOAD line, any calls to the function usb_open() in the test program call our library function instead. The code within our function tries to load the real libusb library with a call to the function dlopen:


    handle = dlopen("/usr/lib/libusb.so",
                    RTLD_LAZY);

The option RTLD_LAZY is passed here, because I do not want the loader to resolve all symbols at this point in time. I want them resolved only when the shim code asks for them to be resolved.

If that function succeeds, the code then asks for a pointer to the real usb_open function with a call to dlsym:


    libusb_open = dlsym(handle, "usb_open");

If that function succeeds, the shim now has a pointer to the real libusb function. It can call the real function whenever it wants to, after logging some information to the screen:


  printf("calling usb_open(%s)\n", dev->filename);
  usb_handle = libusb_open(dev);

This also allows the code to do something after the library function has been called and before control is returned to the original program.

An example of running a program with this shim in place might produce the following output:


 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061120
 calling usb_open(002)
 usb_open() returned 0x8061120
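The shim above hard-codes the path /usr/lib/libusb.so, and that path can differ between systems. A common alternative, which is not the approach used in this article, is to pass the glibc extension RTLD_NEXT to dlsym() so that the dynamic linker returns the next definition of the symbol in its search order. The sketch below wraps getpid() instead of a libusb function so that it can be tried without any USB hardware; the choice of function and the log messages are assumptions made purely for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* must precede every #include to expose RTLD_NEXT */
#endif
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Replacement getpid(): logs the call, then forwards it to the
 * next definition of getpid() that the dynamic linker can find. */
pid_t getpid(void)
{
  static pid_t (*real_getpid)(void) = NULL;
  pid_t pid;

  if (!real_getpid) {
    const char *error;

    /* RTLD_NEXT skips this object and searches those after it. */
    *(void **)(&real_getpid) = dlsym(RTLD_NEXT, "getpid");
    if (!real_getpid) {
      error = dlerror();
      fprintf(stderr, "dlsym(getpid) failed: %s\n",
              error ? error : "symbol not found");
      exit(1);
    }
  }

  fprintf(stderr, "calling getpid()\n");
  pid = real_getpid();
  fprintf(stderr, "getpid() returned %d\n", (int)pid);
  return pid;
}
```

Because no library path appears in the code, the same wrapper should keep working if the real library is installed somewhere else, as long as the program actually links against it.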

To log a more complex function, such as usb_control_msg, the same thing needs to be done as was done for usb_open:


int usb_control_msg(usb_dev_handle *dev, 
                    int requesttype,
                    int request,
                    int value,
                    int index,
                    char *bytes,
                    int size,
                    int timeout)
{
  static int(*libusb_control_msg)
            (usb_dev_handle *dev,
             int requesttype, int request,
             int value, int index, char *bytes,
             int size, int timeout) = NULL;
  void *handle;
  int ret, i;
  char *error;

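  /* Look up the real usb_control_msg() only on the first call. */
  if (!libusb_control_msg) {
    handle = dlopen("/usr/lib/libusb.so", RTLD_LAZY);
    if (!handle) {
      fprintf(stderr, "%s\n", dlerror());
      exit(1);
    }
    dlerror();  /* clear any earlier error before calling dlsym() */
    libusb_control_msg = dlsym(handle, "usb_control_msg");
    if ((error = dlerror()) != NULL) {
      fprintf(stderr, "%s\n", error);
      exit(1);
    }
  }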

  printf("calling usb_control_msg(%p, %04x, "
         "%04x, %04x, %04x, %p, %d, %d)\n"
         "\tbytes = ", dev, requesttype, 
         request, value, index, bytes, size,
         timeout);
  for (i = 0; i < size; ++i)
    printf ("%02x ", (unsigned char)bytes[i]);
  printf("\n");

  ret = libusb_control_msg(dev, requesttype, 
                           request, value, 
                           index, bytes, size,
                           timeout);

  printf("usb_control_msg(%p) returned %d\n"
         "\tbytes = ", dev, ret);
  for (i = 0; i < size; ++i)
    printf ("%02x ", (unsigned char)bytes[i]);
  printf("\n");

  return ret;
}
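Both wrappers repeat the same dlopen() and dlsym() sequence. A minimal sketch of how that lookup could be factored out follows; the helper name load_real_symbol is hypothetical and is not part of the original shim.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: opens `library` and returns the address of
 * `symbol`, exiting with a message if either step fails. If `library`
 * is NULL, dlopen() returns a handle for the main program itself. */
void *load_real_symbol(const char *library, const char *symbol)
{
  void *handle;
  void *address;
  const char *error;

  handle = dlopen(library, RTLD_LAZY);
  if (!handle) {
    fprintf(stderr, "%s\n", dlerror());
    exit(1);
  }

  dlerror();  /* clear any earlier error before calling dlsym() */
  address = dlsym(handle, symbol);
  error = dlerror();
  if (error != NULL) {
    fprintf(stderr, "%s\n", error);
    exit(1);
  }
  return address;
}
```

With such a helper, the lookup inside each wrapper could shrink to a single assignment, for example libusb_control_msg = load_real_symbol("/usr/lib/libusb.so", "usb_control_msg").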

Running the test program again with the shim library loaded produces the following output:


usb_open() returned 0x8061100
calling usb_control_msg(0x8061100, 00c0, 0013, 6c7e, c41b, 0x80610a8, 8, 1000)
        bytes = c9 ea e7 73 2a 36 a6 7b 
usb_control_msg(0x8061100) returned 8
        bytes = 81 93 1a c4 85 27 a0 73 
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9381, c413, 0x8061100, 8, 1000)
        bytes = 39 83 1d cc 85 27 a0 73 
usb_control_msg(0x8061120) returned 8
        bytes = 35 15 51 2e 26 52 93 43 
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9389, c413, 0x8061108, 8, 1000)
        bytes = 80 92 1b c6 e3 a3 fa 9d 
usb_control_msg(0x8061120) returned 8
        bytes = 80 92 1b c6 e3 a3 fa 9d
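The byte dumps in this output come from a loop that appears twice in usb_control_msg(). As a sketch, that formatting could live in one helper that writes into a buffer supplied by the caller; format_bytes is a hypothetical name, not something provided by libusb or by the original shim.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes each of the `size` bytes as two hex
 * digits followed by a space. Returns the number of characters
 * written, or -1 if the output buffer is too small. */
int format_bytes(char *out, size_t outlen, const char *bytes, int size)
{
  size_t used = 0;
  int i;

  if (outlen == 0)
    return -1;
  out[0] = '\0';

  for (i = 0; i < size; ++i) {
    int n = snprintf(out + used, outlen - used, "%02x ",
                     (unsigned char)bytes[i]);
    if (n < 0 || (size_t)n >= outlen - used)
      return -1;  /* output buffer too small */
    used += (size_t)n;
  }
  return (int)used;
}
```

Formatting the two bytes c9 and ea this way yields the string "c9 ea ", which matches the layout of the dump lines above.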

Using the LD_PRELOAD environment variable, it is easy to place your own code between a program and the libraries it is linked against.

Greg Kroah-Hartman currently is the Linux kernel maintainer for a variety of different driver subsystems. He works for IBM, doing Linux kernel-related things, and can be reached at greg@kroah.com.
