Modifying a Dynamic Library Without Changing the Source Code

Placing your own code between a program and the libraries it is linked against is easy when you use the LD_PRELOAD environment variable.


Sometimes, you might want to determine what is happening in a shared
library without modifying the library (have you tried to build
glibc lately?). Other times, you might want to override only a few
functions within a library and get them to do something else--force
a process to a specific CPU, prevent a specific USB message from being
sent and so on. All of these tasks are possible if you use the LD_PRELOAD
environment variable and a small shim program placed between the
application and the library.

As an example, say you create a shared library object
called shim.so and want it to be loaded
before any other shared library. Say you also want to
run the program "test". These things can be done on the
command line with LD_PRELOAD:


	LD_PRELOAD=/home/greg/test/shim.so ./test

This command tells glibc to load the library shim.so
from the directory /home/greg/test first, before it
loads any other shared libraries the test
program might need.
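
A shim does not have to wrap a library function at all. As a rough sketch of the "force a process to a specific CPU" idea mentioned earlier, a preloaded library can run a constructor before main() starts and restrict the process to one CPU. The helper name pin_to_first_allowed_cpu and the choice of "the first CPU the process is already allowed to use" are assumptions made for illustration; sched_getaffinity() and sched_setaffinity() are the Linux-specific calls doing the work.

```c
#define _GNU_SOURCE   /* for cpu_set_t and the CPU_* macros */
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: restrict the caller to the first CPU it is
 * currently allowed to run on. Returns that CPU number, or -1 on
 * failure. */
int pin_to_first_allowed_cpu(void)
{
  cpu_set_t allowed, wanted;
  int cpu;

  /* pid 0 means "the calling thread" */
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
    return -1;

  for (cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &allowed)) {
      CPU_ZERO(&wanted);
      CPU_SET(cpu, &wanted);
      if (sched_setaffinity(0, sizeof(wanted), &wanted) != 0)
        return -1;
      return cpu;
    }
  }
  return -1;
}

/* Runs when the dynamic linker loads the shim, before main().
 * Threads created later inherit the affinity set here. */
__attribute__((constructor))
static void shim_init(void)
{
  int cpu = pin_to_first_allowed_cpu();

  if (cpu < 0)
    fputs("shim: could not set CPU affinity\n", stderr);
  else
    fprintf(stderr, "shim: pinned to CPU %d\n", cpu);
}
```

Built as a shared object and preloaded with the same kind of LD_PRELOAD line, this would pin the test program without touching its source. A real shim would probably take the CPU number from an environment variable rather than picking the first one.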

It is quite easy to create a shim library, thanks to
some help from an old e-mail sent by the kernel developer
Anton Blanchard. His e-mail provides an example of how to
do exactly this.

As an example, let us log some information about how
the libusb library is being called by another
program. libusb is a library used by
programs to access USB devices from userspace,
eliminating the need to create a kernel driver for
all USB devices.

Starting with the usb_open() function provided by
libusb, the following small bit of code is used
to log all accesses to this function:


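#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>  /* dlopen(), dlsym(), dlerror() */
#include <usb.h>    /* libusb: usb_dev_handle, struct usb_device */

usb_dev_handle *usb_open(struct usb_device *dev)
{
  /* pointer to the real usb_open(), looked up on the first call */
  static usb_dev_handle *(*libusb_open)
             (struct usb_device *dev) = NULL;
  void *handle;
  usb_dev_handle *usb_handle;
  char *error;

  if (!libusb_open) {
    handle = dlopen("/usr/lib/libusb.so",
                    RTLD_LAZY);
    if (!handle) {
      fputs(dlerror(), stderr);
      exit(1);
    }
    libusb_open = dlsym(handle, "usb_open");
    if ((error = dlerror()) != NULL) {
      fprintf(stderr, "%s\n", error);
      exit(1);
    }
  }

  /* log the call, hand it to the real function, log the result */
  printf("calling usb_open(%s)\n", dev->filename);
  usb_handle = libusb_open(dev);
  printf("usb_open() returned %p\n", (void *)usb_handle);
  return usb_handle;
}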

To compile this code, run GCC with the following
options:


 gcc -Wall -O2 -fpic -shared -o shim.so shim.c -ldl

This command creates a shared library called shim.so from
the shim.c source code file. If the test program is then
run with the previous LD_PRELOAD line, any calls it makes to
usb_open() go to our library function instead. The code within our
function tries to load the real libusb library
with a call to the function dlopen:


    handle = dlopen("/usr/lib/libusb.so",
                    RTLD_LAZY);

The option RTLD_LAZY is passed here, because I do not want
the loader to resolve all symbols at this point in
time. I want them resolved only when the shim code asks for
them to be resolved.

If that function succeeds, the code then asks for a
pointer to the real usb_open function with a call to
dlsym:


    libusb_open = dlsym(handle, "usb_open");

If that function succeeds, the shim now has a pointer to
the real libusb function. It can call the real function whenever
it wants to, after logging some information to the
screen:


  printf("calling usb_open(%s)\n", dev->filename);
  usb_handle = libusb_open(dev);

This also allows the code to do something
after the library function has been called and
before control is returned to the original program.
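
The wrapper hard-codes /usr/lib/libusb.so, and the library may live elsewhere on other systems. One common alternative, not used in the code above, is to ask dlsym() for the next definition of the symbol in the search order through the pseudo-handle RTLD_NEXT. The sketch below applies that to getpid() from the C library purely as a stand-in, so it can be tried without libusb; the counter shim_getpid_calls is an invented name.

```c
#define _GNU_SOURCE   /* exposes RTLD_NEXT in <dlfcn.h> */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative counter, not part of any library. */
int shim_getpid_calls = 0;

/* Same shape as the usb_open() wrapper, but the real function is
 * found with RTLD_NEXT instead of dlopen() on a fixed path. */
pid_t getpid(void)
{
  static pid_t (*real_getpid)(void) = NULL;
  pid_t pid;

  if (!real_getpid) {
    /* next "getpid" after this object in the lookup order */
    real_getpid = (pid_t (*)(void))dlsym(RTLD_NEXT, "getpid");
    if (!real_getpid) {
      const char *msg = dlerror();
      fprintf(stderr, "shim: %s\n", msg ? msg : "getpid not found");
      exit(1);
    }
  }

  ++shim_getpid_calls;
  pid = real_getpid();
  /* stderr keeps the trace out of the program's normal output */
  fprintf(stderr, "getpid() returned %ld\n", (long)pid);
  return pid;
}
```

Whether this suits a given program depends on the real library being loaded through normal dynamic linking; a fixed dlopen() path, as in the usb_open() wrapper, does not rely on that.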

An example of running a program with this shim in
place might produce the following output:


 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061100
 calling usb_open(002)
 usb_open() returned 0x8061120
 calling usb_open(002)
 usb_open() returned 0x8061120

To log a more complex function, such as
usb_control_msg, the same approach is used
as for usb_open:


int usb_control_msg(usb_dev_handle *dev, 
                    int requesttype,
                    int request,
                    int value,
                    int index,
                    char *bytes,
                    int size,
                    int timeout)
{
  static int(*libusb_control_msg)
            (usb_dev_handle *dev,
             int requesttype, int request,
             int value, int index, char *bytes,
             int size, int timeout) = NULL;
  void *handle;
  int ret, i;
  char *error;

  if (!libusb_control_msg) {
    handle = dlopen("/usr/lib/libusb.so", RTLD_LAZY);
    if (!handle) {
      fputs(dlerror(), stderr);
      exit(1);
    }
    libusb_control_msg = dlsym(handle, "usb_control_msg");
    if ((error = dlerror()) != NULL) {
      fprintf(stderr, "%s\n", error);
      exit(1);
    }
  }

  printf("calling usb_control_msg(%p, %04x, "
         "%04x, %04x, %04x, %p, %d, %d)\n"
         "\tbytes = ", dev, requesttype, 
         request, value, index, bytes, size,
         timeout);
  for (i = 0; i < size; ++i)
    printf ("%02x ", (unsigned char)bytes[i]);
  printf("\n");

  ret = libusb_control_msg(dev, requesttype, 
                           request, value, 
                           index, bytes, size,
                           timeout);

  printf("usb_control_msg(%p) returned %d\n"
         "\tbytes = ", dev, ret);
  for (i = 0; i < size; ++i)
    printf ("%02x ", (unsigned char)bytes[i]);
  printf("\n");

  return ret;
}
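
Both wrappers repeat the same dlopen()/dlsym() boilerplate. One possible way to factor it out is a small helper like the following. The name lookup_real is hypothetical, and unlike the wrappers above it returns NULL instead of calling exit(), leaving the decision to the caller. It also calls dlerror() once before dlsym() to clear any stale error, the pattern the dlsym manual page recommends.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open `library` (NULL means the running program
 * and everything already loaded into it) and look up `symbol`.
 * Returns the address, or NULL after printing the dlerror() text. */
void *lookup_real(const char *library, const char *symbol)
{
  void *handle, *addr;
  const char *error;

  handle = dlopen(library, RTLD_LAZY);
  if (!handle) {
    fprintf(stderr, "%s\n", dlerror());
    return NULL;
  }

  dlerror();                      /* clear any stale error */
  addr = dlsym(handle, symbol);
  error = dlerror();              /* non-NULL only if dlsym failed */
  if (error) {
    fprintf(stderr, "%s\n", error);
    dlclose(handle);
    return NULL;
  }

  /* handle is left open on purpose: the library has to stay mapped
   * for as long as the returned pointer is in use */
  return addr;
}
```

Inside usb_open(), the lookup would then shrink to a single call such as lookup_real("/usr/lib/libusb.so", "usb_open"), followed by a check for NULL.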

Running the test program again with the shim
library loaded produces the following output:


usb_open() returned 0x8061100
calling usb_control_msg(0x8061100, 00c0, 0013, 6c7e, c41b, 0x80610a8, 8, 1000)
        bytes = c9 ea e7 73 2a 36 a6 7b 
usb_control_msg(0x8061100) returned 8
        bytes = 81 93 1a c4 85 27 a0 73 
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9381, c413, 0x8061100, 8, 1000)
        bytes = 39 83 1d cc 85 27 a0 73 
usb_control_msg(0x8061120) returned 8
        bytes = 35 15 51 2e 26 52 93 43 
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9389, c413, 0x8061108, 8, 1000)
        bytes = 80 92 1b c6 e3 a3 fa 9d 
usb_control_msg(0x8061120) returned 8
        bytes = 80 92 1b c6 e3 a3 fa 9d 
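
The byte dump appears twice in usb_control_msg(), once before and once after the real call. A small formatting helper, sketched here under the invented name format_hex, would keep the two call sites identical. It would also make it easy to print only the bytes actually transferred (the return value) after the call instead of the full buffer size, which is an assumption about what is most useful to log.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write `count` bytes of `bytes` into `out` as
 * two-digit hex values separated by spaces, e.g. "c9 ea e7".
 * Stops early if the next byte would not fit in `out_size`.
 * Returns the number of bytes formatted. */
int format_hex(char *out, size_t out_size, const char *bytes, int count)
{
  size_t used = 0;
  int i;

  if (out_size == 0)
    return 0;
  out[0] = '\0';

  for (i = 0; i < count; ++i) {
    /* each byte needs up to 3 characters plus the terminating NUL */
    if (used + 4 > out_size)
      break;
    used += (size_t)snprintf(out + used, out_size - used, "%s%02x",
                             i ? " " : "", (unsigned char)bytes[i]);
  }
  return i;
}
```

In the wrapper, the second dump could then format `ret` bytes rather than `size` bytes; a negative `ret` simply produces an empty string.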

Using the LD_PRELOAD environment variable, it is
easy to place your own code between a program
and the libraries it is linked against.

Greg Kroah-Hartman currently is the Linux kernel maintainer for a
variety of different driver subsystems. He works
for IBM, doing Linux kernel-related things, and can
be reached at greg@kroah.com.

______________________

Comments

Can you post that code

Flowers

Can you post that code perchance?

A little thing you forgot

grib

You wrote:
This command tells glibc...

Hmm, let's see:

$ man ld.so
NAME
ld.so/ld-linux.so - dynamic linker/loader
...
ENVIRONMENT
...
LD_PRELOAD
A whitespace-separated list of additional, user-specified, ELF
shared libraries to be loaded before all others. This can be
used to selectively override functions in other shared
libraries. For setuid/setgid ELF binaries, only libraries in
the standard search directories that are also setgid will be
loaded.
...
$

So, LD_PRELOAD affects the dynamic linker rather than glibc.

It's possible to override

promag

Is it possible to override dlopen and dlsym?

Yes

i3839

Yes, it is, but you need to do extra work if you want to call the real ones yourself (there are special linking options which can achieve this).

poor man's AOP ?

kevin bedell

This strikes me as describing a fundamental way to implement AOP on top of the Linux kernel.

Has anyone heard of people taking this idea further and actually trying to build an AOP implementation?

re: force a process to a specific CPU

Anonymous

Ahh, the teaser promises binding to a single CPU -- can you post that code perchance?

Thanks!

Nice example. A trick that be

bildrulle

Nice example. A trick that can be useful for many things. But this particular example could just as easily have been achieved by running the ltrace command.

Re: Nice example. A trick that be

Anonymous

Ltrace is nice, but the LD_PRELOAD shim can do more things with the shimmed function(s). For example, it might only print out the trace message when certain conditions are met in the parameters. Or, every call to the shimmed function could scan target library internal data structures for corruption. Etc.

thats one of the problems wit

gfdsa

That's one of the problems with open source today: it has become "too much". For everything you want to do, there are a few ways.

no problem :)

Anonymous

It's not a problem to have a lot of options. It's flexibility and, last but not least, it's freedom.

It's an illusion of freedom

Anonymous

It's an illusion of freedom

I know you

Anonymous

Osama, is that you?
