Modifying a Dynamic Library Without Changing the Source Code
Sometimes, you might want to determine what is happening in a shared library without modifying the library (have you tried to build glibc lately?). Other times, you might want to override only a few functions within a library and get them to do something else: force a process to a specific CPU, prevent a specific USB message from being sent, and so on. All of these tasks are possible if you use the LD_PRELOAD environment variable and a small shim library placed between the application and the library.
As an example, say you create a shared library object called shim.so and want it to be loaded before any other shared library. Say you also want to run the program "test". These things can be done on the command line with LD_PRELOAD:
LD_PRELOAD=/home/greg/test/shim.so ./test
This command tells glibc's dynamic linker (ld.so) to load the library shim.so from the directory /home/greg/test first, before it loads any other shared libraries the test program might need.
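To see the preload mechanism in isolation, without any USB hardware, the following minimal sketch replaces the C library's rand() outright instead of forwarding to it. The file name fixed_rand.c and the constant 4 are assumptions made purely for illustration; they do not come from the article.

```c
/* fixed_rand.c: hypothetical example, not part of the article's shim.c */
#include <stdlib.h>

/*
 * Same prototype as the C library's rand(). When this object is
 * preloaded, the dynamic linker finds this definition before the one
 * in libc, so the program's calls to rand() land here.
 */
int rand(void)
{
    return 4; /* arbitrary constant chosen for the demonstration */
}
```

Built with something like gcc -fpic -shared -o fixed_rand.so fixed_rand.c and run as LD_PRELOAD=./fixed_rand.so ./test, a dynamically linked test program that calls rand() should see the constant on every call.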
It is quite easy to create a shim library, thanks to some help from an old e-mail sent by the kernel developer Anton Blanchard. His e-mail provides an example of how to do exactly this.
As an example, let us log some information about how the libusb library is being called by another program. libusb is a library used by programs to access USB devices from userspace, eliminating the need to create a kernel driver for all USB devices.
Starting with the usb_open() function provided by libusb, the following small bit of code is used to log all accesses to this function:
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <usb.h>    /* libusb-0.1 header */

usb_dev_handle *usb_open(struct usb_device *dev)
{
    /* pointer to the real usb_open(), resolved on the first call */
    static usb_dev_handle *(*libusb_open)
        (struct usb_device *dev) = NULL;
    void *handle;
    usb_dev_handle *usb_handle;
    char *error;

    if (!libusb_open) {
        handle = dlopen("/usr/lib/libusb.so", RTLD_LAZY);
        if (!handle) {
            fputs(dlerror(), stderr);
            exit(1);
        }
        libusb_open = dlsym(handle, "usb_open");
        if ((error = dlerror()) != NULL) {
            fprintf(stderr, "%s\n", error);
            exit(1);
        }
    }

    /* log, call the real function, then log its result */
    printf("calling usb_open(%s)\n", dev->filename);
    usb_handle = libusb_open(dev);
    printf("usb_open() returned %p\n", usb_handle);
    return usb_handle;
}
To compile this code, run GCC with the following options:
gcc -Wall -O2 -fpic -shared -o shim.so shim.c -ldl
This command creates a shared library called shim.so from the shim.c source code file. If the test program is then run with the previous LD_PRELOAD line, any calls to usb_open() in the test program reach our library function instead. The code within our function tries to load the real libusb library with a call to the function dlopen:
handle = dlopen("/usr/lib/libusb.so", RTLD_LAZY);
The option RTLD_LAZY is passed here, because I do not want the loader to resolve all symbols at this point in time. I want them resolved only when the shim code asks for them to be resolved.
If that function succeeds, the code then asks for a pointer to the real usb_open function with a call to dlsym:
libusb_open = dlsym(handle, "usb_open");
If that function succeeds, the shim now has a pointer to the real libusb function. It can call the real function whenever it wants to, after logging some information to the screen:
printf("calling usb_open(%s)\n", dev->filename);
usb_handle = libusb_open(dev);
This also allows the code to do something after the library function has been called and before control is returned to the original program.
An example of running a program with this shim in place might produce the following output:
calling usb_open(002)
usb_open() returned 0x8061100
calling usb_open(002)
usb_open() returned 0x8061100
calling usb_open(002)
usb_open() returned 0x8061100
calling usb_open(002)
usb_open() returned 0x8061100
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_open(002)
usb_open() returned 0x8061120
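The shim above hard-codes the path /usr/lib/libusb.so. A commonly used alternative, which the article itself does not use, is to ask dlsym for the next definition of a symbol in load order by passing RTLD_NEXT (available from glibc when _GNU_SOURCE is defined) instead of a dlopen handle. The sketch below applies that idea to the C library's getenv() so it can be tried without libusb; the choice of getenv and the log format are assumptions for illustration.

```c
#define _GNU_SOURCE /* needed for RTLD_NEXT; must precede all includes */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Log every getenv() call, then forward it to the next definition. */
char *getenv(const char *name)
{
    static char *(*real_getenv)(const char *) = NULL;
    char *value;

    if (!real_getenv) {
        dlerror(); /* clear any stale error state */
        /* RTLD_NEXT: search the objects loaded after this one */
        real_getenv = (char *(*)(const char *))dlsym(RTLD_NEXT, "getenv");
        if (!real_getenv) {
            const char *error = dlerror();
            fprintf(stderr, "%s\n", error ? error : "getenv not found");
            exit(1);
        }
    }

    fprintf(stderr, "calling getenv(%s)\n", name);
    value = real_getenv(name);
    fprintf(stderr, "getenv() returned %s\n", value ? value : "(null)");
    return value;
}
```

Because no library path appears in the code, the same shim keeps working if the real library is installed somewhere else. The same pattern would apply to usb_open by substituting its name and prototype.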
To log a more complex function, such as usb_control_msg, the same thing needs to be done as was done for usb_open:
int usb_control_msg(usb_dev_handle *dev,
                    int requesttype,
                    int request,
                    int value,
                    int index,
                    char *bytes,
                    int size,
                    int timeout)
{
    static int (*libusb_control_msg)
        (usb_dev_handle *dev,
         int requesttype, int request,
         int value, int index, char *bytes,
         int size, int timeout) = NULL;
    void *handle;
    int ret, i;
    char *error;

    if (!libusb_control_msg) {
        handle = dlopen("/usr/lib/libusb.so", RTLD_LAZY);
        if (!handle) {
            fputs(dlerror(), stderr);
            exit(1);
        }
        libusb_control_msg = dlsym(handle, "usb_control_msg");
        if ((error = dlerror()) != NULL) {
            fprintf(stderr, "%s\n", error);
            exit(1);
        }
    }

    printf("calling usb_control_msg(%p, %04x, "
           "%04x, %04x, %04x, %p, %d, %d)\n"
           "\tbytes = ", dev, requesttype,
           request, value, index, bytes, size,
           timeout);
    for (i = 0; i < size; ++i)
        printf("%02x ", (unsigned char)bytes[i]);
    printf("\n");

    ret = libusb_control_msg(dev, requesttype,
                             request, value,
                             index, bytes, size,
                             timeout);

    printf("usb_control_msg(%p) returned %d\n"
           "\tbytes = ", dev, ret);
    for (i = 0; i < size; ++i)
        printf("%02x ", (unsigned char)bytes[i]);
    printf("\n");

    return ret;
}
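The shim prints the buffer with the same loop before and after the call. One way to avoid the repetition, sketched here on the assumption that a fixed-size text buffer is acceptable for logging, is a small formatting helper. The name format_hex and its signature are hypothetical and are not part of the original shim.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to `size` bytes as two-digit hex values, each followed by a
 * space, into `out`. Never writes more than `outsz` bytes, including the
 * terminating NUL. Returns the number of input bytes that were formatted.
 */
static int format_hex(char *out, size_t outsz, const char *bytes, int size)
{
    size_t used = 0;
    int i;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < size; ++i) {
        if (outsz - used < 4) /* room for "xx " plus the NUL? */
            break;
        used += (size_t)snprintf(out + used, outsz - used, "%02x ",
                                 (unsigned char)bytes[i]);
    }
    return i;
}
```

In the shim, each loop could then become a call such as format_hex(line, sizeof(line), bytes, size) followed by a single printf of line. Since usb_control_msg returns the number of bytes transferred, it may be more informative to pass ret rather than size after a device-to-host request, when ret is not negative.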
Running the test program again with the shim library loaded produces the following output:
usb_open() returned 0x8061100
calling usb_control_msg(0x8061100, 00c0, 0013, 6c7e, c41b, 0x80610a8, 8, 1000)
bytes = c9 ea e7 73 2a 36 a6 7b
usb_control_msg(0x8061100) returned 8
bytes = 81 93 1a c4 85 27 a0 73
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9381, c413, 0x8061100, 8, 1000)
bytes = 39 83 1d cc 85 27 a0 73
usb_control_msg(0x8061120) returned 8
bytes = 35 15 51 2e 26 52 93 43
calling usb_open(002)
usb_open() returned 0x8061120
calling usb_control_msg(0x8061120, 00c0, 0017, 9389, c413, 0x8061108, 8, 1000)
bytes = 80 92 1b c6 e3 a3 fa 9d
usb_control_msg(0x8061120) returned 8
bytes = 80 92 1b c6 e3 a3 fa 9d
Using the LD_PRELOAD environment variable, it is easy to place your own code between a program and the libraries it is linked against.
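The opening paragraph mentions forcing a process onto a specific CPU. A minimal sketch of how a preloaded library might do that follows. It overrides no function at all; instead it relies on the GCC/Clang constructor attribute to run code when the object is loaded, and on the Linux sched_setaffinity() call. The environment variable SHIM_CPU and all function names are assumptions invented for this example.

```c
#define _GNU_SOURCE /* needed for cpu_set_t and sched_setaffinity() */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

/* Return the lowest-numbered CPU this process may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/*
 * Restrict the calling thread to one CPU. At load time that thread is
 * the whole process. Returns 0 on success, -1 on error.
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Runs when the dynamic linker loads this object, before main(). */
__attribute__((constructor))
static void pin_on_load(void)
{
    const char *wanted = getenv("SHIM_CPU"); /* hypothetical variable */
    int cpu = wanted ? atoi(wanted) : first_allowed_cpu();

    if (pin_to_cpu(cpu) != 0)
        fprintf(stderr, "cpu shim: could not pin to CPU %d\n", cpu);
}
```

Built as a shared object and preloaded, for example with SHIM_CPU=1 LD_PRELOAD=./cpupin.so ./test (file name hypothetical), the target program would start already restricted to that CPU, provided the CPU exists and the process is allowed to use it.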
Greg Kroah-Hartman currently is the Linux kernel maintainer for a variety of different driver subsystems. He works for IBM, doing Linux kernel-related things, and can be reached at greg@kroah.com.
Comments
A little thing you forgot
You wrote:
This command tells glibc...
Hmm, let's see:
$ man ld.so
NAME
ld.so/ld-linux.so - dynamic linker/loader
...
ENVIRONMENT
...
LD_PRELOAD
A whitespace-separated list of additional, user-specified, ELF
shared libraries to be loaded before all others. This can be
used to selectively override functions in other shared
libraries. For setuid/setgid ELF binaries, only libraries in
the standard search directories that are also setgid will be
loaded.
...
$
So, LD_PRELOAD affects the dynamic linker rather than glibc.
It's possible to override
It's possible to override dlopen and dlsym?
Yes
Yes, it is, but you need to do extra work if you want to call the real ones yourself (there are special linking options which can achieve this).
poor man's AOP ?
This strikes me as describing a fundamental way to implement AOP on top of the Linux kernel.
Anyone heard of people taking this idea further and actually trying to build an AOP implementation?
re: force a process to a specific CPU
Ahh, the teaser promises binding to a single CPU -- can you post that code perchance?
Thanks!
Nice example. A trick that be
Nice example. A trick that can be useful for many things. But this particular example could just as easily have been achieved by running the ltrace command.
Re: Nice example. A trick that be
Ltrace is nice, but the LD_PRELOAD shim can do more things with the shimmed function(s). For example, it might only print out the trace message when certain conditions are met in the parameters. Or, every call to the shimmed function could scan target library internal data structures for corruption. Etc.
thats one of the problems wit
That's one of the problems with open source today: it has become "too much". For everything you want to do, there are a few ways.
no problem :)
It's not a problem to have a lot of options. It's flexibility and, last but not least, it's freedom.
It's an illusion of freedom
It's an illusion of freedom
I know you
Osama, is that you?