The Compiler as Attack Vector

Can an attacker build a compromised program from good source code? Yes, if he or she controls the tools. Learn how an attack can happen during the build process.

There could be much debate about what the unique identifier should be and how it should be generated. To avoid a trip to Crypto 101, the identifier is a generic 26-character string. To prevent immediate detection, the identifier is added as a void function that is visible using nm. Its name is __ID_abcdefghijklmnopqrstuvwxyz(). This is added to libc-start.c. After rebuilding glibc and compiling the test program, the value is visible. The value I chose is for demonstration purposes. In reality, the more obscure and legitimate sounding the identifier, the harder it is to detect. My choice for a name in a real scenario would be something like __dl_sym_check_load(). In addition to tagging the binary at build, a token could be inserted that would create a single UDP packet, with the only payload being the IP address of the machine on which it is running. This could be sent to a logging server that could track what binaries are run in what places and where they were built.

One of the more interesting elements of this attack vector is the ability to make good code bad. strcpy is a perfect example of this function, because it has both an unsafe version and a safe one, strncpy, which has an additional argument indicating how much of a string should be copied. Without reviewing how a buffer overflow works, strcpy is far more desirable to an attacker than its bounds-checking big brother. This is a relatively simple change that should not attract too much attention, unless the program is stepped through with a debugger. In the directory <glibc-base>/sysdeps/generic, there are two files, strcpy.c and strncpy.c. Comment out everything strncpy does and replace it with return strcpy(s1,s2);.

Using GDB, you can verify that this actually works by writing a snippet of code that uses strncpy, and then single stepping through it. An easier way to verify this is to copy a large string into a small buffer and wait for a crash like the one shown in Listing 6.

Depending on the function of the code, it may be useful only if it is undiscovered. To help keep it a secret, adding conditional execution code is useful. This means the added code remains dormant if a certain set of circumstances are not met. An example of this is to check whether the binary is built with debug options and, if so, do nothing. This helps keep the chances of discovery low, because a release application might not get the same scrutiny as a debug application.

Defense and Wrap-Up

Now that the whats and the hows of this vector have been explored, the time has come to discuss ways to discover and stop these sorts of attacks. The short answer is that there is no good way. Attacks of this sort are not aimed at compromising a single box but rather at dispersing trojaned code to the end user. The examples shown thus far have been trivial and are intended to help people grasp the concepts of the attack. However, without much effort, truly dangerous things could emerge. Some examples are modifying gpg to capture passphrases and public keys, changing sshd to create copies of private keys used for authentication, or even modifying the login process to report user name and passwords to a third-party source. Defending against these types of attacks requires diligent use of host-based intrusion-detection systems to find modified system libraries. Closer inspection at build time also must play a crucial role. As you may have discovered looking at the examples above, most of the changes will be made blatantly obvious in a debugger or by using tools like binutils to inspect the final binary.

One more concrete method of defense involves profiling all functions occurring before and after main executes. In theory, the same versions of glibc on the same machine should behave identically. A tool that keeps a known safe state of this behavior and checks newly built binaries will be able to detect many of these changes. Of course, if attackers knew a tool like that existed, they would try to evade it using code that would not execute in a debugger environment. The most important bit of knowledge to take away from this article is not the internal workings of glibc and GCC or how unknown modifications can affect a program without alerting the developer or the end user. The most important thing is that, in this day and age, anything can be used as a tool to undermine security—even the most trustworthy staples of standard computing.

Resources for this article:

David Maynor is a research engineer with the ISS Xforce R&D team. He spends his day thinking of new ways to break things before the bad guys do. He can be reached at


Free Dummies Books
Continuous Engineering


  • What continuous engineering is
  • How to continuously improve complex product designs
  • How to anticipate and respond to markets and clients
  • How to get the most out of your engineering resources

Get your free book now

Sponsored by IBM

Free Dummies Books
Service Virtualization

Learn to:

  • Define service virtualization
  • Select the most beneficial services to virtualize
  • Improve your traditional approach to testing
  • Deliver higher-quality software faster

Get your free book now

Sponsored by IBM