The Compiler as Attack Vector

Can an attacker build a compromised program from good source code? Yes, if he or she controls the tools. Learn how an attack can happen during the build process.

Now that we have completed a general introduction to GCC and the parts of interest, we can apply the knowledge to attacks. The simplest attack is to add new functionality, invoked by a command-line option. Let's attack libc-start.c, because it is easier to wait for the command-line options to be set up for us than to set them up with our own code.

This type of work should be done on a machine of little importance, so that it can be re-installed when necessary. The version of glibc used here is 2.3.1, built on Mandrake 9.1. The initial build is lengthy, but as long as the build tree isn't cleaned, subsequent compiles should be quick.

The first example makes simple text appear before and after the main body executes. To do this, we modify the library that the compiler links in. The modifications to libc-start.c add a hello and a good-bye message that are displayed as the program runs: stdio.h is added as a header, and two printf statements are added, one before and one after the call to main, as shown in Listing 2. With these changes made, kick off another build of glibc and wait.
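The idea can be illustrated without touching glibc at all. The following is a minimal sketch, not the glibc code: wrapped_start and demo_main are hypothetical names, and the wrapper only imitates the role the startup routine plays by printing a message, calling the program's main and printing a second message.

```c
#include <stdio.h>

/* Signature of a conventional main(), including the environment pointer. */
typedef int (*main_fn)(int argc, char **argv, char **envp);

/*
 * Hypothetical stand-in for the startup routine; this is NOT glibc's code.
 * It prints a message, runs the program's main, prints another message,
 * and hands back main's return value so the exit status is unchanged.
 */
static int wrapped_start(main_fn program_main, int argc, char **argv,
                         char **envp)
{
    int status;

    printf("hello from the startup code\n");    /* runs before main */
    status = program_main(argc, argv, envp);
    printf("good-bye from the startup code\n"); /* runs after main */

    return status;
}

/* Example program body, standing in for the developer's main(). */
static int demo_main(int argc, char **argv, char **envp)
{
    (void)argc;
    (void)argv;
    (void)envp;
    printf("Hello, world\n");
    return 0;
}
```

Calling wrapped_start(demo_main, argc, argv, envp) from a real main() prints the three lines in order. The developer's code is untouched, yet it is bracketed by code it never asked for, which is the same effect the modified library achieves.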

Waiting until the build is finished is not necessary. You can build programs from the compile directory without risking machine usability due to a faulty glibc install. Doing this requires some tricky command-line options to GCC. For simplicity of demonstration, the binary is built statically, as shown in Listing 3. The program compiled is a simple Hello World program.

Pay close attention to -nostdlib, -nostartfiles and -static. These options are followed by the paths to the C library files, as well as standard libraries such as -lgcc. These unusual options instruct GCC not to link in the standard libraries and startup files, which allows us to specify exactly what we want linked in and where. After the compile is complete, we are left with a hello ELF binary as expected, but it is much larger than normal. This is a side effect of building the program statically: the required functions are built into the program itself, rather than loaded from shared libraries on an as-needed basis. Running the binary displays our messages before and after the hello world message, which verifies that we can indeed execute code before the developer's own code runs.

A real attacker would not have to build statically and could subvert the system copy of glibc in place so that executables would look normal.

Looking back at the libc-start source file, it's easy to tell that this function sets up argc, argv and envp before calling main(). Moving on from displaying text, the execution of a shell is the next step. Because an attacker would not want anyone to know that a modification this serious exists, the shell executes only if the correct command-line option is passed. The source file already includes unistd.h, so it is simple and tempting to use getopt to parse the command-line options before main() is called. Although this works, it can lead to discovery if getopt reports an error on an unknown option. I wrote a brief snippet of code that searches argv for the option that invokes the shell, as shown in Listing 4. When you exit the shell, you will notice the program continues operating normally. Unless you knew the option used to start the shell, you more than likely never would have known this back door existed.
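To make the contrast with getopt concrete, here is a minimal sketch of a quiet argv search. It is not the code from Listing 4; find_option is a hypothetical helper, and what happens once the option is found is deliberately left out. The point is only that a plain strcmp loop never prints a diagnostic, whatever else is on the command line.

```c
#include <string.h>

/*
 * Return the index of the first argv entry that exactly matches `wanted`,
 * or -1 if it is absent. argv[0] (the program name) is skipped.
 * Unlike getopt, this never prints a diagnostic for unrecognized options.
 */
static int find_option(int argc, char **argv, const char *wanted)
{
    int i;

    for (i = 1; i < argc && argv[i] != NULL; i++) {
        if (strcmp(argv[i], wanted) == 0)
            return i;
    }
    return -1;
}
```

Because the search only reads argv and leaves it unmodified, the program's own option parsing later sees exactly the arguments the user typed.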

The previous examples are interesting, but they don't do anything particularly useful. The next example adds a unique identifier to every binary built with GCC. This is most useful in honeypot-like environments, where an unknown party may build a program on the machine and then remove it. The unique identifier, coupled with a registry, can help a forensics analyst trace a program back to its point of origin and establish a trail to the intruder.