GCC for Embedded Engineers
These components each perform only a small slice of the work necessary to produce an executable. For languages that have one, the preprocessor runs before the compiler proper and performs text transformations on the source. The compiler proper then parses the preprocessed source into an internal tree representation, applies the optimizations the user requested and translates the result into assembly code for the target. The assembler turns that assembly code into an object file. If the user wants an executable binary, the object file is then passed to the linker.
Now that all the components in a toolchain have been covered, this section steps through the process GCC follows when compiling C source files into a binary. The process starts by invoking GCC with the files to be compiled and the -o parameter, which names the output file, here thebinary:
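The stages described above can be run one at a time by asking the driver to stop early. What follows is a minimal sketch, assuming a POSIX shell. The function name build_by_stages and the intermediate filenames are made up for illustration; -E, -S, -c and -o are standard GCC driver options.

```shell
# Hypothetical helper: run each stage of the toolchain by hand.
# $1 is the compiler driver (for example armv5l-linux-gcc),
# $2 is a C source file whose name ends in .c.
build_by_stages() {
    cc=$1
    src=$2
    base=${src%.c}

    # Preprocess only: C source in, expanded C source out.
    "$cc" -E "$src" -o "$base.i" || return 1
    # Compile only: expanded C source in, assembly code out.
    "$cc" -S "$base.i" -o "$base.s" || return 1
    # Assemble only: assembly code in, object file out.
    "$cc" -c "$base.s" -o "$base.o" || return 1
    # Link: object file in, executable out.
    "$cc" "$base.o" -o "$base"
}
```

Calling build_by_stages armv5l-linux-gcc file1.c would leave file1.i, file1.s and file1.o next to the source, which makes it easy to inspect what each stage produced.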
armv5l-linux-gcc file1.c file2.c -o thebinary
GCC is actually a driver program that invokes the underlying compiler and binutils to produce the final executable. By looking at the extension of each input file and applying the rules built in to the compiler, GCC determines which programs to run, and in what order, to build the output. To see the commands it would run to compile these files, add the -### parameter, which prints each command without executing it:
armv5l-linux-gcc -### file1.c file2.c -o thebinary
This produces virtual reams of output on the console. Much of the output has been clipped, saving untold virtual trees, to make it more readable for this example. The first information that appears describes the version of the compiler and how it was built—very important information when queried “was GCC built with thumb-interworking disabled?”
Target: armv5l-linux
Configured with: <the contents of an autoconf command line>
Thread model: posix
gcc version 4.1.0 20060304 (TimeSys 4.1.0-3)
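When a question about how the compiler was built comes up, the "Configured with" line can be searched rather than read by eye. The sketch below assumes a POSIX shell and that the driver prints this banner on stderr when run with -v. The function name configured_with is hypothetical.

```shell
# Hypothetical helper: list the configure options a GCC driver was built
# with, one per line, optionally filtered by a pattern.
# $1 is the compiler driver, $2 an optional grep pattern (default: any).
configured_with() {
    # -v prints the configuration banner on stderr, so merge it into stdout.
    "$1" -v 2>&1 |
        sed -n 's/^Configured with: //p' |
        tr ' ' '\n' |
        grep -e "${2:-.}"
}
```

For example, configured_with armv5l-linux-gcc interwork would print any configure option containing the text "interwork". The pattern is only a substring to search for; the exact option names depend on the GCC version and how the toolchain vendor built it.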
After reporting its configuration, GCC starts the compilation process. Each source file is compiled with cc1, the “real” compiler for the target architecture. When GCC was built, it was configured to pass certain parameters to cc1:
"/opt/timesys/toolchains/armv5l-linux/libexec/gcc/ ↪armv5l-linux/4.1.0/cc1.exe" "-quiet" "file1.c" ↪"-quiet" "-dumpbase" "file1.c" "-mcpu=xscale" ↪"-mfloat-abi=soft" "-auxbase" "file1" "-o" ↪"/cygdrive/c/DOCUME~1/GENESA~1.TIM/LOCALS~1/Temp/ccC39DVR.s"
Now the assembler takes over and turns the file into object code:
"/opt/timesys/toolchains/armv5l-linux/lib/gcc/ ↪armv5l-linux/4.1.0/../../../../armv5l-linux/bin/as.exe" ↪"-mcpu=xscale" "-mfloat-abi=soft" "-o" ↪"/cygdrive/c/DOCUME~1/GENESA~1.TIM/LOCALS~1/Temp/ccm4aB3B.o" ↪"/cygdrive/c/DOCUME~1/GENESA~1.TIM/LOCALS~1/Temp/ccC39DVR.s"
The same thing happens for the next file on the command line, file2.c. The command lines are the same as those for file1.c, but with different input and output filenames.
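Because the full -### output is long, it can help to reduce it to just the names of the programs the driver would run, in order. This is a rough sketch, assuming a POSIX shell, that -### writes to stderr and that each command line in the output begins with an absolute path, quoted or not. The function name list_tools is hypothetical.

```shell
# Hypothetical helper: show which programs the driver would run, in order.
# $1 is the compiler driver; the remaining arguments are passed through.
list_tools() {
    cc=$1
    shift
    # -### prints the commands on stderr without running them.
    "$cc" -### "$@" 2>&1 |
        # Keep only lines that start with an absolute path (optionally
        # quoted) and print that path ...
        sed -n 's|^ *"\{0,1\}\(/[^" ]*\).*|\1|p' |
        # ... then strip the directory part, leaving the program name.
        sed 's|.*/||'
}
```

A call such as list_tools armv5l-linux-gcc file1.c file2.c -o thebinary would print one program name per line. Paths that contain spaces would be cut short by this simple pattern.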
After compilation, collect2 performs the link step. It also looks for initialization functions (called constructor functions, though not in the object-oriented sense) that must run before the program's main function. collect2 gathers these functions together, creates a temporary source file, compiles it and links it with the rest of the program:
"/opt/timesys/toolchains/armv5l-linux/libexec/gcc/ ↪armv5l-linux/4.1.0/collect2.exe" "--eh-frame-hdr" ↪"-dynamic-linker" "/lib/ld-linux.so.2" "-X" "-m" ↪"armelf_linux" "-p" "-o" "binary" "/opt/timesys/ ↪toolchains/armv5l-linux/lib/gcc/armv5l-linux/ ↪4.1.0/../../../../armv5l-linux/lib/crt1.o" ↪"/opt/timesys/toolchains/armv5l-linux/lib/gcc/ ↪armv5l-linux/4.1.0/../../../../armv5l-linux/lib/crti.o" ↪"/opt/timesys/toolchains/armv5l-linux/lib/gcc/ ↪armv5l-linux/4.1.0/crtbegin.o" ↪"-L/opt/timesys/toolchains/armv5l-linux/lib/ ↪gcc/armv5l-linux/4.1.0" "-L/opt/timesys/ ↪toolchains/armv5l-linux/lib/gcc/armv5l-linux/ ↪4.1.0/../../../../armv5l-linux/lib" ↪"/cygdrive/c/DOCUME~1/GENESA~1.TIM/LOCALS~1/ ↪Temp/ccm4aB3B.o" "/cygdrive/c/DOCUME~1/ ↪GENESA~1.TIM/LOCALS~1/Temp/cc60Td3s.o" ↪"-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" ↪"-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" ↪"/opt/timesys/toolchains/armv5l-linux/lib/ ↪gcc/armv5l-linux/4.1.0/crtend.o" "/opt/timesys/ ↪toolchains/armv5l-linux/lib/gcc/armv5l-linux/ ↪4.1.0/../../../../armv5l-linux/lib/crtn.o"
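To see a constructor function in action, a small test program is enough. The sketch below assumes a POSIX shell; the function name ctor_demo, the filenames and the C function early_init are all made up for illustration. The constructor attribute is a GCC extension, and how the function is registered for startup depends on the target.

```shell
# Hypothetical demo: write a small C program with a constructor function
# and build it with the driver given as $1 (for example armv5l-linux-gcc).
ctor_demo() {
    cat > ctor_demo.c <<'EOF'
#include <stdio.h>

/* GCC extension: run this function during startup, before main(). */
static void __attribute__((constructor)) early_init(void)
{
    puts("early_init: running before main");
}

int main(void)
{
    puts("main: running");
    return 0;
}
EOF
    "$1" ctor_demo.c -o ctor_demo
}
```

After ctor_demo armv5l-linux-gcc, copy ctor_demo to the target and run it there; the early_init line should appear before the line printed by main.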
A few other things in the collect2 command line warrant pointing out:
1. This option specifies the dynamic linker to invoke when running the program on the target platform:
"-dynamic-linker" "/lib/ld-linux.so.2"
On Linux platforms, a dynamically linked program is actually loaded by running the dynamic linker with the program as its parameter; the dynamic linker does the work of loading the libraries into memory and fixing up the references. If the dynamic linker isn't at the same path on the target machine, the program will fail to run with an “unable to execute program” error message. A misplaced dynamic linker on the target ensnares every embedded developer at least once.
2. These files (crt1.o, crti.o and crtbegin.o in the collect2 command above) contain the code that runs before the programmer's entry point (typically main, but you can change that too) and handle things like initializing globals, opening the standard file handles, building the array of parameters and other housekeeping.
3. Likewise, these files (crtend.o and crtn.o) contain the code that runs after the last return, such as closing files and other housekeeping work. Like the files in the prior item, these are cross-compiled for the target when the toolchain is built.
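Because a misplaced dynamic linker (item 1 above) is such a common trap, it is worth checking which path was recorded in the binary before copying it to the target. This is a small sketch, assuming a POSIX shell and the readelf program from binutils; the function name show_interp is hypothetical, and a cross toolchain may ship its own prefixed readelf, which can be passed as the second argument.

```shell
# Hypothetical helper: print the dynamic linker path recorded in a binary.
# $1 is the binary; $2 is the readelf to use (default: readelf on the PATH).
show_interp() {
    # readelf -l lists the program headers, including a line of the form
    #   [Requesting program interpreter: /path/to/loader]
    "${2:-readelf}" -l "$1" |
        sed -n 's/.*Requesting program interpreter: \(.*\)]/\1/p'
}
```

Running show_interp thebinary prints the recorded path, which can then be compared with what actually exists on the target's filesystem. A statically linked binary has no such entry, so nothing is printed.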
And, that's it! At the end of this process, the output is a program ready for execution on the target platform.