Examining the Compilation Process. Part 1.
This article, and the one to follow, are based on a Software Development class I taught a few years ago. The students in this class were non-programmers who had been hired to receive bug reports for a compiler product. As Analysts, they had to understand the software compilation process in some detail, even though some of them had never written a single line of code. It was a fun class to teach, so I'm hoping that the subject translates into interesting reading.
In this article, I'm going to discuss the process that the computer goes through to compile source code into an executable program. I won't be clouding the issue with the Make environment, or Revision Control, like I necessarily did in the class. For this article, we're only going to discuss what happens after you type gcc test.c.
Broadly speaking, the compilation process is broken down into 4 steps: preprocessing, compilation, assembly, and linking. We'll discuss each step in turn.
Before we can discuss compiling a program, we really need to have a program to compile. Our program needs to be simple enough that we can discuss it in detail, but broad enough that it exercises all of the concepts that I want to discuss. Here is a program that I hope fits the bill:
#include <stdio.h>
// This is a comment.
#define STRING "This is a test"
#define COUNT (5)
int main () {
    int i;
    for (i=0; i<COUNT; i++) {
        puts(STRING);
    }
    return 1;
}
If we put this program in a file called test.c, we can compile it with the simple command gcc test.c. What we end up with is an executable file called a.out. The name a.out has some history behind it. Back in the days of the PDP computer, a.out stood for “assembler output.” Today, the name also refers to an older executable file format. Modern versions of Unix and Linux use the much more sophisticated ELF executable file format. So even though the default filename of gcc's output is “a.out,” the file is actually in ELF format. Enough history; let's run our program.
When we type ./a.out, we get:
This is a test
This is a test
This is a test
This is a test
This is a test
This, of course, doesn't come as a surprise, so let's discuss the steps that gcc went through to create the a.out file from the test.c file.
As mentioned earlier, the first thing the compiler does is send our source code through the C Preprocessor. The C Preprocessor is responsible for 3 tasks: text substitution, stripping comments, and file inclusion. Text substitution and file inclusion are requested in our source code using preprocessor directives. The lines in our code that begin with the “#” character are preprocessor directives. The first one requests that a standard header, stdio.h, be included into our source file. The other two request that a string substitution take place in our code. By using gcc's “-E” flag, we can see the results of running only the C preprocessor on our code. The stdio.h file is fairly large, so I'll clean up the results a little.
gcc -E test.c > test.txt
# 1 "test.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 28 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 330 "/usr/include/features.h" 3 4
# 1 "/usr/include/sys/cdefs.h" 1 3 4
# 348 "/usr/include/sys/cdefs.h" 3 4
# 1 "/usr/include/bits/wordsize.h" 1 3 4
# 349 "/usr/include/sys/cdefs.h" 2 3 4
# 331 "/usr/include/features.h" 2 3 4
# 354 "/usr/include/features.h" 3 4
# 1 "/usr/include/gnu/stubs.h" 1 3 4
# 653 "/usr/include/stdio.h" 3 4
extern int puts (__const char *__s);
int main () {
    int i;
    for (i=0; i<(5); i++) {
        puts("This is a test");
    }
    return 1;
}
The first thing that becomes obvious is that the C Preprocessor has added a lot to our simple little program. Before I cleaned it up, the output was over 750 lines long. So, what was added, and why? Well, our program requested that the stdio.h header be included into our source. Stdio.h, in turn, requested a whole bunch of other header files. So, the preprocessor made a note of the file and line number where the request was made and made this information available to the next steps in the compilation process. Thus, the lines,
# 28 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
indicate that the features.h file was requested on line 28 of stdio.h. The preprocessor creates a line number and file name entry before anything that might be “interesting” to subsequent compilation steps, so that if there is an error, the compiler can report exactly where the error occurred.
When we get to the lines,
# 653 "/usr/include/stdio.h" 3 4
extern int puts (__const char *__s);
we see that puts() is declared as an external function that returns an integer and accepts a single parameter, a pointer to constant characters (that is, a string it promises not to modify). If something were to go horribly wrong with this declaration, the compiler could tell us that the function was declared on line 653 of stdio.h. It's interesting to note that puts() isn't defined, only declared. That is, we don't get to see the code that actually makes puts() work. We'll talk later about how puts() and other common functions get defined.
Also notice that none of our program comments are left in the preprocessor output, and that all of the string substitutions have been performed. At this point, the program is ready for the next step of the process, compilation into assembly language.
We can examine the results of the compilation process by using gcc's -S flag.
gcc -S test.c
This command results in a file called test.s that contains the assembly code implementation of our program. Let's take a brief look.
        .file "test.c"
        .section .rodata
.LC0:
        .string "This is a test"
        .text
.globl main
        .type main, @function
main:
        leal 4(%esp), %ecx
        andl $-16, %esp
        pushl -4(%ecx)
        pushl %ebp
        movl %esp, %ebp
        pushl %ecx
        subl $20, %esp
        movl $0, -8(%ebp)
        jmp .L2
.L3:
        movl $.LC0, (%esp)
        call puts
        addl $1, -8(%ebp)
.L2:
        cmpl $4, -8(%ebp)
        jle .L3
        movl $1, %eax
        addl $20, %esp
        popl %ecx
        popl %ebp
        leal -4(%ecx), %esp
        ret
        .size main, .-main
        .ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4 p1.0)"
        .section .note.GNU-stack,"",@progbits
My assembly language skills are a bit rusty, but there are a few features that we can spot fairly readily. We can see that our message string has been moved to a different part of memory, the read-only data section (.rodata), and given the name .LC0. We can also see that there are quite a few steps needed to start and exit our program. You might be able to follow the implementation of the for loop at .L2; it's simply a comparison (cmpl) and a “Jump if Less than or Equal” (jle) instruction. Notice that the comparison is against 4 rather than 5, because for integers i<5 and i<=4 are the same test. The initialization was done in the movl instruction just above the jump to .L2. The call to puts() is fairly easy to spot. Somehow the assembler knows that it can call the puts() function by name and not by a funky label like the rest of the memory locations. We'll discuss this mechanism when we talk about the final stage of compilation, linking. Finally, our program ends with a return (ret).
The next step in the compilation process is to assemble the resulting assembly code into an object file. We'll discuss object files in more detail when we discuss linking. Suffice it to say that assembling is the process of converting (relatively) human-readable assembly language into machine-readable machine language.
Linking is the final stage, and it produces either an executable program file or an object file that can be combined with other object files to produce an executable file. It's at the link stage that we finally resolve the problem with the call to puts(). Remember that puts() was declared in stdio.h as an external function. This means that the function will actually be defined, or implemented, elsewhere. If we had several source files in our program, we might have declared some of our functions as extern and implemented them in different files; such functions would be available anywhere in our source files by virtue of having been declared extern. Because the compiler doesn't know where all of these functions are implemented, it simply uses a place-holder for each function call. The linker will resolve all of these dependencies and plug in the actual addresses of the functions.
The linker also does a few additional tasks for us. It combines our program with some standard routines that are needed to make our program run. For example, there is standard code required at the beginning of our program that sets up the running environment, such as passing in command-line parameters and environment variables. Also, there is code that needs to be run at the end of our program so that it can pass back a return code, among other tasks. It turns out that this is no small amount of code. Let's take a look.
If we compile our example program, as we did above, we get an executable file that is 6885 bytes in size. However, if we instruct the compiler to not go through the linking stage, by using the -c flag (gcc -c test.c -o test.o), we get an object module that is 888 bytes in size. The difference in file size is the code to start up and terminate our program, along with the code that allows us to call the puts() function in libc.so.
At this point, we've looked at the compilation process in some detail. I hope this has been interesting to you. Next time, we'll discuss the linking process in a bit more detail and consider some of the optimization features that gcc provides.
Mike Diehl is a freelance Computer Nerd specializing in Linux administration, programming, and VoIP. Mike lives in Albuquerque, NM, with his wife and 3 sons. He can be reached at mdiehl@diehlnet.com.
Comments
Good explanation
This is my first search to find out what happens during compilation. Thank you very much for such a nice explanation that came with an example too ! I'm going to recommend this to my classmates too....!
Assembly language corrections.
jle instruction means "Jump if Less or Equal". And its operands are $4 and %ebp-8 (variable i). '$' sign means numeric constant (here is 4 in contrast to 5 which was specified in C source code).
Thanks
Thanks ... Waiting for part 2 ...
Clear on basic concept
Good explanation on basic of compilation process..
intel asm is easier to read
I never much cared for the AT&T syntax, most probably because I was brought up on a sickly diet of Windows, many years ago.
You can use -S -masm=intel to get Intel format from gcc/g++. I still think it's more readable. objdump, gdb, ... also can output intel asm. Check the man pages.
To outline sections, my understanding is:
.text is the code that goes in the ELF .text segment (obvious from output)
.lcomm / .comm end up in the ELF .bss segment (uninitialised data)
.data ends up in ELF .data segment (initialised global data)
It would also probably be worth writing a full article on compiler internals. What's here doesn't really cover compilation at all. It covers external programs that the front end / driver invokes - such as preprocessor, linker, and assembler.
thanks for the article,
thanks for the article, could have also discussed the segments/sections like .rodata .LC0 a bit more.
Keep'em coming!
Liked the article a lot. Can hardly wait for parts 2, 3...
Thanks!
Thanks for posting this ...
Thanks for posting this ... I've passed it on to one of my nephews (son of a long-time friend, actually) who has programmed a bit in Python but wants to learn C/C++. He's taking "Programming 101" at school, but I don't think the professor is going to discuss things at the level you have.
Looking forward to the next part (as is he) ...
Saving temporary files at compile time
Mike, congratulations for the article!
I'd just like to add a note for readers who want to see, in one shot, all of the files gcc generates at compile time. Just use the -save-temps option.
i.e:
# gcc -o foo -save-temps foo.c
Best regards,
Tiago
HTML Entities problem...
You forgot to convert < characters into &lt; in your example program, so it doesn't show up properly in the article.