Examining the Compilation Process, Part 3

The last two articles I wrote for LinuxJournal.com covered the steps that GCC goes through during the compilation process, and were based on a software development class I taught a few years ago. I hadn't intended for this to be a three-part series, but it's been pointed out that I didn't cover the make utility, and I think it's almost negligent to discuss software development without discussing make. Since I don't like to think of myself as negligent, I decided to extend the series by one more article.

If you have a simple project with fewer than, say, five source files, you probably don't need the make utility. You can simply compile your project with a shell script like the one I'm using for a small 3D graphics program I'm writing as part of another article for Linux Journal. Take a look.

g++ ./game.cpp -lIrrlicht -lGL -lXxf86vm -lXext -lX11 -ljpeg -lpng -o game

This is pretty easy. We have a single source file, a few libraries and a final executable. Scripting the compilation of this project saves us from having to type that mammoth command line every time we make a change to the program.

But what happens when your project has several, perhaps hundreds, of source files, each hundreds or thousands of lines long? Compiling a very large project can take several minutes or even hours. What if you find an important bug in your program and have to make this change:

Change a = a+b;

into a = a-b;

This one character change means you have to recompile the whole project! Now while waiting for a project to recompile is a wonderful excuse to go drink a beer, it's a very inefficient use of your time as a software developer.

The ability to only recompile those parts of our project that actually need to be recompiled is what we get from using the make utility. Let's take a look at how this works.

The make utility uses a file called a Makefile to determine which parts of our project need to be recompiled. Essentially, the Makefile specifies which files depend on each other, and how to regenerate a given file if we need to. Let's take a look at an example. Here is a simple Makefile.

main: main.o f.o
	gcc main.o f.o -o main

main.o: main.c
	gcc -c main.c -o main.o

f.o: f.c
	gcc -c f.c -o f.o

The general format of a Makefile is a target, a list of dependencies for that target, and a command line used to regenerate the target if any of the dependencies have been modified.

From this Makefile, we can see a few things. For example, the executable file, main, is dependent upon both main.o and f.o. When we need to regenerate main, perhaps because either main.o or f.o has changed, we use the command “gcc main.o f.o -o main”. In turn, main.o is dependent upon main.c. When a change is made to main.c, main.o needs to be regenerated, and the Makefile tells us how this is done. It turns out that if main.c changes and we have to regenerate main.o, we also have to relink the results in order to regenerate the final executable. Make takes care of the recursive nature of the problem for us.

The f.o target is similar.

After creating this Makefile, typing make for the first time recompiles the entire project from scratch:

# make
gcc -c main.c -o main.o
gcc -c f.c -o f.o
gcc main.o f.o -o main

This resulted in an executable called “main.” However, let's say we make a change to f.c. When we rerun make, we see this:

# make
gcc -c f.c -o f.o
gcc main.o f.o -o main

We see that main.c isn't recompiled this time, since it didn't change. The file f.c is recompiled into f.o and then relinked with the existing main.o. The result is an updated executable called main.

If for some reason, we wanted to recompile main.c, we could ask make to do it for us by typing:

make main.o

In this case, make would consult the Makefile in the current directory and figure out what needs to be done in order to regenerate the main.o target.

So by using the make utility, we save ourselves from having to sit through potentially many unnecessary recompilations.

If that was all we could expect from the make utility, it would be a tremendous time saver. But there is more. Make allows us to define variables and use them within our Makefile. For example, take a look at an excerpt from one of my other projects:

OBJ = network.o config.o protocol.o parsers.o events.o users.o
CPPFLAGS = -DTEXT_CLIENT
CXXFLAGS = -O3 -ffast-math -Wall
LDFLAGS = -lenet

game: $(OBJ) main.cpp
	g++ $(CPPFLAGS) $(CXXFLAGS) \
	main.cpp $(OBJ) $(LDFLAGS) -o game

Here we see a few things of interest. First, we define a few variables: OBJ, CPPFLAGS, CXXFLAGS, and LDFLAGS. These variables are then used later in the Makefile, where we describe how to remake the “game” target.

For the sake of clarity, let's see what happens to the command line specified in this Makefile snippet. We start out with:

g++ $(CPPFLAGS) $(CXXFLAGS) main.cpp $(OBJ) $(LDFLAGS) -o game

We can see the references to the variables that we defined earlier, so let's go ahead and substitute their values in. When we do, we end up with:

g++ -DTEXT_CLIENT -O3 -ffast-math -Wall main.cpp network.o config.o protocol.o parsers.o events.o users.o -lenet -o game

I think you get the idea of how variables work inside a Makefile. In real life, variables might be used in several places in a Makefile, saving us a lot of typing and sparing us the potential of a trivial but devastating typing mistake.
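To make that payoff concrete, here is a hedged sketch of the earlier two-file toy project rewritten so its variables are reused in every rule; the flag values are only illustrative:

```makefile
CC     = gcc
CFLAGS = -O2 -Wall
OBJ    = main.o f.o

main: $(OBJ)
	$(CC) $(CFLAGS) $(OBJ) -o main

main.o: main.c
	$(CC) $(CFLAGS) -c main.c -o main.o

f.o: f.c
	$(CC) $(CFLAGS) -c f.c -o f.o
```

Changing CFLAGS on one line now changes every compile command in the file, with no chance of updating one rule and forgetting another.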

We can also make a long command line easier to read by ending a line with the '\' character and continuing it on the next line. It's a simple fact: any code that is easy to read is less prone to errors, and our Makefile is no different.

OK, so far the make utility sounds like a really cool thing. But what kinds of problems can we have with it?

The most common mistake people make... with make, is leaving out a dependency. For example, let's say you have a file, foo, that depends upon another file, bar.o, but you forget to list bar.o among foo's dependencies.

Now, if bar.o doesn't exist, you will simply get some linking error messages and you'll know right away that you've left something out.

However, what if you've been compiling by hand until the project grew large enough to warrant using make? Now bar.o already exists but isn't mentioned as a dependency of foo. In this case, everything will seem to work just fine, until you find a bug in bar.o. So you go into the files used to generate bar.o, fix your bug, and recompile. You find that you have the same symptoms. Thinking you may have forgotten to save your changes, you do it again. Same bug. This time you put a few debugging print statements in your code and recompile. Same bug, and NO DEBUGGING OUTPUT! If you happen to be prone to swearing, this is where it begins. Fortunately, once you've made this mistake, you tend to remember it and its symptoms, and you don't get bitten again.
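A minimal sketch of the trap, using the hypothetical names from above (only one of these two rules would appear in a real Makefile; they are shown side by side for contrast):

```makefile
# BUGGY: bar.o appears on the link line but not in the dependency
# list, so make never relinks foo when bar.o is rebuilt.
foo: main.o
	gcc main.o bar.o -o foo

# FIXED: every object named on the link line is also a dependency.
foo: main.o bar.o
	gcc main.o bar.o -o foo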

Header files, or .h files if you write in C, can lead to some problems with make. It's common to have header files that contain the prototypes of your external functions and data types. Many times, programmers get lazy and put ALL of their prototypes in a single file, which they then have to include in every other source file. That header file becomes a dependency for every file in the project, so any change to it requires the entire project to be recompiled. Sometimes this is just the nature of the problem at hand; other times, that single header file could be split up into separate files for use in separate modules.
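One conventional defense, sketched here with hypothetical header names, is to list each header a source file includes among that file's dependencies, so a header edit triggers exactly the recompiles it should, and no more:

```makefile
# main.o must be rebuilt when main.c OR defs.h changes.
main.o: main.c defs.h
	gcc -c main.c -o main.o

# f.o includes only f.h, so a change to defs.h leaves it alone.
f.o: f.c f.h
	gcc -c f.c -o f.o
```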

It's quite common for a programmer to use make to provide the luxury of removing all executables and object files from the project, forcing a complete recompile. Typically, a programmer would add a target like this to the Makefile:

clean:
	rm *.o main

Then the programmer can simply type “make clean” and get a completely clean slate. Similarly, it's common to have an “install” target so that an end user can type “make install” and have the software automatically installed for them.
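A minimal pair of such housekeeping targets might look like the sketch below. The install destination is just an assumption, and the .PHONY line tells make these names are commands to run, not files to build (so a stray file named "clean" can't confuse it):

```makefile
.PHONY: clean install

clean:
	rm -f *.o main

install: main
	cp main /usr/local/bin/
```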

As you can see, the make utility is a wonderful time saver. It helps a programmer ensure that every file that needs to be recompiled is recompiled, and that only those files are. This article wasn't as detailed as the previous two in this series, but I hope it rounds out our discussion of the compilation process.


Mike Diehl is a freelance Computer Nerd specializing in Linux administration, programming, and VoIP. Mike lives in Albuquerque, NM, with his wife and three sons. He can be reached at mdiehl@diehlnet.com.



Avoiding redundancy in lists of .c and .o filenames

O.A.K. writes:

Excellent tutorial. There's still one important feature of 'make' that needs to be mentioned too, I think.
What if we would like 'make' to perform some magic on source and object filenames, so that we DON'T need to type the similar filenames (differing only in their extension such as .cpp versus .o) in multiple places in the Makefile?

I want to give a list of source files (.cpp) and 'make' should understand that for each source xxx.cpp (or xxx.c, for example) there will be the corresponding object file xxx.o.
In other words: 'make' should create foo.o and bar.o, when it knows that foo.cpp and bar.cpp are my source files but the names foo.o and bar.o are not explicitly given in the Makefile.
This is important, because if I modify the source-file list of my (big) project, I want to modify only one list within the Makefile, not two (or more) lists. I don't want to type any redundant information in my Makefile.

The solution should be something like this:

SOURCES = foo.cpp bar.cpp
OBJECTS = $(SOURCES:.cpp=.o)
LIBRARY = libfoorulez

all: $(SOURCES) $(LIBRARY)

$(LIBRARY): $(OBJECTS)
	ar r $(LIBRARY).a $(OBJECTS)

.cpp.o:
	$(CC) $(CFLAGS) $< -o $@

At least this kind of solution has worked in some projects where I have done programming.
I'm not sure, though, if the availability of this functionality depends on the version of 'make' that is being used.
Please point out any mistakes in my comment :)

Please also note that there should be a TAB character at the beginning of every line that specifies a shell command (such as calling the compiler).
At least the TAB is required by the classic version of 'make'.

Yes, that will work

Paul Smith writes:

More or less as you've described, although there are a few unnecessary things (for example, you don't need to declare $(SOURCES) as a prerequisite of "all") and a typo or two (for example, you either need to add the ".a" suffix to the LIBRARY variable, or put it in the target: $(LIBRARY).a: $(OBJECTS)).

However, there are even better things about make. For example, make has builtin rules to compile all sorts of things, including building .cpp files into .o files. So you don't even need to declare a rule. You can just write:
SOURCES = foo.cpp bar.cpp
LIBRARY = libfoorulez.a
all: $(LIBRARY)
$(LIBRARY): $(SOURCES:.cpp=.o)
> $(AR) r $@ $^

(replace the > with a TAB). That's it; make knows how to do the rest by itself. You can control details of compilation by setting variables like CC or CXX for the C and C++ compilers respectively. Note how we've used the automatic variables $@ and $^ in the rule to avoid retyping the target and prerequisite lists.

There are tons of extra interesting features in GNU make. The manual is quite readable and informative.

pretty good again.

Lutieri G B writes:

pretty good again. congratulations!

make & Makefile

H I Murphy writes:

What a coincidence! It was only a few weeks ago that I filled out a survey from one of the online Linux magazines, maybe it was Linux Journal, and one of the subjects I asked to have explained was the "make" utility.
While not a professional programmer, I have worked in software development for the past 15 to 20 years in system design, configuration management using Vax/VMS Ada, software testing, and documentation. I had sort of a vague idea of what "make" was about.
Thanks for clearing it up; you guys are alllllright.
