Linkers and Loaders
Linking is the process of combining various pieces of code and data together to form a single executable that can be loaded in memory. Linking can be done at compile time, at load time (by loaders) and also at run time (by application programs). The process of linking dates back to late 1940s, when it was done manually. Now, we have linkers that support complex features, such as dynamically linked shared libraries. This article is a succinct discussion of all aspects of linking, ranging from relocation and symbol resolution to supporting position-independent shared libraries. To keep things simple and understandable, I target all my discussions to ELF (executable and linking format) executables on the x86 architecture (Linux) and use the GNU compiler (GCC) and linker (ld). However, the basic concepts of linking remain the same, regardless of the operating system, processor architecture or object file format being used.
Consider two program files, a.c and b.c. As we invoke the GCC on a.c b.c at the shell prompt, the following actions take place:
gcc a.c b.c
Run preprocessor on a.c and store the result in intermediate preprocessed file.
cpp other-command-line options a.c /tmp/a.i
Run compiler proper on a.i and generate the assembler code in a.s
cc1 other-command-line options /tmp/a.i -o /tmp/a.s
Run assembler on a.s and generate the object file a.o
as other-command-line options /tmp/a.s -o /tmp/a.o
cpp, cc1 and as are the GNU's preprocessor, compiler proper and assembler, respectively. They are a part of the standard GCC distribution.
Repeat the above steps for file b.c. Now we have another object file, b.o. The linker's job is to take these input object files (a.o and b.o) and generate the final executable:
ld other-command-line-options /tmp/a.o /tmp/b.o -o a.out
The final executable (a.out) then is ready to be loaded. To run the executable, we type its name at the shell prompt:
The shell invokes the loader function, which copies the code and data in the executable file a.out into memory, and then transfers control to the beginning of the program. The loader is a program called execve, which loads the code and data of the executable object file into memory and then runs the program by jumping to the first instruction.
a.out was first coined as the Assembler OUTput in a.out object files. Since then, object formats have changed variedly, but the name continues to be used.
Linkers and loaders perform various related but conceptually different tasks:
Program Loading. This refers to copying a program image from hard disk to the main memory in order to put the program in a ready-to-run state. In some cases, program loading also might involve allocating storage space or mapping virtual addresses to disk pages.
Relocation. Compilers and assemblers generate the object code for each input module with a starting address of zero. Relocation is the process of assigning load addresses to different parts of the program by merging all sections of the same type into one section. The code and data section also are adjusted so they point to the correct runtime addresses.
Symbol Resolution. A program is made up of multiple subprograms; reference of one subprogram to another is made through symbols. A linker's job is to resolve the reference by noting the symbol's location and patching the caller's object code.
So a considerable overlap exists between the functions of linkers and loaders. One way to think of them is: the loader does the program loading; the linker does the symbol resolution; and either of them can do the relocation.
Object files comes in three forms:
Relocatable object file, which contains binary code and data in a form that can be combined with other relocatable object files at compile time to create an executable object file.
Executable object file, which contains binary code and data in a form that can be directly loaded into memory and executed.
Shared object file, which is a special type of relocatable object file that can be loaded into memory and linked dynamically, either at load time or at run time.
Compilers and assemblers generate relocatable object files (also shared object files). Linkers combine these object files together to generate executable object files.
Object files vary from system to system. The first UNIX system used the a.out format. Early versions of System V used the COFF (common object file format). Windows NT uses a variant of COFF called PE (portable executable) format; IBM uses its own IBM 360 format. Modern UNIX systems, such as Linux and Solaris use the UNIX ELF (executable and linking format). This article concentrates mainly on ELF.
The above figure shows the format of a typical ELF relocatable object file. The ELF header starts with a 4-byte magic string, \177ELF. The various sections in the ELF relocatable object file are:
.text, the machine code of the compiled program.
.rodata, read-only data, such as the format strings in printf statements.
.data, initialized global variables.
.bss, uninitialized global variables. BSS stands for block storage start, and this section actually occupies no space in the object file; it is merely a placer holder.
.symtab, a symbol table with information about functions and global variables defined and referenced in the program. This table does not contain any entries for local variables; those are maintained on the stack.
.rel.text, a list of locations in the .text section that need to be modified when the linker combines this object file with other object files.
.rel.data, relocation information for global variables referenced but not defined in the current module.
.debug, a debugging symbol table with entries for local and global variables. This section is present only if the compiler is invoked with a -g option.
.line, a mapping between line numbers in the original C source program and machine code instructions in the .text section. This information is required by debugger programs.
.strtab, a string table for the symbol tables in the .symtab and .debug sections.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Roll your own dynamic dns
3 hours 49 min ago
- Please correct the URL for Salt Stack's web site
7 hours 54 sec ago
- Android is Linux -- why no better inter-operation
9 hours 16 min ago
- Connecting Android device to desktop Linux via USB
9 hours 44 min ago
- Find new cell phone and tablet pc
10 hours 42 min ago
12 hours 11 min ago
- Automatically updating Guest Additions
13 hours 20 min ago
- I like your topic on android
14 hours 6 min ago
- This is the easiest tutorial
20 hours 42 min ago
- Ahh, the Koolaid.
1 day 2 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?