Embedding a File in an Executable, aka Hello World, Version 5967
I recently needed to embed a file in an executable. Since I'm working at the command line with gcc et al., and not with a fancy RAD tool that makes it all happen magically, it wasn't immediately obvious to me how to do this. A bit of searching on the net turned up a hack: essentially cat the file onto the end of the executable, then work out where it is based on a bunch of information I didn't want to know about. Seemed like there ought to be a better way...
And there is: objcopy to the rescue. objcopy converts object files or executables from one format to another. One of the formats it understands is "binary", which is basically any file that's not in one of the other formats it understands. So you've probably guessed the idea: convert the file we want to embed into an object file, and then it can simply be linked in with the rest of our code.
Let's say we have a file named data.txt that we want to embed in our executable:
# cat data.txt
Hello world
To convert this into an object file that we can link with our program, we just use objcopy to produce a ".o" file:
# objcopy --input binary \
    --output elf32-i386 \
    --binary-architecture i386 data.txt data.o
This tells objcopy that our input file is in the "binary" format, that our output file should be in the "elf32-i386" format (object files on the x86). The --binary-architecture option tells objcopy that the output file is meant to "run" on an x86. This is needed so that ld will accept the file for linking with other files for the x86. One would think that specifying the output format as "elf32-i386" would imply this, but it does not.
Now that we have an object file we only need to include it when we run the linker:
# gcc main.c data.o
When we run the result we get the prayed-for output:
# ./a.out
Hello world
Of course, I haven't told the whole story yet, nor shown you main.c. When objcopy does the above conversion it adds some "linker" symbols to the converted object file:
_binary_data_txt_start
_binary_data_txt_end
After linking, these symbols specify the start and end of the embedded file. The symbol names are formed by prepending _binary_ and appending _start or _end to the file name. If the file name contains any characters that would be invalid in a symbol name, they are converted to underscores (e.g. data.txt becomes data_txt). If you get unresolved names when linking using these symbols, do a hexdump -C on the object file and look at the end of the dump for the names that objcopy chose.
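To make the naming rule concrete, here is a minimal sketch of a helper that predicts the symbol name for a given file name. The function binary_symbol is hypothetical (it is not part of objcopy or any library), and it assumes the rule is "every non-alphanumeric character becomes an underscore", which is my understanding of what GNU objcopy does; check against your own object file if in doubt.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of objcopy): build the symbol name that
 * objcopy is expected to generate for `filename`, where `suffix` is
 * "start" or "end". Returns 0 on success, -1 if `out` is too small.
 */
int binary_symbol(const char *filename, const char *suffix,
                  char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "_binary_%s_%s", filename, suffix);
    if (n < 0 || (size_t)n >= outsz)
        return -1;

    /* Assumed rule: every non-alphanumeric character becomes '_'. */
    for (char *p = out; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            *p = '_';
    }
    return 0;
}
```

Calling binary_symbol("data.txt", "start", buf, sizeof buf) leaves _binary_data_txt_start in buf. The name is built from the file name exactly as it was given to objcopy, so a path such as res/my-file.bin would, under the same assumption, come out as _binary_res_my_file_bin_start.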
The code to actually use the embedded file should now be reasonably obvious:
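#include <stdio.h>
extern char _binary_data_txt_start;  /* first byte of the embedded file */
extern char _binary_data_txt_end;    /* one past the last byte */
int main(void) {
    char* p = &_binary_data_txt_start;
    while ( p != &_binary_data_txt_end ) putchar(*p++);
    return 0;
}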
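One thing the loop above sidesteps: the embedded bytes are just the raw contents of the file, so nothing guarantees a terminating NUL, and the size of the data is simply the difference between the end and start addresses. If you want to treat the data as a C string, a sketch along these lines might help. The helper embedded_to_string is a hypothetical name of my own, and it works on any start/end pointer pair so it can be tried without data.o.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bytes in [start, end) into a freshly
 * allocated, NUL-terminated buffer. The caller frees the result.
 * Returns NULL if the allocation fails.
 */
char *embedded_to_string(const char *start, const char *end)
{
    size_t len = (size_t)(end - start);   /* size of the embedded file */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}

/*
 * Intended use with the symbols from data.o (left as a comment because it
 * needs the object file produced by objcopy):
 *
 *   char *text = embedded_to_string(&_binary_data_txt_start,
 *                                   &_binary_data_txt_end);
 *   if (text != NULL) { fputs(text, stdout); free(text); }
 */
```

Copying costs a little memory, but it keeps the string functions from running off the end of the embedded data.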