Playing with Binary Formats
One of the roles that kernel modules can accomplish is adding new binary formats to a running system. A “binary format” is basically a data structure responsible for executing program files—the ones marked with execute permission. The code I'm going to introduce runs with version 2.0 of the kernel.
Kernel modules are meant to add new capabilities to a Linux system, device drivers being the best known such “capabilities”. As a matter of fact, the highly modular design of the Linux kernel allows run-time insertion of many features other than device drivers—we saw a few months ago how /proc files and sysctl entry points can be created by modularized code.
One other such loadable feature is the ability to execute a binary format; this includes both executable files and shared libraries. While the mechanism of loading compiled program files and shared libraries is quite elaborate, the average Linux user can easily add loaders that invoke an interpreter for new binary formats. Thus, the user is able to call data files by name and have them “executed”, after invoking chmod +x on the file.
Let's start this discussion by looking at how the exec system call is implemented in Linux. This is an interesting part of the kernel, as the ability to execute programs is at the basis of system operations.
The entry point of exec lives in the architecture-dependent tree of the source files, but all the interesing code is part of fs/exec.c (all pathnames here refer to /usr/src/linux/ or the location of your sources). To check architecture-specific details, locate the function by typing the command:
arch/*/kernel/*.c
Within fs/exec.c the toplevel function, do_execve(), is less than fifty lines of code in length. Its role is checking for errors, filling the “binary parameter” structure (struct linux_binprm) and looking for a binary handler. The last step is performed by search_binary_handler(), another function in the same file. The magic of do_execve() is contained in this last function which is very short. Its job consists of scanning a list of registered binary formats, passing the binprm structure to all of them until one succeeds. If no handler is able to deal with the executable file, the kernel tries to load a new handler via kerneld and scans the list once again. If no binary format is able to run the executable file, the system call returns the ENOEXEC error code (“Exec format error”).
The main problem with this kind of implementation is in keeping Linux compatible with the standard Unix behaviour. That is, any executable text file that begins with #! must be executed by the interpreter it asks for, and any other executable text is run by /bin/sh. The former issue is easily dealt with by a binary format specialized in running interpreter files (fs/binfmt_script.c), and the interpreter itself is run by calling search_binary_handler() once again. This function is designed to be reentrant, and binfmt_script checks against double invocation. The latter issue is mainly an historical relic and is simply ignored by the kernel. The program trying to execute the file takes care of it. Such a program is usually your shell or make. It's interesting to note that while recent versions of gmake execute properly when a script has no leading #! line, previous versions didn't call a shell resulting in a “cannot execute binary file” message when running unadorned scripts from within a Makefile.
All the magic handling of data structures needed to replace the old executable image with the new one is performed by the specific binary loader, based on utility functions exported by the kernel. If you would like to take a look at such code, the function load_out_binary() in fs/binfmt_aout.c is a good place to start—easier than the ELF loader.
The implementation of exec is interesting code, but Linux has more to offer: registration of new binary formats at run time. The implementation is quite straightforward, although it involves mucking with rather elaborate data structures—either the code or the data structures must accomodate the underlying complexities; elaborate data structures offer more flexibility than elaborate code.
The core of a binary format is represented in the kernel by a structure called struct<\!s>linux_binfmt, which is declared in the linux/binfmts.h file as follows:
struct linux_binfmt {
struct linux_binfmt *next;
long *use_count;
int (*load_binary)(struct linux_binprm *,
struct pt_regs *);
int (*load_shlib)(int fd);
int (*core_dump)(long signr,
struct pt_regs *);
};
The three functions, or “methods”, declared by the binary format are used to execute a program file, to load a shared library and to create a core file. The next pointer is used by search_binary_handler(), while the use_count pointer keeps track of the usage count of modules. Whenever a process p is executing in the realm of a modularized binary format, the kernel keeps track of *(p->binfmt->use_count) to prevent unexpected removal of the module.
A module, then, uses the following functions to load and unload itself:
extern int register_binfmt(struct linux_binfmt *); extern int unregister_binfmt(struct linux_binfmt *);
The functions receive a single argument instead of the usual pointer,name pair because no file in the /proc directory lists the available binary formats. The typical code for loading and unloading a binary format, therefore, is as simple as the following:
int init_module (void) {
return register_binfmt(&bluff_format);
}
void cleanup_module(void) {
unregister_binfmt(&bluff_format);
}
The previous lines belong to the
bluff module (Binary Loader for an
Ultimately Fallacious Format), whose source is available for public
download from ftp://ftp.linuxjournal.com/pub/lj/listings/issue45/2568.tgz.
The structure representing the binary format can declare as NULL any of the functions it offers; NULL functions are simply ignored by the kernel. The easiest binary format, therefore, looks like the following, which is the one used by the bluff module:
struct linux_binfmt bluff_format = {
NULL, &mod_use_count_, /* next, count */
NULL, NULL, NULL /* bin, lib, core */
};
Yes, bluff is a bluff; you can load and unload it at will, but it does absolutely nothing.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- The Secret Password Is...
- RSS Feeds
- New Products
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




44 min 45 sec ago
47 min 52 sec ago
49 min 12 sec ago
5 hours 13 min ago
7 hours 4 min ago
12 hours 18 min ago
15 hours 29 min ago
17 hours 45 min ago
18 hours 13 min ago
19 hours 11 min ago