Playing with Binary Formats
One of the roles that kernel modules can accomplish is adding new binary formats to a running system. A “binary format” is basically a data structure responsible for executing program files—the ones marked with execute permission. The code I'm going to introduce runs with version 2.0 of the kernel.
Kernel modules are meant to add new capabilities to a Linux system, device drivers being the best known such “capabilities”. As a matter of fact, the highly modular design of the Linux kernel allows run-time insertion of many features other than device drivers—we saw a few months ago how /proc files and sysctl entry points can be created by modularized code.
One other such loadable feature is the ability to execute a binary format; this includes both executable files and shared libraries. While the mechanism of loading compiled program files and shared libraries is quite elaborate, the average Linux user can easily add loaders that invoke an interpreter for new binary formats. Thus, the user is able to call data files by name and have them “executed”, after invoking chmod +x on the file.
Let's start this discussion by looking at how the exec system call is implemented in Linux. This is an interesting part of the kernel, as the ability to execute programs is at the basis of system operations.
The entry point of exec lives in the architecture-dependent tree of the source files, but all the interesing code is part of fs/exec.c (all pathnames here refer to /usr/src/linux/ or the location of your sources). To check architecture-specific details, locate the function by typing the command:
arch/*/kernel/*.c
Within fs/exec.c the toplevel function, do_execve(), is less than fifty lines of code in length. Its role is checking for errors, filling the “binary parameter” structure (struct linux_binprm) and looking for a binary handler. The last step is performed by search_binary_handler(), another function in the same file. The magic of do_execve() is contained in this last function which is very short. Its job consists of scanning a list of registered binary formats, passing the binprm structure to all of them until one succeeds. If no handler is able to deal with the executable file, the kernel tries to load a new handler via kerneld and scans the list once again. If no binary format is able to run the executable file, the system call returns the ENOEXEC error code (“Exec format error”).
The main problem with this kind of implementation is in keeping Linux compatible with the standard Unix behaviour. That is, any executable text file that begins with #! must be executed by the interpreter it asks for, and any other executable text is run by /bin/sh. The former issue is easily dealt with by a binary format specialized in running interpreter files (fs/binfmt_script.c), and the interpreter itself is run by calling search_binary_handler() once again. This function is designed to be reentrant, and binfmt_script checks against double invocation. The latter issue is mainly an historical relic and is simply ignored by the kernel. The program trying to execute the file takes care of it. Such a program is usually your shell or make. It's interesting to note that while recent versions of gmake execute properly when a script has no leading #! line, previous versions didn't call a shell resulting in a “cannot execute binary file” message when running unadorned scripts from within a Makefile.
All the magic handling of data structures needed to replace the old executable image with the new one is performed by the specific binary loader, based on utility functions exported by the kernel. If you would like to take a look at such code, the function load_out_binary() in fs/binfmt_aout.c is a good place to start—easier than the ELF loader.
The implementation of exec is interesting code, but Linux has more to offer: registration of new binary formats at run time. The implementation is quite straightforward, although it involves mucking with rather elaborate data structures—either the code or the data structures must accomodate the underlying complexities; elaborate data structures offer more flexibility than elaborate code.
The core of a binary format is represented in the kernel by a structure called struct<\!s>linux_binfmt, which is declared in the linux/binfmts.h file as follows:
struct linux_binfmt {
struct linux_binfmt *next;
long *use_count;
int (*load_binary)(struct linux_binprm *,
struct pt_regs *);
int (*load_shlib)(int fd);
int (*core_dump)(long signr,
struct pt_regs *);
};
The three functions, or “methods”, declared by the binary format are used to execute a program file, to load a shared library and to create a core file. The next pointer is used by search_binary_handler(), while the use_count pointer keeps track of the usage count of modules. Whenever a process p is executing in the realm of a modularized binary format, the kernel keeps track of *(p->binfmt->use_count) to prevent unexpected removal of the module.
A module, then, uses the following functions to load and unload itself:
extern int register_binfmt(struct linux_binfmt *); extern int unregister_binfmt(struct linux_binfmt *);
The functions receive a single argument instead of the usual pointer,name pair because no file in the /proc directory lists the available binary formats. The typical code for loading and unloading a binary format, therefore, is as simple as the following:
int init_module (void) {
return register_binfmt(&bluff_format);
}
void cleanup_module(void) {
unregister_binfmt(&bluff_format);
}
The previous lines belong to the
bluff module (Binary Loader for an
Ultimately Fallacious Format), whose source is available for public
download from ftp://ftp.linuxjournal.com/pub/lj/listings/issue45/2568.tgz.
The structure representing the binary format can declare as NULL any of the functions it offers; NULL functions are simply ignored by the kernel. The easiest binary format, therefore, looks like the following, which is the one used by the bluff module:
struct linux_binfmt bluff_format = {
NULL, &mod_use_count_, /* next, count */
NULL, NULL, NULL /* bin, lib, core */
};
Yes, bluff is a bluff; you can load and unload it at will, but it does absolutely nothing.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- RSS Feeds
- New Products
- Trying to Tame the Tablet
- What's the tweeting protocol?
- Dart: a New Web Programming Experience
Enter to Win an Adafruit Prototyping Pi Plate Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Prototyping Pi Plate Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- Next winner announced on 5-21-13!
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.




18 min 22 sec ago
4 hours 57 min ago
7 hours 20 min ago
1 day 8 min ago
1 day 2 hours ago
1 day 3 hours ago
1 day 4 hours ago
1 day 4 hours ago
1 day 9 hours ago
1 day 10 hours ago