Playing with Binary Formats
To turn the theory into sound practice, let's try to expand our bluff into bloom (Binary Loader for Outrageously Ostentatious Modules). The complete source of the new module is distributed together with bluff.
The role of bloom is to display executable images. Give execution permission to your GIF images and load the module, then call your image like it was a command, and xv will display it.
This code is neither particularly original (most of it comes from binfmt_script.c) nor particularly smart (text-only people like me would rather use an ASCII viewer, for instance, and other people prefer a different viewer). I feel this kind of example is quite didactic anyway, and it can be easily run by anyone who can run an X server and has root access to the computer in order to load modules.
The source file is made up of little more than 50 lines and is able to execute GIF, TIFF and the various PBM formats; needless to say, you must give your images execute permissions (chmod +x) in advance. The viewer is configurable at load time and defaults to /usr/X11R6/bin/xv. Here is a sample session copied from my text console:
# insmod bloom.o # ./snowy.tif xv: Can't open display # rmmod bloom # insmod bloom.o viewer="/bin/cat" # ./snowy.tif | wc -c 1067564
If you use the default viewer and work within a graphic session, your image file will bloom on the display.
If you can't wait to download the source file, you can see the interesting part of bloom in Listing 1. Note that bloom.c falls under the GPL, because most of its code is copied from binfmt_script.c.
The next question that I hear you ask is “How can I set up things so that kerneld can automatically load my module?”
Well, actually it isn't always possible. The code in fs/exec.c only tries to use kerneld when at least one of the first four bytes is not printable. This behaviour is meant to avoid losing too much time with kerneld when the file being executed is a text file without the #! line. While real binary formats have one non-printable byte in the first four, this isn't always true for generic data types.
The net result of this behaviour is that you can't automatically load the bloom viewer when invoking a GIF file or when calling a PBM file by name. Both formats begin with a text string and will therefore be ignored by the auto-loader.
When, on the other hand, the file has a non-printing character within the first four, the kernel issues a kerneld request for binfmt-number, where the exact string is generated by this statement:
sprintf(modname, "binfmt-%hd", *(short*)(&bprm->buf));
The ID of the binary format generated by the above statement represents the first two bytes of the disk file. If you try to execute TIFF files, kerneld looks for binfmt-19789 or binfmt-18761. A gzipped file calls for binfmt--29921 (negative). GIF files, on the other hand, are passed to /bin/sh shell due to their leading text string. If you want to know the number associated with each binary format, look in the /usr/lib/magic file and convert the values to decimal. Alternatively, you can pass the debug argument to kerneld and look at its messages when you execute your data files and it tries to load the corresponding binary format.
It's interesting to note that kernel versions 2.1.23 and newer switched to an easier and more significant ID by using the following line:
sprintf(modname, "binfmt-%04x", *(unsigned short *)(&bprm->buf));
This new ID string represents the third and fourth byte of the binary file and is hexadecimal instead of decimal (thus leading to strings with a better format and no ugly “minus-minus” appearing now and then.
While calling images by name can be funny, it has no real role in a computer system. I personally prefer calling my viewer by name, and I do not believe in the object-orientedness of the approach. This kind of feature in my opinion is best suited to the file manager where it can be tailored by appropriate configuration files without introducing kernel bloat to lie in the way of any computational path.
What is really interesting about binary formats is the ability to run program files that don't fall in the handy #! notation. This includes executable files belonging to other operating systems or platforms, as well as interpreted languages that have not been designed for the Unix operating system—all those languages that complain about a #! in the first line.
If you want to play one such game, you can try the fail module. This “Format for Automatically Interpreting Lisp” is a wrapper to invoke Emacs any time a byte-compiled e-lisp program is invoked by name. Such practice is definitely failure-prone, as it makes little sense to invoke several megabytes of program code to run a few lines of lisp. Moreover, Emacs-lisp is not suited to command-line handling. Together with fail you'll also find a pair of sample lisp executables to make your tests.
A real-world Linux system is full of interesting examples of interpreted binary formats such as the Java binary format. Other examples are the binary format that allows the Alpha platform to run Linux-x86 binaries and the one included in recent DOSEMU distributions that is able to run old DOS programs transparently (although the program must be specifically tailored in advance).
Version 2.1.43 of the kernel and newer ones include generic support for interpreted binary formats. binfmt_misc is somewhat like bloom but much more powerful. You can add new interpreted binary formats to the module by writing the relevant information to the file /proc/sys/fs/binfmt_misc.
Listing 1 and all other programs referred to in this article are available by anonymous download in the file ftp.linuxjournal.com/pub/lj/listings/issue45/2568.tgz.
|Designing Electronics with Linux||May 22, 2013|
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
- Designing Electronics with Linux
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Dynamic DNS—an Object Lesson in Problem Solving
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Build a Skype Server for Your Home Phone System
- Validate an E-Mail Address with PHP, the Right Way
- Why Python?
- A Topic for Discussion - Open Source Feature-Richness?
- Tech Tip: Really Simple HTTP Server with Python
1 hour 10 min ago
- Reply to comment | Linux Journal
1 hour 18 min ago
- Understanding the Linux Kernel
3 hours 32 min ago
6 hours 2 min ago
- Kernel Problem
16 hours 5 min ago
- BASH script to log IPs on public web server
20 hours 32 min ago
1 day 8 min ago
- Reply to comment | Linux Journal
1 day 40 min ago
- All the articles you talked
1 day 3 hours ago
- All the articles you talked
1 day 3 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?