Playing with Binary Formats
In order to implement a binary format that is of some use, the programmer must have some background information about the arguments that are passed to the loading function, i.e., format->load_binary. The first such argument contains a description of the binary file and the parameters, and the second is a pointer to the processor registers.
The second argument is only needed by real binary loaders, like the a.out and ELF formats that are part of the Linux kernel sources. When the kernel replaces an executable file with a new one, it must initialize the registers associated with the current process to a sane state. In particular, the instruction pointer must be set to the address where execution of the new program must begin. The function start_thread is exported by the kernel to ease setting up the instruction pointer. In this article I won't go so deep as to describe real loaders but will limit the discussion to “wrapper” binary formats, similar to binfmt_script and binfmt_java.
The linux_binprm structure, on the other hand, must be used even by simple loaders, so it is worth describing here. The structure contains the following fields:
char buf: This buffer holds the first bytes of the executable image. It is usually looked up by each binary format in order to detect the file type. If you are curious about the known magic numbers used to detect the different file formats, you can look in the text file /usr/lib/magic (sometimes called /etc/magic).
unsigned long page[MAX_ARG_PAGES]: This array holds the addresses of data pages used to carry around the environment and the argument list for the new program. The pages are only allocated when they are used; no memory is wasted when the environment and argument lists are small. The macro MAX_ARG_PAGES is declared in the binfmts.h header and is currently set to 32 (128KB, 256KB on the Alpha). If you get the message “Arg list too long” when trying to run a massive grep, then you need to enlarge MAX_ARG_PAGES.
unsigned long p: This is a “pointer” to data kept in the pages just described. Data is pushed to the pages from high addresses to low ones, and p always points to the beginning of such data. Binary formats can use the pointer to play with the initial arguments that are passed to the program being executed, and I'll show such use in the next section. It's interesting to note that p is a pointer to user-space addresses, and it is expressed as unsigned long to avoid an undesired de-reference of its value. When an address represents generic data (or an offset in the memory “array”) the kernel often considers it a long integer.
struct inode *inode: This inode represents the file being executed.
int e_uid, e_gid: These fields are the effective user and group ID of the process executing the program. If the program is set-uid, these fields represent the new values.
int argc, envc: These values represent the number of arguments passed to the new program and the number of environment variables.
char *filename: This is the full pathname of the program being executed. This string lives in kernel space and is the first argument received by the execve system call. Although the user program won't know its full pathname, the information is available to the binary formats, so they can play games with the argument list.
int dont_iput: This flag can be set by the binary format to tell the upper layer that the inode has already been released by the loader.
The structure also contains other fields that are not relevant to the implementation of simple binary formats. What is relevant, on the other hand, are a pair of functions exported by exec.c. The functions are meant to help the job of simple binary loaders such as the ones I'll introduce in this article.
unsigned long copy_strings(int argc,char ** argv, unsigned long *page, unsigned long p, int from_kmem); void remove_arg_zero(struct linux_binprm *bprm);
The first function is in charge of copying argc strings, from the array argv into the pointer p (a user space pointer, usually bprm->p). The strings will be copied before the address pointed to by p (argument strings grow downwards). The original strings, the ones in argv, can either reside in user space or in kernel space, and the array can be in kernel space even if the strings are stored in user space. The from_kmem argument is used to specify whether the original strings and array are both in user space (0), both in kernel space (2) or the strings are in user space and the array in kernel space (1). remove_arg_zero removes the first argument from bprm by incrementing bprm->p.
Practical Task Scheduling Deployment
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.View Now!
|The Firebird Project's Firebird Relational Database||Jul 29, 2016|
|Stunnel Security for Oracle||Jul 28, 2016|
|SUSE LLC's SUSE Manager||Jul 21, 2016|
|My +1 Sword of Productivity||Jul 20, 2016|
|Non-Linux FOSS: Caffeine!||Jul 19, 2016|
|Murat Yener and Onur Dundar's Expert Android Studio (Wrox)||Jul 18, 2016|
- Stunnel Security for Oracle
- The Firebird Project's Firebird Relational Database
- SUSE LLC's SUSE Manager
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- Managing Linux Using Puppet
- My +1 Sword of Productivity
- Non-Linux FOSS: Caffeine!
- Google's SwiftShader Released
- SuperTuxKart 0.9.2 Released
- Doing for User Space What We Did for Kernel Space
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide