System Minimization

Strategies for reducing Linux's footprint, leaving more resources for the application or letting engineers further reduce the hardware cost of the device.

Root Filesystem

For many embedded engineers new to Linux, the notion of a root filesystem on an embedded device is a foreign concept. Embedded solutions before Linux worked by linking the application code directly into the kernel. Because Linux has a well-defined separation between the kernel and root filesystem, the work on minimizing the system doesn't end with making the kernel small. Before optimization, the size of the root filesystem dwarfs that of the kernel; however, in the Linux tradition, this part of the system has many knobs to turn to reduce the size of this component.

The first question to answer is “Do I need a root filesystem at all?” In short, yes. At the end of its startup process, the kernel looks for a root filesystem, mounts it and runs the first process (usually init; running ps aux | head -2 will tell you what it is on your system). In the absence of either the root filesystem or the initial program, the kernel panics and stops running.

The smallest root filesystem can be one file: the application for the device. In this case, the init kernel parameter points to a file and that is the first (and only) process in userland. So long as that process is running, the system will work just fine. However, if the program exits for any reason, the kernel will panic, stop running, and the device will require a reboot. For that reason alone, even the most space-constrained systems opt for an init program. For a very small overhead, init includes the code to respawn a process that dies, preventing a kernel panic in the event of an application crash.
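The respawn behavior described above can be sketched in a few lines of shell. This is only an illustration, not how any particular init program is implemented. It assumes the root filesystem contains a POSIX shell; the function name respawn, the RESPAWN_DELAY variable and the /usr/bin/myapp path are hypothetical.

```shell
#!/bin/sh
# Hypothetical respawn loop: rerun a command every time it exits.
# Usage: respawn MAX_RUNS COMMAND [ARGS...]
# MAX_RUNS=0 means "forever", which is what a process running as PID 1 needs.

respawn() {
    max_runs=$1
    shift
    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        status=0
        "$@" || status=$?
        runs=$((runs + 1))
        echo "respawn: $1 exited with status $status (run $runs)" >&2
        # Pause so a program that dies at once does not spin the CPU.
        sleep "${RESPAWN_DELAY:-1}"
    done
}

# On a device this script would end with something like:
#   respawn 0 /usr/bin/myapp
```

Because the loop never returns when MAX_RUNS is 0, the first process stays alive even if the application crashes, which is the property that prevents the kernel panic.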

Most Linux systems are more complex, including several executable files and, frequently, shared libraries containing code shared by applications running on the device. For these systems, several options exist to reduce the size of the root filesystem (RFS) greatly.
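Before turning any of those knobs, it helps to know where the bytes are. The following sketch assumes the root filesystem is staged in a directory on the build host; the function name largest_files and the ./rootfs path are illustrative and not taken from any particular build system.

```shell
#!/bin/sh
# List the largest regular files under a staged root filesystem.
# Usage: largest_files DIR [COUNT]   (COUNT defaults to 10)

largest_files() {
    root=$1
    count=${2:-10}
    # wc -c prints "<bytes> <path>" for each file; sort numerically, biggest first.
    find "$root" -xdev -type f -exec wc -c {} \; | sort -rn | head -n "$count"
}

# Example (hypothetical staging directory):
#   largest_files ./rootfs 5
```

On a typical system, the C library and other shared libraries tend to appear near the top of such a list, which is why they are natural first candidates for trimming.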

Change the C Library

Because it is so closely paired with GCC, most users don't think of the C library as a separate entity. The C language contains only 32 keywords (give or take a few), so most of the bytes in a C program are those from the standard library. The canonical C library, glibc, has been designed for compatibility, internationalization and platform support rather than size. However, several alternatives exist that have been engineered from inception to be small:

  • uClibc: this project started as an implementation of the C library for processors without a memory management unit (MMU-less). uClibc was designed from the beginning to be small while supplying much the same functionality as glibc; it saves space by dropping features such as internationalization, wide character support and binary compatibility. Furthermore, uClibc's configuration utility gives users great freedom in selecting what code goes into the library, allowing users to reduce the size further.

  • uClibc++: for those using C++, this library is implemented under the same design principles. With support for most of the C++ standard library, engineers easily can deploy C++-based applications onboard with only a few megabytes.

  • Newlib: Newlib grew out of Red Hat's foray into the embedded market. Newlib has a very complete implementation of the math library and therefore finds favor with users doing control or measurement applications.

  • dietlibc: still the smallest of the bunch, dietlibc is the best kept secret among replacements for glibc. Extremely small, 70K small in fact, dietlibc manages to be small by dropping features, such as dynamically linked libraries. It has excellent support for ARM and MIPS.

Using an Alternate C Library

Both Newlib and dietlibc work by providing a wrapper script that invokes the compiler with the proper set of parameters to ignore the regular C libraries included with the compiler and instead use the ones specified. uClibc is a little different: it requires that the toolchain be built from source, and the buildroot project supplies the tools to do the job.

Once you know how to invoke GCC so it uses the right C library, the next step is updating the makefiles or build scripts for the project. In most cases, the build for the project resides in a makefile with a line that looks like this:

CC=$(CROSS_COMPILE)gcc

In this case, all the user needs to do is run make and override the CC variable from the command line:

make CC="diet gcc"

This results in the makefile invoking the diet wrapper around the C compiler. Although it's tempting, don't add compiler options to this macro; instead, use the CFLAGS variable. For example:

make CC="diet gcc" CFLAGS="-Os"
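To see what a library swap buys, the same override can be scripted so that each build reports the size of the resulting binary. This is a rough sketch under several assumptions: the project's makefile honors CC and CFLAGS and has a clean target, and the function name build_with and the target name app are hypothetical.

```shell
#!/bin/sh
# Rebuild a target with a given compiler command and report its size.
# Usage: build_with "COMPILER" "CFLAGS" TARGET

build_with() {
    cc=$1
    cflags=$2
    target=$3
    # Start from a clean tree so objects from the previous compiler are not reused.
    make clean >/dev/null 2>&1 || true
    # Compiler options travel in CFLAGS, not in CC.
    make CC="$cc" CFLAGS="$cflags" "$target" >/dev/null || return 1
    bytes=$(wc -c < "$target" | tr -d ' ')
    printf '%s: %s bytes\n' "$cc" "$bytes"
}

# Example (hypothetical target name):
#   build_with "gcc" "-Os" app
#   build_with "diet gcc" "-Os" app
```

Running it once per toolchain gives a like-for-like comparison, since the optimization flags stay identical and only the compiler command changes.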
