System Minimization

Strategies for reducing Linux's footprint, leaving more resources for the application or letting engineers further reduce the hardware cost of the device.

“How small can you make this?” is a question frequently heard by embedded engineers at the start of their projects. Most of the time, the person asking this question is concerned with reducing the RAM and Flash resources with the goal of reducing a device's unit costs or energy requirements.

Because Linux, and the surrounding environment, originally was intended for desktop or server systems, its default configuration isn't optimized for size. However, as Linux is finding itself in more embedded devices, making Linux “small” isn't as daunting a task as it once was. There are several different approaches for reducing the memory footprint of a system.

Many engineers start by reducing the size of the kernel; however, there is lower-hanging fruit at hand. This article goes into detail about how to reduce the size of the kernel, mostly by removing code that won't even be used in a typical embedded system.

A root filesystem (RFS) can be the largest consumer of memory resources in a system. A root filesystem contains the infrastructure code used by an application as well as the C library. Selecting the filesystem used for the RFS itself can have a large effect on the final size. The standard, ext3, is frightfully inefficient on several axes from an embedded engineer's perspective, but that's a topic for another article.

Realistically, How Small?

Even the smallest Linux distribution has at least two parts: a kernel and root filesystem. Sometimes, these components are colocated in the same file, but they're still separate and distinct components. By removing nearly all features from the kernel (networking, error logging and support for most devices) and making the root filesystem just the application, the size of a system easily can be less than 1MB. However, many users choose Linux for the networking and device support, so this isn't a realistic scenario.

Kernel

The Linux kernel is interesting in that although it depends on GCC during compilation time, it has no dependencies at runtime. Those engineers new to Linux confuse the initial RAM disk (so-called initrd) with a kernel runtime dependency. The initrd is mounted first by the kernel, and a program runs that interrogates the system in order to figure out what modules need to be loaded in order to support the devices, so that the “real” root filesystem can be mounted. In fact, the two-step mounting, the initrd followed by the real root filesystem, rarely finds its way into embedded systems as the gain in flexibility in a system that does change isn't worth the additional space or time. But, this topic falls under the rubric of the root filesystem and is discussed later in this article.

Most of the effort in reducing kernel size lies in removing what's not needed. Because the kernel is configured for desktop and server systems, it has many features enabled that wouldn't be used in an embedded system.

Loadable Module Support

Kernel loadable modules are re-locatable code that the kernel links into itself at runtime. The typical use cases for loadable modules are allowing drivers to be loaded into the kernel from user space (typically after some probing process) and allowing the upgrade of device drivers without taking down the system. For most embedded systems, once they're out in the field, changing the root filesystem is either impractical or impossible, so the system's designer links the modules directly into the kernel, removing the need for loadable modules. The space-saving in this area isn't limited to the kernel, however, as the programs managing loadable modules (such as insmod, rmmod and lsmod) and the shell script to load them aren't necessary.

Linux-tiny Patches

The Linux-tiny set of patches has been an on-again-off-again project that originally was spearheaded by Matt Mackall. The Consumer Electronics Linux Forum (CELF) has put effort into reviving the project, and the CELF Developer's Wiki has patches for the 2.6.22.5 kernel (at the time of this writing). In the meantime, many of the changes in the Linux-tiny Project have been included in the mainline kernel. Even if many of the original Linux-tiny patches have made it into the kernel, some substantial space-saving patches haven't, such as:

  1. Fine-grain printk support: users can have control over what files can use printk. This allows engineers to reap the size benefits of excluding printk for the kernel at large while still having access to their favorite debugger in the places where it's needed most.

  2. Change CRC from calculation to use table lookup: Ethernet packets require a CRC to validate the integrity of the packet. This implementation of the CRC algorithm uses table lookups instead of calculations, saving about 2K.

  3. Network tweaking: several patches reduce the supported network protocols, buffer sizes and open sockets. Many embedded devices support only a few protocols and don't need to service thousands of connections.

  4. No panic reporting: if the device has three status lights and a serial connection, the user won't be able to see, much less act on, panic information that appears on a (nonexistent console). If the device has a kernel panic failure, the user simply will power-cycle the device.

  5. Reduction of inlining: an inline is where the compiler, instead of generating a call to a function, treats it as a macro, putting a copy of the code in each place it is called. Although the inline directive is technically a hint, GCC will inline any function by default. By suppressing inline functions, the code runs slightly slower, as the compiler needs to generate code for a call and return; in exchange, however, the object file is smaller.

The Linux-tiny patches are distributed in a tar archive that can be applied with the quilt utility or applied individually.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState