Key Considerations for Software Updates for Embedded Linux and IoT

Image-based updaters are generally preferred in the embedded space. The main reason is that they typically provide atomicity during the update process. Atomicity means that 1) an update is always either fully applied or not applied at all, and 2) no component other than the software updater can ever see a partial update. This property is very important for embedded updaters, because embedded devices can lose power or be rebooted at any time, and losing power in the middle of an update should not leave the device bricked or unusable. The other key advantage of image-based updaters is consistency across devices: because every device receives a 1:1 copy of the same software, you can rely with more confidence on behavior in test environments matching behavior in production.
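To make the atomicity property concrete, here is a minimal Python sketch of one common way to commit an update atomically: the new image is written to a slot that is not in use, and a small marker file is switched with a single atomic rename only after the data has been flushed to storage. The directory layout, the slot names "a" and "b", and the active_slot marker file are hypothetical and chosen only for illustration; real image-based updaters typically switch partitions through the bootloader rather than a marker file.

```python
import os

SLOTS = ("a", "b")  # hypothetical slot names


def read_active_slot(root):
    """Return the slot named in the marker file, defaulting to 'a'."""
    try:
        with open(os.path.join(root, "active_slot")) as f:
            return f.read().strip()
    except FileNotFoundError:
        return SLOTS[0]


def write_file_durably(path, data):
    """Write data and flush it to storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def install_update(root, image):
    """Write image to the inactive slot, then switch slots atomically."""
    active = read_active_slot(root)
    inactive = SLOTS[1] if active == SLOTS[0] else SLOTS[0]

    # Step 1: write the new image to the slot that is NOT in use.
    # A power loss here leaves the active slot untouched.
    write_file_durably(os.path.join(root, "slot_" + inactive + ".img"), image)

    # Step 2: commit by replacing the marker file in one atomic rename.
    # Readers see either the old marker or the new one, never a mix.
    tmp = os.path.join(root, "active_slot.tmp")
    write_file_durably(tmp, inactive.encode())
    os.replace(tmp, os.path.join(root, "active_slot"))
    return inactive
```

If power is lost before the rename, the device still boots the old slot; if it is lost after, the device boots the new one. A production implementation would also need to sync the containing directory and verify the written image before committing.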

Package-based approaches generally cannot implement atomic updates, but they have some advantages as well. Installing an update takes less time, and the bandwidth used can be smaller than for image-based updates. Finally, since many teams develop a homegrown updater and already have packages from their build system, package-based update systems are generally faster to develop from scratch.

The Embedded Environment

People familiar with Linux desktop and server systems might ask why we do not simply use the tools and processes we know from those systems, including package managers (such as rpm and dpkg), VMs and containers, to carry out software updates. To understand this, it is important to see how an embedded device differs with regard to applying software updates.

Unreliable Power

I already touched on this property of an embedded system, and it is a widely known issue: an embedded device can, in general, lose power at any time. For example, a smart portable audio system can be unplugged as it is moved around in the house. The battery of a portable GPS could run out or become unreliable in certain weather conditions. In-vehicle infotainment systems in cars can lose power intermittently as the car is started or stopped. This issue is amplified in battery-powered devices, because the update process itself can consume significant battery charge and cause the power to run out.

Embedded systems must be designed so that they tolerate power failure at any given time. This lack of power reliability is in strong contrast to typical data-center environments, where multiple redundant power systems are in place to ensure that power is not lost.

Unreliable Network

Embedded devices typically are connected using some kind of wireless technology. Although Wi-Fi is used in some devices, it is more common to use wireless standards that have longer range but lower data rates, for example 3G, LoRa, Sigfox and protocols based on IEEE 802.15.4 (low-rate wireless personal area networks).

It is tempting to assume that high-speed wireless networks will be generally adopted by embedded devices as technology evolves, just as happened with smartphones, where you can now stream YouTube videos in high resolution. However, keep in mind that the use cases for smartphones and typical embedded devices will always be very different. For example, an agricultural device that measures and optimizes crop yield needs long connectivity range and should work even in places where there is no 3G coverage. In addition, the amount of data that needs to be sent is very low, perhaps just a few data points per day on the temperature and moisture of the soil. So one should rather assume that embedded devices, especially industrial ones, will always have a limited network data rate.
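A rough calculation shows why a limited data rate matters for the size of an update. The Python sketch below estimates the ideal transfer time of an update over a given link. The update sizes and link rates are illustrative assumptions, not measurements of any particular network, and protocol overhead and retransmissions are ignored.

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Ideal transfer time: payload bits divided by link rate."""
    return size_bytes * 8 / rate_bits_per_second


# Hypothetical sizes and rates, chosen only for illustration.
full_image = 200 * 1000 * 1000   # 200 MB root filesystem image
small_update = 2 * 1000 * 1000   # 2 MB package-sized update
slow_link = 50 * 1000            # 50 kbit/s
fast_link = 10 * 1000 * 1000     # 10 Mbit/s

for name, size in (("full image", full_image), ("small update", small_update)):
    for link, rate in (("slow", slow_link), ("fast", fast_link)):
        hours = transfer_seconds(size, rate) / 3600
        print(f"{name} over {link} link: {hours:.2f} h")
```

Under these assumed numbers, the full image needs roughly nine hours of uninterrupted transfer on the slow link, while the small update needs a few minutes. The real figures depend entirely on the device and the network.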

In addition, wireless networks have frequent and intermittent connectivity loss—for example, when the device is moved to an area with low coverage, like underground.
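One way an updater can cope with intermittent connectivity is to resume a download from the last byte received instead of starting over. The following Python sketch illustrates the idea under stated assumptions: fetch(offset, length) is a hypothetical callable standing in for whatever transport the device uses, and it is assumed to raise ConnectionError when the link drops. The retry count and backoff delay are arbitrary defaults.

```python
import time


def download_with_resume(fetch, total_size, chunk_size=4096,
                         max_retries=5, base_delay=1.0):
    """Download total_size bytes, resuming from the last good offset.

    fetch(offset, length) is a hypothetical callable that returns up to
    `length` bytes starting at `offset`, or raises ConnectionError.
    """
    data = bytearray()
    retries = 0
    while len(data) < total_size:
        want = min(chunk_size, total_size - len(data))
        try:
            chunk = fetch(len(data), want)
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            # Exponential backoff before resuming at the same offset.
            time.sleep(base_delay * 2 ** (retries - 1))
            continue
        if not chunk:
            raise ConnectionError("source returned no data")
        data.extend(chunk)
        retries = 0  # progress was made, so reset the retry budget
    return bytes(data)
```

This sketch keeps the partial download in memory for brevity. On a real device the received bytes would be written to persistent storage, so that a download can also resume after a reboot or power loss.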

Although low data rate and intermittent connectivity can be difficult design issues, they usually are easy to identify once something is not right.

Security issues over public wireless networks are much more implicit and difficult to expose. In the context of software updates, there are countless examples of homegrown updaters that do not properly authenticate the update, allowing an attacker to inject malicious code while the update is taking place.
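As a minimal illustration of authenticating an update before installing it, the Python sketch below checks a keyed hash (HMAC-SHA256) over the update image using only the standard library. This is a simplified stand-in: it assumes a secret key shared between vendor and device, whereas deployed updaters more commonly verify a public-key signature so that no signing secret has to be stored on the device. The function names and the key are hypothetical.

```python
import hashlib
import hmac


def sign_update(key, image):
    """Return an HMAC-SHA256 tag for the image (done by the vendor)."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify_update(key, image, tag):
    """Return True only if the tag matches the image (done on the device)."""
    expected = hmac.new(key, image, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, tag)


# Usage: reject any image whose tag does not verify.
key = b"example-shared-secret"      # hypothetical key, for illustration only
image = b"firmware image bytes"
tag = sign_update(key, image)
print(verify_update(key, image, tag))                 # genuine image
print(verify_update(key, image + b"injected", tag))   # tampered image
```

The important design point is the order of operations: the device verifies the complete update first and installs it only if verification succeeds.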

Expensive Physical Access

Once a large-scale issue that cannot be fixed remotely occurs, the cost of remediating it is typically very high. The reason is that embedded devices are typically widely distributed geographically.

For example, a manufacturer of smart energy grid devices can install these devices in thousands of homes in several countries. If there is a critical issue with an update to the Linux kernel that cannot be fixed remotely, the cost of either sending a service technician to all those homes or asking customers to send devices back to the vendor can be prohibitive.

The 2015 Fiat-Chrysler Jeep Cherokee security breach offers a recent real-world example of a wide-scale recall. In this case, 1.4 million cars were recalled. The cost of repairing this issue was certainly in the hundred-million-dollar range, perhaps even billions.

Five-to-Ten-Year Device Lifetime

Technology moves very fast, and it's typical to replace common consumer electronics devices like smartphones and laptops every two to three years.

However, more expensive consumer devices like high-end audio systems and TVs are replaced less frequently. Industrial devices that do not directly interact with humans typically have even longer lifetimes. For example, robots used on factory floors or energy grid devices can easily reach a ten-year lifetime.

In conclusion, in the embedded environment, people need to be very wary of the risk of "bricking" devices. Not only can this easily happen due to the power and network properties, but it is also a very expensive situation from which to recover.

______________________

Eystein Stenberg has more than seven years of experience in security and systems management software and has spoken at various conferences. You can reach him at eystein@mender.io.