Live Patching Requires Reproducible Builds – and Containers Are the Answer


We know that live patching has real benefits because it significantly reduces the downtime associated with frequent patching. But live patching is relatively difficult to achieve without causing other problems, and for that reason it is not implemented as frequently as it could be. After all, the last thing sysadmins want is a live patch that crashes a system.

Reproducible builds are one of the tools that can help developers to implement live patching consistently and safely. In this article, I explain why reproducible builds matter for live patching, what exactly reproducible builds are, and how containers are coming to the rescue.

Live patching: a key threat management tool

Patching is a critical part of systems maintenance because patching fixes faulty and buggy code. More importantly, security teams rely on patching to plug security holes, and there is a real urgency to it. Waiting for a convenient maintenance window to patch is risky because it leaves an opportunity for hackers to take advantage of an exploit.

This creates a difficult conundrum: maintain high availability but run a security risk, or patch frequently but end up with frustrated stakeholders. Live patching bridges that gap. With live patching, the offending code is swapped out while a process is actively running, without restarting the application or service that depends on that process.

Implementing live patching isn’t easy

Live patching is not that straightforward to accomplish – the drop-in code must “fit” in a like-for-like manner, or all sorts of unwanted things can happen. Get it wrong, and the application – or entire server – will crash.

The code behind a running process usually comes from a binary executable file – a machine-readable block of code compiled from source code. A kernel, for example, has thousands of source files all compiled into a few binaries.

With live patching, the live patch code must fit in at an exact level. Yes, the binary file containing the patch code will be different from the binary file containing the bad code. Nonetheless, the new code must slot into place precisely and must depend on the same versions of imported libraries. The live patch code must also be compiled using the same compiler options and flags. Byte order (endianness) matters too: the data in the binary file must be laid out in exactly the same way.
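To make the "fit" requirement more concrete, here is a minimal Python sketch that compares a record of the original build environment with the environment a patch is about to be built in. The `BuildEnv` record, its fields, and the `libexample` library name are hypothetical and chosen only for illustration; a real live patching toolchain would track far more detail than this.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class BuildEnv:
    """Hypothetical record of the facts a patch build has to match."""
    compiler: str                                   # compiler name and version
    flags: tuple                                    # compiler options, order preserved
    libraries: dict = field(default_factory=dict)   # library name -> version
    byteorder: str = sys.byteorder                  # "little" or "big"


def mismatches(original, candidate):
    """Return a list of human-readable differences; an empty list means a match."""
    problems = []
    if original.compiler != candidate.compiler:
        problems.append(f"compiler: {original.compiler} != {candidate.compiler}")
    if original.flags != candidate.flags:
        problems.append(f"flags: {original.flags} != {candidate.flags}")
    if original.byteorder != candidate.byteorder:
        problems.append(f"byteorder: {original.byteorder} != {candidate.byteorder}")
    # Check every library seen on either side, so missing ones are reported too.
    for name in sorted(set(original.libraries) | set(candidate.libraries)):
        want = original.libraries.get(name)
        got = candidate.libraries.get(name)
        if want != got:
            problems.append(f"library {name}: {want} != {got}")
    return problems


if __name__ == "__main__":
    original = BuildEnv("gcc 11.4.0", ("-O2", "-g"), {"libexample": "1.2.3"}, "little")
    candidate = BuildEnv("gcc 11.4.0", ("-O2", "-g"), {"libexample": "1.2.4"}, "little")
    for problem in mismatches(original, candidate):
        print(problem)
```

In this sketch a single differing library version is enough to flag the candidate environment as unsuitable for building the patch.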

In principle, all this is achievable – but in practice, it is a challenge. For example, day-to-day system updates often impact libraries. These libraries could be slightly different, in turn producing binaries that are slightly different when compiling code.

If a patch is created on a system where there is a minute difference in just one library, that patch can easily crash the process it is applied to, or the entire system if it’s a kernel patch.

Why reproducible builds matter for live patching

In other words, the environment in which a patch is developed really matters. When a patch is developed, it’s important to recreate the environment in which the code being patched was built. Without any instructions to go by, this can be quite challenging.

For open-source teams that support code, it can be more straightforward as these teams should have records noting the original environmental state, and the same goes for software vendors. But sometimes the developers behind software might not release a patch promptly.

Where organizations or other third parties try to create patches internally, replicating the original build environment can be incredibly challenging, time-consuming and unpredictable, requiring a degree of trial and error.

That’s why software developers have strived to develop a way to more easily recreate the original build environment – and why something called a reproducible build has emerged as a way forward.

What, then, is a reproducible build?

In simple language, a build is reproducible when two independent software developers can each compile the same human-readable source code into machine-readable code and end up with binary files that are bitwise identical.
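Checking that property is straightforward in principle. The sketch below, in Python, compares two build outputs by their SHA-256 digests, which is also the form in which a result can be shared between the two developers. The function names are my own, and treating equal digests as "bitwise identical" is an assumption that holds for all practical purposes.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bitwise_identical(first_path, second_path):
    """Treat two files as identical when their SHA-256 digests match."""
    return sha256_of(first_path) == sha256_of(second_path)
```

Each developer can run `sha256_of` on their own output and exchange only the resulting digest, so neither has to ship the binary to the other.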

Over at Reproducible Builds the authors suggest three key criteria for a reproducible build. First, the build process must be deterministic: values that vary between runs, such as dates or times, must not be recorded, and the output must always be written in the same order, for example the sequence in which files are added to an archive.

Next, the tools required to reproduce the build environment must be supplied or described in exacting detail. Finally, developers need to be able to validate the build that they produce.
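The determinism criterion is the easiest of the three to illustrate. The following Python sketch packs a set of files into a tar archive with a fixed entry order and no timestamps or user details, so that two runs over the same content yield the same bytes. It stands in for one packaging step only, the file names are made up, and a real build has many more sources of variation to control.

```python
import io
import tarfile


def deterministic_tar(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive with stable bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):           # fixed order, whatever the input order was
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0                   # do not record when the build ran
            info.uid = info.gid = 0          # do not record who ran the build
            info.uname = info.gname = ""
            info.mode = 0o644
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


if __name__ == "__main__":
    first = deterministic_tar({"b.o": b"object two", "a.o": b"object one"})
    second = deterministic_tar({"a.o": b"object one", "b.o": b"object two"})
    print(first == second)
```

Validation then comes down to comparing the two results byte for byte, which is what the final comparison does.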

Arguably one of the most important elements is the ability to reproduce the build environment – but it is also one of the most challenging to achieve given the many shifting factors involved in a build environment.

Reproducible builds are not a brand-new concept. The GNU project has been using reproducible builds since the 1990s, but the utility of reproducible builds is perhaps greater than ever before.

Get it right and your reproducible build ensures that you can generate a patch that slots into a live process without breaking anything. You can also rely on the reproducible build process to ensure that the binary code you are using has not been tampered with, because a direct path from the source to the output binary file can be verified.

Containers: the Columbus Egg of live patching…

I’ve explained why reproducible builds are a terrific concept, but also why reproducible builds can be challenging to achieve. One route that makes it easier to consistently reproduce the build environment is making use of virtual machines. Developers essentially package up and distribute the entire build environment in a virtual machine.

However, virtual machines are cumbersome to manage – the VM file is large and the entire OS needs to be started up to accommodate a change request, or to see “inside” the VM to analyze the build environment. When there are multiple teams in the mix, VMs can be a real hassle.

Containers are a much lighter, more agile way to emulate the original build environment. Unlike VMs, containers rely on OS-level virtualization, which means that container images are much more compact.

Take Docker, for example, a popular product that enables containerization. When developers want to emulate the original build environment, they simply download a configuration file for the container, called a Dockerfile. This configuration file contains a set of instructions that reproduces the exact build environment.

Rather than managing large VM files, any changes to the build environment are simply written into the Dockerfile, which is stored next to all the other code that is required to reproduce a build. Containers enable developers to sidestep many of the difficulties imposed by reproducible builds while also delivering a build process that can be audited.
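As a rough illustration of keeping the build environment in a small text file, the Python sketch below renders Dockerfile text from a pinned toolchain record. Everything in the record is hypothetical: the base image digest is a placeholder, the package names and versions are invented, and the sketch assumes a Debian-based image where `apt-get install name=version` pins a package version.

```python
import json


def render_dockerfile(base_image, packages, build_command):
    """Render Dockerfile text from a pinned toolchain record (all inputs are illustrative)."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Sort the pins so the same record always renders the same text.
        pins = " ".join(f"{name}={version}" for name, version in sorted(packages.items()))
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends " + pins
        )
    lines.append("WORKDIR /src")
    lines.append("COPY . /src")
    lines.append("CMD " + json.dumps(build_command))   # exec form: a JSON array
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(
        "debian@sha256:<digest>",                    # placeholder, not a real digest
        {"gcc": "<version>", "make": "<version>"},   # placeholder version pins
        ["make", "all"],
    ))
```

Because the output is plain text, a change to the toolchain shows up as an ordinary diff in version control, which is what makes the build process auditable.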

Live patching as an everyday practice

The ability to reproduce identical binary packages – reproducible builds – is critical to making live patching work. Yet reproducible builds have been challenging to achieve, which in turn limits the availability of live patching.

Containers make the job of producing patches that are fit for live patching much, much easier. This opens the door for enterprise users to develop patches knowing that these patches will slot into place live – without causing any disruption.

It means that live patching can be deployed more widely, saving scores of organizations from the disruptive effects of patching downtime and, as a consequence, improving their overall security posture, because patching can now be done as fast and as frequently as needed.

Joao Correia is a Technical Evangelist at TuxCare with a long background in System Administration, where he learned the intricacies of keeping enterprise stakeholders happy and systems protected. Now he shares his views on security, open-source and IT best practices on the TuxCare blog at tuxcare.com, where he covers at length the risks and benefits of open source solutions for secure Enterprise IT operations.
