Critical Server Needs and the Linux Kernel

A discussion of four of the kernel features needed for mission-critical server environments, including telecom.

Run-Time Authenticity Verification for Binaries

The Distributed Security Infrastructure (DSI) is an open-source project started at Ericsson to provide a secure framework for carrier-grade Linux clusters that run soft real-time distributed applications. Carrier-grade clusters have tight restrictions on performance and response time, which makes designing security solutions difficult. Many security solutions cannot be used because of their high resource consumption. A security framework that specifically targets carrier-grade Linux clusters therefore was needed to provide advanced security levels in such systems.

Linux generally has been considered immune to the spread of viruses, backdoors and Trojan programs on the Internet. However, with the increasing popularity of Linux as a desktop platform, the risk of seeing viruses or Trojans developed for this platform is growing. One way of addressing this potential risk is to allow the system to prevent, at run time, the execution of untrusted software.

One solution is to digitally sign the trusted binaries and have the system check the digital signature of each binary before running it. Untrusted (unsigned) binaries are therefore denied execution. This improves the security of the system by preventing a wide range of malicious binaries from running on it.

Figure 3. bsign's Signature Section as Added in an ELF Binary

Figure 4. DigSig in Action

DigSig, a component of DSI, is one implementation of such a feature. DigSig is a Linux kernel module that checks the signature of a binary before running it. The digital signature is embedded in a section of the ELF binary (Figure 3), and DigSig verifies it before loading the binary. DigSig is based on the Linux security module (LSM) hooks. LSM has been part of the Linux kernel since the 2.5.x series.

Typically, in this approach, vendors do not sign binaries; control of the system remains with the local administrator. The administrator is responsible for signing all binaries she trusts with her private key. DigSig therefore guarantees two things. First, if you signed a binary, no one else can modify that binary without being detected. Second, nobody can run a binary that is unsigned or carries an invalid signature.
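The two guarantees above can be illustrated with a small user-space sketch. It is not DigSig's code and does not use GnuPG or the LSM hooks: it swaps in toy textbook RSA over a SHA-256 hash, with deliberately weak key material built from two known Mersenne primes, so that it runs with only the Python standard library (Python 3.8 or later). The names sign and may_execute are hypothetical and exist only for this illustration.

```python
import hashlib

# Toy key material (assumption for illustration only): two known Mersenne
# primes. Far too small and too structured for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, kept by the administrator


def digest(binary: bytes) -> int:
    """Hash the binary contents and reduce the hash into the RSA modulus."""
    return int.from_bytes(hashlib.sha256(binary).digest(), "big") % N


def sign(binary: bytes) -> int:
    """Administrator side: sign a trusted binary with the private key."""
    return pow(digest(binary), D, N)


def may_execute(binary: bytes, signature) -> bool:
    """Loader side: allow execution only if the signature verifies."""
    if signature is None:           # unsigned binary: always refused
        return False
    return pow(signature, E, N) == digest(binary)


if __name__ == "__main__":
    trusted = b"\x7fELF trusted program bytes"
    sig = sign(trusted)
    print(may_execute(trusted, sig))          # signed and unmodified
    print(may_execute(trusted + b"!", sig))   # modified after signing
    print(may_execute(trusted, None))         # never signed
```

In this sketch only the holder of D can produce a signature that verifies, while the check itself needs only the public values N and E. That mirrors the split described above, where the administrator signs and the system verifies before loading.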

Several initiatives in this domain already have been made, such as Tripwire, bsign and Cryptomark, but we believe the DigSig project is the first to be easily accessible to all--it's available on SourceForge under the GPL license--and to operate at the kernel level at run time. Run time is particularly important for carrier-grade environments, as it takes into account the high availability aspects of the system.

The DigSig approach reuses existing solutions, such as GnuPG and bsign, rather than reinventing the wheel. However, to reduce overhead in the kernel, the DigSig project took only the minimum code necessary from GnuPG. As a result, only one-tenth of the original GnuPG 1.2.2 source code has been imported into the kernel module.

DigSig is a contribution from Ericsson to the Open Source community under the GPL license. DigSig has been announced on LKML; however, it is not yet integrated in the Linux kernel.

An Efficient Low-Level Asynchronous Event Mechanism

Operating systems for carrier-grade systems must be able to deliver a high response rate with minimum downtime. In addition, carrier-grade systems must take into account characteristics such as scalability, high availability and performance.

In carrier-grade systems, thousands of requests must be handled concurrently without affecting the overall system's performance, even under extremely high loads. Subscribers expect some latency when issuing a request, but they are not willing to accept an unbounded response time. Such transactions are not handled instantaneously for many reasons, and a reply can take anywhere from a few milliseconds to several seconds. While an application waits for an answer, its ability to handle other transactions is reduced.

Many different solutions have been proposed and prototyped to improve the Linux kernel capabilities in this area. Most have focused on using different types of software organization, such as multithreaded architectures, implementing efficient POSIX interfaces or improving the scalability of existing kernel routines.

One possible solution appropriate for carrier-grade servers is the asynchronous event mechanism (AEM). AEM provides asynchronous execution of processes in the Linux kernel. It implements native support for asynchronous events in the Linux kernel and aims to bring carrier-grade characteristics to Linux in the areas of scalability, performance and soft real-time responsiveness.

An event-based mechanism provides a new programming model that offers software developers unique and powerful support for asynchronous execution of processes. Of course, it differs radically from the sequential programming styles used, but it offers a design framework better structured for software development. It also simplifies the integration and the interoperability of complex software components. In addition, AEM offers an event-based development framework, scalability, flexibility and extensibility.

The emerging paradigm of AEM provides a simpler and more natural programming style than the complexity of multithreaded architectures. It proves its efficiency in the development of multilayer software architectures, where each layer provides a service to the upper layer. This type of architecture is quite common for distributed applications. One of the strengths of AEM is its ability to combine synchronous and asynchronous code in the same application, or even to mix the two models within the same code routine. With this hybrid approach, it is possible to take advantage of their respective capabilities, depending on the situation. This model is especially favorable for the development of secure software and for the long-term maintenance of mission-critical applications.
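The hybrid style described above, in which one routine mixes synchronous and asynchronous steps, can be sketched in user space with Python's asyncio. This is only an analogy under stated assumptions: it does not use AEM or any kernel interface, and the names validate, backend_lookup, handle, serve and run_demo are hypothetical. The sleep stands in for a slow transaction that would otherwise block the application.

```python
import asyncio


def validate(request: str) -> str:
    """Ordinary synchronous step: runs to completion without yielding."""
    return request.strip().upper()


async def backend_lookup(request: str, delay: float) -> str:
    """Stand-in for a slow operation; the sleep models waiting on I/O."""
    await asyncio.sleep(delay)
    return f"reply to {request}"


async def handle(request: str, delay: float, completed: list) -> str:
    cleaned = validate(request)                    # synchronous code
    reply = await backend_lookup(cleaned, delay)   # asynchronous wait
    completed.append(cleaned)                      # record completion order
    return reply


async def serve(requests):
    """Start every request at once and collect the replies."""
    completed = []
    replies = await asyncio.gather(
        *(handle(name, delay, completed) for name, delay in requests)
    )
    return list(replies), completed


def run_demo():
    # Request A is the slowest, yet it does not hold up B or C.
    requests = [(" a ", 0.06), ("b", 0.02), ("c", 0.04)]
    return asyncio.run(serve(requests))


if __name__ == "__main__":
    replies, completed = run_demo()
    print(replies)     # replies in the order the requests were issued
    print(completed)   # requests in the order they actually finished
```

While one request waits, the event loop runs the others, so the slowest transaction does not delay the rest. The replies come back in issue order, whereas the completed list reflects the order in which the waits actually ended.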

Ericsson released AEM to the Open Source community in February 2003 under the GPL license. AEM was announced on LKML and received a lot of feedback. The feedback suggested changes to the design, which resulted in an improved implementation and a better kernel-compliant code structure. AEM is not yet integrated into the Linux kernel.

______________________

Comments

Re: Critical Server Needs and the Linux Kernel

The next generation Linux Server Based Telecom

Thank you, Ibrahim Haddad

Re: Critical Server Needs and the Linux Kernel

Multi-FIB is already possible, in that you can mimic most of its effects with iptables; the kernel has supported multiple routing tables for some time now.

However, I do question the idea of doing this in the first place. This use case can only arise when two separate customers insist on using overlapping RFC-internal IPv4 address spaces for their servers and you need to put both of them onto one host. Better solutions to this problem exist (remap in the external load balancer, use IPv6, use virtual servers, use separate physical servers, ...).
