An Event Mechanism for Linux
This article is the first in a series describing a new event-based mechanism for Linux. This particular article focuses on the motivation, requirements and benefits of such a mechanism for carrier-grade Linux.
The work of supporting a native event-based system in the Linux kernel started as a research project in 2001 at the Open Systems Lab (Ericsson Research, Corporate Unit) in Montréal, Canada. The goal was to provide Linux with an event-driven environment that could deliver better performance compared to existing solutions in the context of telecom applications.
There has been a growing interest in bringing the Linux operating system to a carrier-grade level. For example, the OSDL Carrier-Grade Linux working group (www.osdl.org/projects/cgl) currently is drafting a set of requirements to turn Linux into a solid carrier-grade server for Next Generation Networks.
Operating systems for telecom applications must ensure that they can deliver a high response rate with minimum downtime—less than five minutes per year, 99.999% of uptime—including hardware, operating system and software upgrade. In addition to this goal, a carrier-grade system also must take into account such characteristics as scalability, high availability and performance.
For such systems, thousands of requests must be handled concurrently without impacting the overall system's performance, even under high load. Subscribers can expect some latency time when issuing a request, but they are not willing to accept an unbounded response time. Such transactions are not handled instantaneously for many reasons, and it can take some milliseconds or seconds to reply. Waiting for an answer reduces applications' abilities to handle other transactions.
Many different solutions have been envisaged to improve Linux's capabilities using different types of software organization, such as multithreaded architectures, by implementing efficient POSIX interfaces or by improving the scalability of existing kernel routines. We think that none of these solutions are adequate for true carrier-grade servers.
In order to understand our point, this article provides an overview of telecom networks. The purpose is to clarify the requirements for a carrier-grade operating system. After this introduction, we explain the benefits of a native asynchronous event mechanism to better support carrier-grade characteristics.
Telecommunications is concerned with the establishment of telephone calls between two devices and the transport of voice over wire links. Figure 1 shows the traditional Public Switched Telephone Network (PSTN). More specifically, carriers use the term signaling to indicate the establishment of a telephone call between two devices. Signaling in the PSTN is done through the Signaling System 7 (SS7) protocol stack. SS7 performs call routing and builds a path to the destination telephone through the circuits. The two phones are connected once the path is established and the voice can be carried over. The SS7 protocol is able to handle call routing, call forwarding and error conditions.
The flexibility, cost performance and continuous growth of the Internet are driving a migration of many telecom services to it. This helps to establish IP technologies as the new standard for all communications services. These two types of networks are based on different technologies and require the utilization of signaling and media gateways for inter-working purposes.
Gateways perform the translation of information between different types of network. For example, an SS7 gateway is used to encapsulate signaling over the IP network through the Stream Control Transmission Protocol (SCTP). A media gateway is used to encode and decode the voice coming from the PSTN network and going to the IP network and vice versa.
The signaling and the media gateways provide connectivity to the IP network for calls coming from the PSTN network. Signaling gateways must perform protocol encapsulation in order to carry the syntax and semantics of SS7 messages over an internet protocol, such as SCTP over IP or UDP over IP. Signaling servers must be able to scale with respect to their system capabilities as the number of concurrent requests increases.
With a media gateway, operators can implement transport of streaming data between the PSTN and the IP networks. The number of connections they can accept depends on their hardware, which can vary in size from one to thousands of interfaces. Media gateways must support a large number of connections in real time.
Authentication, authorization and accounting (AAA) servers maintain databases of user profiles. Typically, one or two AAA servers may contain information about several millions of subscribers belonging to a particular network operator. It is common to observe peaks of thousands of concurrent authentication requests per second. Such variation in the number of connections is difficult to plan for in advance. AAA servers have a critical role of controlling access to the IP network and are not allowed to fail. They require soft real-time capabilities, so they can reply to most requests within a few milliseconds.
Media servers provide specialized resources and services to end users, such as video conferences, video servers, applications and e-mail. An important aspect of these carrier-class systems is scalability. These platforms can accept a linear increase of the number of transactions with respect to the number of processors, interfaces or bandwidth. Telecom operators speak of linear scalability, meaning the cost per transaction or per user should not increase when scaling up a server.
In case of failure or unplanned interruption, carrier-grade platforms can recover automatically or failover to another server through network redundancy procedures. Live software upgrades and hot swap of hardware devices also are part of the 99.999% service availability.
Linux has proven to be stable and consistent over the years, and it is already an attractive option for carriers. In order to become a key element of telecom networks, however, it must be enhanced with components providing these much-needed carrier-grade capabilities.
- Geek Guide: The DevOps Toolbox
- Download "The DevOps Toolbox: Tools and Technologies for Scale and Reliability"
- Nmap—Not Just for Evil!
- Resurrecting the Armadillo
- High-Availability Storage with HA-LVM
- March 2015 Issue of Linux Journal: System Administration
- Real-Time Rogue Wireless Access Point Detection with the Raspberry Pi
- DNSMasq, the Pint-Sized Super Dæmon!
- Days Between Dates: the Counting
- Localhost DNS Cache