Linux on Carrier Grade Web Servers
ARIES (Advanced Research on Internet E-Servers) is a project that started at Ericsson Research Canada in January 2000. It aimed at finding and prototyping the necessary technology to prove the feasibility of a clustered internet server that demonstrates telecom-grade characteristics using Linux and open-source software as the base technology. These characteristics feature guaranteed continuous availability, guaranteed response time, high scalability and high performance.
Traffic distribution was one of the main research topics because we needed a software-based solution that was very reliable, high performance and yet very scalable to distribute web traffic among multiple CPUs within the cluster. The first activity we conducted was to survey the Open Source community and see what solutions were already implemented, try them out on our experimental Linux cluster and see which solution best met our requirements (or part of them).
This article covers our experience with the Linux Virtual Server (LVS), a software package that provides traffic distribution on top of Linux. We will explain the architecture, operation and algorithms of LVS and describe our test system and benchmarking environment as well as the experiments we conducted to test LVS's robustness and scalability.
The traffic on the Internet is growing at over 100% every six months. Thus, the workload on the servers is increasing rapidly, and these servers are easily overloaded.
To overcome this problem there are several solutions. The most interesting one is the multiserver solution that consists of building a scalable server on a cluster of servers. When the load increases, more servers can be added into the cluster to meet the increasing requests. The multiserver architecture provides good flexibility (adding and removing nodes) and availability since there is no downtime associated with it. A second advantage is that the real servers do not need to be homogeneous, which allows for the recycling of some old workstations. A third advantage is transparency: the cluster is presented to the users as a single IP that maps requests to multiple servers at the back end.
The Linux Virtual Server follows the multiserver model and provides a software-based solution for traffic distribution as opposed to hardware-based solutions.
Our research in the ARIES Project was directed toward a multiserver architecture that is able to achieve linear scalability reflected through a continuous growth to meet increasing demands, continuous service availability achieved through building redundancy at all levels of the architecture and ease and completeness of management without affecting the uptime of the system.
We were naturally interested in LVS as an HTTP traffic distribution solution because the architecture of the cluster will be transparent to end users, and thus the whole system would appear with a single IP address. Furthermore, LVS claims to be highly available by detecting node or dæmon failures and reconfiguring the system appropriately so that the workload can be taken over by the remaining nodes in the cluster. This is a very important feature for systems geared toward high availability.
The LVS Project is an open-source project to cluster many servers together into a highly available and high-performance virtual server that provides good scalability, reliability and serviceability. The LVS director provides IP-level load balancing to make parallel services of the cluster appear as a virtual service on a single IP address.
The LVS Project is currently cooperating with the High Availability Linux Project that aims at providing a high-availability (clustering) solution for Linux, which promotes reliability, availability and serviceability (RAS) through a community-development effort.
In an LVS setup, the real servers may be interconnected by a high-speed LAN or by geographically dispersed WAN. On the other hand, the front end of the real servers, called director, is a load balancer that distributes requests to the different servers. All requests are sent to the front-end director with the virtual IP address, and the cluster appears as a virtual service on a single IP address.
This architecture is flexible because it allows transparently adding or removing real server nodes, and it is geared toward high availability by automatically detecting nodes or dæmons failures and reconfiguring the system appropriately. For added availability, we can setup a second director as a hot swap for the primary director, thus eliminating a possible single point of failure.
LVS is implemented in three IP load-balancing techniques. One is virtual server via network address translation (NAT), the second is virtual server via IP tunneling and the third is virtual server via direct routing. When we conducted this activity, the NAT implementation was very stable compared to the others. Therefore, we decided to setup LVS using NAT.
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Speed Up Your Web Site with Varnish
- Containers—Not Virtual Machines—Are the Future Cloud
- Linux Systems Administrator
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Senior Perl Developer
- Technical Support Rep
- Non-Linux FOSS: libnotify, OS X Style
- UX Designer
- Android's Limits
- Reachli - Amplifying your
1 hour 9 min ago
1 hour 58 min ago
- good point!
2 hours 1 min ago
- Varnish works!
2 hours 10 min ago
- Reply to comment | Linux Journal
2 hours 40 min ago
- Reply to comment | Linux Journal
5 hours 6 min ago
- Reply to comment | Linux Journal
9 hours 5 min ago
- Yeah, user namespaces are
10 hours 22 min ago
- Cari Uang
13 hours 53 min ago
- user namespaces
16 hours 46 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?