Dynamic Load-Balancing DNS: dlbDNS
The rapid growth of computer literacy has led to a dramatic rise in the number of people using computers. This rise has in turn driven the development of computation-intensive and resource-sharing applications. Together, these factors increase the load across the Internet and cause severe network traffic congestion. This congestion, though dynamic in nature, frustrates users with slow response times and repeated application crashes.
Developing servers with more capacity and capability to handle this traffic is one way to solve the problem; another is to distribute client requests across multiple servers. The second method is an elegant solution, since it uses existing resources and avoids scenarios in which some servers are overloaded while the rest are idle. Two observations strengthen the case for distributing requests across servers:
As a general rule of thumb, each TCP session consumes 32 bytes of memory, so a server with 32MB of RAM can theoretically support about one million simultaneous connections (see Resources 2).
Given a number of servers, users always log in to their favorite server while overlooking the load on that server.
Distributing requests across servers can be implemented by monitoring the servers regularly and directing each request dynamically to the best server. Here, best server refers to the server with the best rating according to a rating algorithm explained later. Directing requests across multiple servers based on server load in this way is called dynamic load balancing. This feature can be added to the existing Domain Name System (DNS), which already plays a prominent role in resolving client requests and can be configured to direct them across multiple servers to avoid network traffic congestion.
We will explain the design, implementation and benefits of a dynamic load-balancing DNS, dlbDNS, which extends DNS.
Four load-balancing models are available. First, RFC 1794 (see Resources 1) describes a load-balancing method using a special zone transfer agent that obtains its information from external sources. The new zone then gets loaded by the name server. One problem with this method is that between zone transfers, the weighted information is essentially static or possibly handed out in a round-robin fashion. This method also doesn't allow a virtual/dynamic domain where a response is created dynamically based on the name being queried (see Resources 4).
The second model is a dedicated load-balancing server that intercepts incoming requests and directs them to the best server. This design employs virtual IP addresses for internal use by the load-balancing server. One problem is that it adds another server to the cluster that must be monitored, instead of utilizing the available resources.
A third model is a remote monitoring system that monitors the performance of different servers and provides feedback to the DNS. This design helps detect problems not visible internally, and provides truer access time measurements and easy detection of configuration errors that affect external users. The major problem here is the dependency on the remote network to monitor and deliver data (see Resources 5).
Last is an internal monitoring system that monitors the performance of the servers and provides feedback to the DNS. Its major advantages are easy maintainability and administration, closeness to the source of addressable problems and no security hazards (see Resources 5). This design is implemented in dlbDNS.
Initially, load-balancing was intended to permit DNS agents to support the concept of machine clusters (derived from the VMS usage) where all machines were functionally similar or the same. It didn't particularly matter which machine was picked, as long as the processing load was reasonably well-distributed across a series of actual different hosts. With servers of different configurations and capacities, there is a need for more sophisticated algorithms (see Resources 1).
A round-robin algorithm distributes requests evenly across servers in rotation. Although requests are handled dynamically, the algorithm completely ignores the performance characteristics of the servers.
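The rotation described above can be sketched in a few lines of Python. This is only an illustration, not dlbDNS code; the server addresses and the function name round_robin are hypothetical.

```python
from itertools import cycle

# Hypothetical server pool; the addresses are illustrative only.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def round_robin(servers):
    """Return an endless iterator that hands out servers in fixed rotation.

    No load or capacity information is consulted, which is exactly
    the weakness the text points out.
    """
    return cycle(servers)

picker = round_robin(SERVERS)

# Answer five consecutive requests: the pool wraps around after three.
answers = [next(picker) for _ in range(5)]
print(answers)
```

Every server receives the same share of requests, whether it is idle or already overloaded.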
A load-average algorithm distributes requests across servers based on server load. This design is simple and fairly inexpensive, but it fails badly if servers vary in configuration and capacity, because equal load averages do not imply equal spare capacity.
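A minimal sketch of load-average selection follows, assuming each server reports a one-minute load average to some collector. The host names, the load values and the function pick_by_load are hypothetical and exist only for illustration.

```python
def pick_by_load(load_averages):
    """Return the host with the lowest reported load average.

    load_averages maps a host name to its one-minute load average.
    """
    return min(load_averages, key=load_averages.get)

# Hypothetical reports from two machines of very different size.
loads = {"small-box": 0.40, "big-box": 0.90}

chosen = pick_by_load(loads)
print(chosen)  # small-box

# If big-box had several times the capacity of small-box, it might
# still be the better target; the raw load average cannot express that.
```

The selection itself is a single comparison, which is why the approach is cheap, but it treats a load of 0.90 on a large machine as worse than 0.40 on a small one.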
A rating algorithm is based on the number of users and the load average, as shown below. This algorithm is reasonable because its rating favors hosts with the fewest unique logins and the lowest load averages (see Resources 4). dlbDNS implements this rating algorithm to determine the best server.
    WT_PER_USER = 100
    USER_PER_LOAD_UNIT = 3
    FUDGE = (TOT_USER - UNIQ_USER) * (WT_PER_USER/5)
    WEIGHT = (UNIQ_USER * WT_PER_USER) + (USER_PER_LOAD_UNIT * LOAD) + FUDGE
where the variables are
TOT_USER: total number of users logged in
UNIQ_USER: unique number of users logged in
LOAD: load average over the last minute, multiplied by 100
WT_PER_USER: pseudo-weight per user
FUDGE: fudge factor for users logged in more than once
WEIGHT: rating of the server
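The rating formula can be tried out with a short Python sketch. The constants and the formula come from the text; the function names, the host names and the sample figures are hypothetical. The sketch also assumes that the lowest WEIGHT marks the best server, which follows from the statement that the rating favors hosts with fewer unique logins and lower load averages.

```python
WT_PER_USER = 100        # pseudo-weight per user
USER_PER_LOAD_UNIT = 3   # multiplier applied to LOAD

def weight(tot_user, uniq_user, load):
    """Compute WEIGHT for one server.

    tot_user:  total number of users logged in
    uniq_user: unique number of users logged in
    load:      one-minute load average multiplied by 100
    """
    # Fudge factor for users logged in more than once.
    fudge = (tot_user - uniq_user) * (WT_PER_USER / 5)
    return (uniq_user * WT_PER_USER) + (USER_PER_LOAD_UNIT * load) + fudge

def best_server(reports):
    """Return the host with the lowest WEIGHT (assumed to mean best)."""
    return min(reports, key=lambda host: weight(*reports[host]))

# Hypothetical reports: host -> (TOT_USER, UNIQ_USER, LOAD)
reports = {
    "hostA": (12, 10, 150),  # 1000 + 450 + 40 = 1490
    "hostB": (5, 5, 300),    #  500 + 900 +  0 = 1400
}

print(best_server(reports))  # hostB
```

In this example hostB wins despite having twice the load average of hostA, because each unique login counts for far more than a unit of load.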