A high-volume site like Yahoo! knows that the quality of service a web server provides to end users depends mainly on two things: network-transfer speed and server response time. Network-transfer speed is determined by the bandwidth of the Internet link, while server response time depends on resources such as a fast CPU, ample RAM and good I/O performance. Once these resources are exhausted while the web server is under heavy traffic, performance problems are bound to arise.
Difficulty in handling high volumes of incoming traffic can be addressed by installing more RAM on existing machines or replacing the CPU with a faster one. Faster or dedicated SCSI controllers and disks with shorter access times can also help. On the software side, operating system parameters and web server settings can be tuned to achieve better performance.
An alternative approach is to improve performance by increasing the number of web servers. This approach distributes traffic onto a cluster of back-end web servers that need not be large-scale machines. Web server scalability is achieved by adding more servers so that the load is spread across the group of servers, or server cluster.
This is what load balancing is all about. It involves tuning a computer system, network or disk subsystem to distribute data and processing more evenly across available resources. Put another way, load balancing distributes processing and communications activity evenly across a computer network so that no single device is overwhelmed. Busy websites usually use two or more web servers in a load balancing scheme so that when one server gets overwhelmed with requests, traffic is forwarded to another server with more capacity.
There are two probable reasons why a company might want to load balance traffic across firewalls. One is purely technical and the other is centered on winning business. The technical aspect should be addressed as soon as funds and environment allow.
When only one web server responds to all incoming HTTP requests for a website, it may not be able to keep up, especially once the website has gained popularity. Web pages will load slowly and some users will have to wait for their requests to be processed. As traffic and connections to a website increase, it can come to a point where upgrading the server hardware is no longer cost effective.
Yahoo! was granted a patent, filed in 1999, on coordinating information between multiple servers that share information as well as servers that may cache some of it. Load balancing devices are becoming very common in supporting high-traffic websites. These devices evolve as websites grow in size, complexity and traffic flow.
The presence of multiple web servers in a server group requires that HTTP traffic be evenly distributed among the servers. These servers should appear as a single web server to the web client. The load balancer simply intercepts each request and redirects it to an available server in the server cluster.
Methods of Load Balancing
Load balancing can be achieved in a number of ways. The choice depends on individual requirements, available features, complexity of implementation and cost. Each company has to assess its own circumstances to determine which option would work best.
Round Robin DNS load balancing is one of the earliest adopted load balancing techniques. The built-in round robin feature of BIND, a DNS server, cycles through the IP addresses corresponding to a group of servers in a cluster. It is a fairly simple and inexpensive method that is very easy to implement. Its downside is that the DNS server has no knowledge of server availability and may therefore continue to point to an unavailable server. It can differentiate by IP address but not by server port. There is also the possibility that the IP address is cached by other name servers, which would result in requests not being sent to the load balancing DNS server.
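The rotation described above, and its blindness to server availability, can be illustrated with a small sketch. This is a toy model, not how BIND is implemented: the class name RoundRobinDNS, the resolve method and the IP addresses (taken from a range reserved for documentation) are all assumptions made for illustration.

```python
from collections import deque


class RoundRobinDNS:
    """Toy model of round robin DNS; not how BIND is actually implemented."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        # Answer with the current ordering, then rotate the list so the
        # next query sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


# Hypothetical cluster using documentation-only IP addresses.
dns = RoundRobinDNS(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
unavailable = {"192.0.2.11"}  # the name server has no way of knowing this

# Assume each client connects to the first address in the answer.
first_choices = [dns.resolve()[0] for _ in range(6)]
misdirected = sum(1 for ip in first_choices if ip in unavailable)
print(first_choices)
print(f"{misdirected} of {len(first_choices)} clients sent to a dead server")
```

Because the rotation never consults server health, a share of clients keeps being pointed at the failed machine until the record is removed by hand.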
In Hardware Load Balancing, hardware load balancers route TCP/IP packets to various servers in a cluster. This method is said to provide a powerful topology with high availability. It uses a circuit-level network gateway to route traffic. Its one downside is its higher cost compared to other methods.
The most commonly used method is Software Load Balancing. Load balancers often come as an integrated component of expensive web server and application server software packages. This method is more configurable to requirements and can incorporate intelligent routing based on multiple input parameters. Additional hardware needs to be provided to isolate the load balancers.
Algorithms of Server Load Balancing
When each HTTP request is assigned to a server picked at random from the group, this is called random allocation. Over a large number of requests each server gets roughly its share of the load, but at any given moment one server may be assigned more requests than the others. The method is very easy to implement, but the risk of overloading one server while under-utilizing another is significant.
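Random allocation can be sketched in a few lines. The server names and the function name random_allocation are hypothetical, and the fixed seed is there only to make the run repeatable.

```python
import random
from collections import Counter


def random_allocation(servers, rng):
    """Pick a server uniformly at random for one incoming request."""
    return rng.choice(servers)


servers = ["web1", "web2", "web3"]  # hypothetical back-end servers
rng = random.Random(42)             # seeded only for repeatability

# Tally how many of 3000 simulated requests each server receives.
counts = Counter(random_allocation(servers, rng) for _ in range(3000))
for name in servers:
    print(name, counts[name])
```

Over thousands of requests the tallies come out close to equal, but over a short burst of requests the split can be quite lopsided, which is the risk noted above.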
With round-robin allocation, the IP sprayer assigns requests to a list of servers on a rotating basis. The first request goes to a randomly picked server in the group, so that the first requests do not all go to the same server, especially when more than one IP sprayer is involved. Subsequent requests are redirected following the circular order. A server that has been assigned a request moves to the end of the list, which ensures that all servers are assigned requests equally. This allocation is much more orderly than random allocation, but it may fall short when requests differ in processing overhead or when the servers in a group differ in specification.
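A minimal sketch of round-robin allocation follows, with a random starting point and the move-to-the-end-of-the-list step described above. The class name RoundRobinSprayer, the method next_server and the server names are assumptions for illustration, not the interface of any real IP sprayer.

```python
import random
from collections import deque


class RoundRobinSprayer:
    """Toy IP sprayer that hands out servers in circular order."""

    def __init__(self, servers, rng=None):
        if rng is None:
            rng = random.Random()
        self._queue = deque(servers)
        # Start at a random position so that several sprayers do not all
        # send their first request to the same server.
        self._queue.rotate(-rng.randrange(len(self._queue)))

    def next_server(self):
        # Take the server at the head of the list, then move it to the end.
        server = self._queue.popleft()
        self._queue.append(server)
        return server


servers = ["web1", "web2", "web3"]  # hypothetical back-end servers
sprayer = RoundRobinSprayer(servers, random.Random(0))
picks = [sprayer.next_server() for _ in range(6)]
print(picks)
```

Whatever the starting point, every server appears exactly once in each run of three requests, and the same order repeats on the next run.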
The shortcoming of round-robin allocation with servers of unequal capacity is addressed by the weighted round-robin version. In this case, a server capable of handling twice as much load as another can be given a weight of two. The IP sprayer will then assign two requests to the more powerful server for every one request assigned to the weaker one. This accounts for the capacity of the servers in the group. However, it does not consider more advanced load balancing requirements such as the processing time of individual requests. An efficient load balancer should be capable of intelligent monitoring that helps it direct requests to the server best able to handle them.
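The two-to-one example above can be sketched as follows. This is one simple way to realize weighted round-robin, by repeating each server in the schedule according to its weight; the function name weighted_round_robin and the server names "big" and "small" are hypothetical.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator over servers, repeated by weight.

    weights maps a server name to a positive integer weight.
    """
    # Expand each server into as many schedule slots as its weight.
    schedule = [server for server, weight in weights.items()
                for _ in range(weight)]
    return cycle(schedule)


# "big" is assumed able to handle twice the load of "small".
sprayer = weighted_round_robin({"big": 2, "small": 1})
picks = [next(sprayer) for _ in range(6)]
print(picks)  # ['big', 'big', 'small', 'big', 'big', 'small']
```

This naive expansion sends back-to-back requests to the heavier server, and the weights are fixed in advance. As the text notes, it takes no account of how long individual requests actually take.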