Skip to content

Load balancing

A load balancer is a software or hardware device that keeps any one server from getting overloaded with requests by distributing the requests across multiple servers.

Used to increase the capacity and reliability of applications. Session management can improve the performance of applications by reducing the load on individual servers.

Also, can provide increased scalability, security and improved user experience.

There are two primary approaches to load balancing:

  • Dynamic

    uses algorithms that take into account the current state of each server and

  • Static

    distributes traffic w/o making these adjustments, some equal amount of traffic to each server group

Dynamic load balancing algorithms

  • Least connection:

    Checks which servers have the fewest connections open at the time and sends traffic to those servers.

    This assumes all connections require roughly equal processing power.

  • Weighted least connection:

    Gives administrators the ability to assign different weights to each server, assuming that some servers can handle more connections than others.

  • Weighted response time

    Averages the response time of each server, and combines that with the number of connections each server has open to determine where to send traffic.

    By sending traffic to the servers with the quickest response time, the algorithm ensures faster service for users.

  • Resource-based

    Distributes load based on what resources each server has available at the time.

    Specialized software (called an "agent") running on each server measures that server's available CPU and memory, and the load balancer queries the agent before distributing traffic to that server.

Static load balancing algorithms

  • Round robin

    Sends traffic to each server in turn, regardless of the current load on each server.

  • Weighted round robin

    Gives administrators the ability to assign different weights to each server, assuming that some servers can handle more connections than others.

  • IP hash

    fn(ip) -> hash -> server

    Based on the hash, the connection is assigned to a specific server.

Layers

Typically, load balancers are implemented in the following layers:

  • L4 - transport

Transport layer, distributes requests based on the network variables like IP address and destination port.

Performing network addressing translations w/o inspecting the content of the packets.

Used when you don't need to make decisions based on the content of the packets.

E.g: video streaming, voice calls, DNS, etc.

  • L7 - application

Application layer, distributes requests based on data found in application layer protocols such as HTTP.

Can further distribute requests based on URLs, cookies, headers, etc.

Used when you need to make decisions based on the content of the packets.

Reacher set of algorithms, but more expensive (slower than L4).

  • GSLB - global server load balancing

Distributes requests across multiple data centers, which may be in different geographic locations.

GSLB can be used to route requests to the closest data center to the user, or to balance the load across multiple data centers.

Resources