A load balancer, also known as an application delivery controller (ADC), is a physical or virtual network device, software application, or cloud service that distributes a large volume of incoming traffic from users and applications across multiple application servers and services.
The objective of a load balancer is to ensure high availability by providing alternate application resources in case the initial one becomes unavailable, to enable efficient utilization of both network and application server resources, and to reduce the time it takes to respond to user requests.
A load balancer acts as a traffic director sitting in front of your applications, routing client requests across all application servers capable of fulfilling those requests, in a manner that maximizes speed and capacity utilization. This ensures that no single server is overworked, which could degrade performance. If an application server instance goes down, the load balancer redirects traffic to other available application servers. When a new application server instance is added to the server group, the load balancer automatically includes it in its distribution algorithm.
History of Load Balancing
Load balancing originated in the late 1990s to address performance and reliability issues as web applications grew in complexity and popularity. Early web servers struggled to handle the increasing volume of simultaneous user requests, leading to frequent slowdowns and outages. Simple load-distribution mechanisms, such as round-robin Domain Name System (DNS) routing, emerged first but had limited effectiveness because they didn’t adapt to server availability or capacity.
Cisco introduced the first commercial load balancer, LocalDirector, in 1997. This marked a significant shift to the use of dedicated load balancer appliances, which could dynamically manage traffic across multiple servers, providing more sophisticated methods than DNS routing alone. These hardware solutions included health checks, which ensured traffic was only directed to operational servers, and session persistence, which helped maintain user sessions on a specific server for certain applications.
As virtualization gained traction in the mid-2000s, software-based load balancers became popular, offering greater flexibility and cost efficiency compared to hardware appliances. With the rise of cloud computing, load balancing evolved further to accommodate distributed cloud environments. Cloud providers like Amazon Web Services (AWS) and Microsoft Azure introduced their own load-balancing solutions that scaled automatically based on traffic and infrastructure needs.
Modern load balancers also play a key role in cybersecurity, defending applications and networks against malicious attacks. They protect against SYN floods and application- and network-level DDoS attacks, as well as attacks on web-facing applications. DDoS traffic may be rerouted to a dedicated DDoS protection appliance or service if the load balancer detects malicious traffic or if a server becomes vulnerable due to the volume of malicious requests. Many load balancers also come integrated with web application and API protection solutions (WAF and WAAP) and can inspect users’ requests to detect attempts to send malicious data to applications.
A load balancer operates by acting as an intermediary between users and application servers, directing incoming client requests to the most appropriate server available. Here’s a breakdown of its key functions:
- Traffic monitoring and decision-making: When a client request arrives, the load balancer first inspects the state of each application server. It checks factors like server load, response times, and availability. Based on these metrics, the load balancer then decides which server will handle the incoming request most efficiently.
- Distribution algorithms: To route traffic, load balancers use algorithms that determine the most efficient server distribution.
- Health checks: Load balancers continuously monitor the health of each server to identify any instance that is unresponsive or down. If a server fails a health check, it is temporarily removed from the pool, and traffic is redirected to healthy servers. Once the failed server recovers, the load balancer reinstates it automatically.
- Session persistence: Also known as “sticky sessions,” this feature ensures that requests from the same client are consistently routed to the same server during a session. This is particularly useful for applications that rely on session data stored on specific servers, such as online shopping carts.
- SSL termination: For applications requiring secure HTTPS connections, load balancers can handle SSL termination by decrypting incoming traffic and passing it to application servers as plain HTTP. This reduces the computational load on application servers, which improves overall performance.
- Scalability and flexibility: In cloud environments, load balancers dynamically adjust as servers are added or removed based on traffic demand. For instance, during peak usage, they can distribute traffic across additional instances, ensuring scalability and cost efficiency.
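As an illustration of the health-check and failover behavior described above, here is a minimal Python sketch. The server names and the mark_health callback are illustrative assumptions, not any product's API; a real balancer would run the health checks on a timer.

```python
class LoadBalancer:
    """Minimal sketch: round-robin dispatch that skips unhealthy servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.health = {s: True for s in self.servers}  # all healthy at start

    def mark_health(self, server, healthy):
        # Called by a periodic health-check loop (not shown here).
        self.health[server] = healthy

    def route(self):
        # Consider only servers that currently pass health checks; failed
        # servers are skipped until a later check reinstates them.
        healthy = [s for s in self.servers if self.health[s]]
        if not healthy:
            raise RuntimeError("no healthy servers available")
        server = healthy[0]
        # Rotate the chosen server to the back so requests spread evenly.
        self.servers.remove(server)
        self.servers.append(server)
        return server
```

For example, with servers app1 through app3 and app2 failing its health check, successive calls to route() alternate between app1 and app3 until app2 recovers.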
The benefits of load balancing are numerous and include improved performance, scalability, and reliability:
Scalability:
Load balancing makes it easier to scale application resources up or down as needed, which helps to handle spikes in traffic or changes in demand and therefore saves on cost, especially in a cloud deployment where licensing follows a pay-as-you-go (PAYG) model.
Improved Performance:
Load balancing helps to distribute the workload across multiple resources, which reduces the load on each application resource and improves the overall performance of the system.
Application Delivery:
In the context of application delivery, a load balancer serves as the single point of contact for clients. It distributes incoming application traffic across multiple application instances, deployed as software, hardware or cloud service instances, in multiple “availability zones” and regions to increase the availability of your application.
Cybersecurity:
In terms of cybersecurity, load balancers offer an extra layer of protection against network and application DDoS attacks and protect applications from malicious requests. Load balancers can help remove single points of failure, minimize the attack surface, and make it more difficult to exhaust resources and saturate links.
Reliability:
Load balancing ensures that there is no single point of failure in the system, which provides high availability and fault tolerance to handle application server failures.
The main types of load balancing algorithms are:
Round-Robin Technique:
The Round-Robin technique is one of the simplest methods for distributing client requests across a group of servers. Going down the list of servers in the group, the load balancer forwards a client request to each server in turn. When it reaches the end of the list, the load balancer loops back and goes down the list again. The main benefit of round-robin load balancing is that it is extremely simple to implement.
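The rotation described above fits in a few lines of Python; the server names here are illustrative:

```python
import itertools

servers = ["app1", "app2", "app3"]  # hypothetical server pool
rotation = itertools.cycle(servers)

def next_server():
    # Each call hands back the next server in the list,
    # looping back to the start after the last one.
    return next(rotation)
```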
Weighted Round-Robin:
Weighted Round-Robin is an advanced load balancing configuration. This technique distributes requests across multiple servers like basic Round-Robin but has the added flexibility of assigning each server a weight based on the needs of your domain. The servers assigned a higher weight are allocated a higher percentage of the incoming requests.
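One simple way to realize the weighting, sketched here under the assumption of small integer weights, is to expand each server into the rotation pool in proportion to its weight:

```python
import itertools

def weighted_pool(weights):
    # weights maps server -> integer weight; a server with weight 3
    # appears three times in the pool and therefore receives three
    # times the requests of a weight-1 server.
    pool = []
    for server, weight in weights.items():
        pool.extend([server] * weight)
    return pool

# Hypothetical pool: "big" has triple the capacity of "small".
rotation = itertools.cycle(weighted_pool({"big": 3, "small": 1}))
```

Production implementations typically use smoother interleaving so a heavy server does not receive long bursts, but the proportion of traffic per server is the same.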
IP Hash:
IP Hash is a load balancing technique that applies a hashing algorithm to the client’s source IP address to select a server. Because the same address always hashes to the same value, each client is consistently mapped to the same server, which provides a simple form of session affinity while still distributing load across the pool.
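A minimal sketch of the idea in Python, using SHA-256 purely for illustration; real load balancers use their own hash functions:

```python
import hashlib

def server_for(client_ip, servers):
    # Hash the client's source IP and map the digest onto the server
    # list. The same IP always lands on the same server, giving
    # deterministic selection and basic session affinity.
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note that with this naive modulo mapping, changing the number of servers remaps most clients; consistent hashing is the usual refinement when pools resize frequently.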
Least Connections Method:
The Least Connections method is a smart way to balance the workload among servers by directing each new user request to the server with the least number of active connections. This dynamic adjustment ensures efficient resource utilization and contributes to the overall performance and reliability of the network.
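The selection rule reduces to a single comparison; this sketch assumes the balancer already tracks a count of active connections per server:

```python
def least_connections(active):
    # active maps server -> number of currently open connections;
    # the new request goes to the server with the fewest.
    return min(active, key=active.get)
```

For example, with connection counts of 12, 3, and 7 across three servers, the next request is sent to the server holding 3.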
Adaptive Algorithms for Enhanced Performance:
Adaptive load balancing makes decisions based on status indicators retrieved by the load balancer from the application servers. The status gets determined by an agent running on each server. The load balancer queries each server regularly for this status information and then appropriately sets a dynamic weight for each server.
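The weight-setting step can be sketched as follows, assuming each agent reports a load figure between 0.0 (idle) and 1.0 (saturated); the scale and floor used here are illustrative choices, not a standard:

```python
def adaptive_weights(status, floor=1):
    # status maps server -> agent-reported load (0.0 idle .. 1.0 saturated).
    # Lightly loaded servers receive a higher dynamic weight; the floor
    # keeps every healthy server reachable even when it is busy.
    return {server: max(floor, round((1.0 - load) * 10))
            for server, load in status.items()}
```

The resulting weights can then feed a weighted distribution scheme, so traffic shifts away from busy servers automatically as the agents report new figures.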
Load balancing can use one or more of the following methods:
Software Load Balancing:
Software load balancers, also called virtual load balancers, offer the same functionality as hardware load balancers but do not require a dedicated physical device. Software load balancers are applications installed on application servers or delivered as a native and/or managed cloud service. They route network traffic to different application servers by examining application-level characteristics such as the IP address and HTTP headers, and by matching requests to specific listeners based on rules that inspect the contents of the request.
Hardware Load Balancing:
The primary goal of hardware load balancing is high performance. Hardware load balancers are physical devices that distribute network traffic across multiple application servers. They use dedicated CPUs, hardware-based encryption and decryption, and specialized device drivers to speed up request and response processing, thus enabling high performance.
Cloud-Based Load Balancers:
Cloud-based load balancers are managed load balancing services offered by cloud providers, such as AWS Elastic Load Balancing, Google Cloud Load Balancing, and Azure Load Balancer. These services enable load balancing across resources distributed over multiple geographical regions or availability zones, improving resilience and scalability for applications.
Cloud load balancers dynamically adjust to changes in traffic demand and do not require on-premises hardware or maintenance. They support diverse types of traffic, including HTTP, HTTPS, TCP, and UDP, and often provide built-in redundancy and fault tolerance by routing traffic to the nearest available server or region. Additionally, these load balancers offer tight integration with other cloud services, such as auto-scaling and monitoring tools.
Network vs. Application Load Balancing:
Network Load Balancers (NLBs) operate at the transport layer (Layer 4 of the OSI model) and are designed for high performance, handling millions of requests per second with very low latencies. They are best suited for load balancing of TCP traffic.
Application Load Balancers (ALBs), on the other hand, operate at the application layer (Layer 7 of the OSI model). They are intended for load balancing of HTTP and HTTPS traffic and provide advanced features such as content-based routing, which directs requests according to rules applied to the request’s host, path, or headers.
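As a sketch of the Layer 7 behavior, content-based routing can be modeled as an ordered rule list matched against the request’s host and path. The rule format and pool names here are hypothetical, not any vendor’s configuration syntax:

```python
# Hypothetical Layer-7 routing rules: first match wins.
RULES = [
    ("path_prefix", "/api/", "api-pool"),
    ("host", "images.example.com", "static-pool"),
]
DEFAULT_POOL = "web-pool"

def choose_pool(host, path):
    # Walk the rules in order and return the pool of the first match;
    # unmatched requests fall through to the default pool.
    for kind, pattern, pool in RULES:
        if kind == "path_prefix" and path.startswith(pattern):
            return pool
        if kind == "host" and host == pattern:
            return pool
    return DEFAULT_POOL
```

A Layer 4 balancer, by contrast, never inspects the host or path; it forwards based only on IP addresses and ports.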
Use cases for Network-Level Load Balancing:
Network-level load balancing is particularly useful when you need to support high-volume inbound TCP requests.
Use cases for Application-Level Load Balancing:
Application-level load balancing is beneficial when you need to distribute incoming application traffic across multiple application servers, such as AWS EC2 or Azure VM application instances in multiple availability zones and regions. This increases the availability of your application and improves disaster resilience.
Load balancing, while essential for distributing network traffic and ensuring optimal performance, can introduce several security challenges, including:
Single Point of Failure:
While load balancing helps to reduce the risk of a single point of failure, the load balancer can itself become a single point of failure if not implemented correctly. To prevent this, load balancers need to be deployed in an active-passive, active-active, or cluster configuration. Many virtualization platforms enable configuring load balancers using templates, such as scale sets in Azure, to enable high availability and scalability of network and application load balancers.
Security Risks:
If not implemented correctly, load balancing can introduce security risks such as allowing unauthorized access or exposing sensitive data. To protect against unauthorized access, default passwords for administration must be changed, regular patches should be applied, and administrator privileges must be strictly controlled. Many load balancers provide separate user, administrator, and viewer roles to further enhance security and protect against excessive permissions.
Vulnerabilities:
There can be vulnerabilities in the load balancer itself, its configuration, or its use. Enforce patching and keep track of any Common Vulnerabilities and Exposures (CVE) announcements.
For web-facing applications, load balancing is often employed to distribute user and application requests among several application servers. This reduces the strain on each application server and makes them more efficient, speeding up performance and reducing latency for the user. By distributing a large number of user requests among fewer application server instances, user wait time is vastly cut down and application server resources are more efficiently utilized, resulting in a better user experience and a lower cost of application delivery.
Load balancing helps to improve the overall performance and reliability of applications by ensuring that resources are used efficiently and that there is no single point of failure. It also helps to scale applications on demand and provides high availability and fault tolerance to handle spikes in traffic and across application server failures.
Load balancers improve application performance by directing requests to servers based on a chosen distribution algorithm, reducing response times and improving user experience. This allows more user requests to be processed with fewer application servers, prevents single points of failure, and ensures optimal server utilization, leading to cost savings.
By routing the requests to available application servers, load balancing takes the pressure off stressed servers and ensures optimal processing, high availability, and reliability. Load balancers dynamically add or drop servers in response to high or low demand, providing flexibility in adjusting to demand and reducing the cost of operation.
Radware provides advanced and comprehensive load balancing solutions and application delivery capabilities to ensure optimal service levels for applications in virtual, cloud, and software-defined data centers:
- Alteon offers completely isolated, PCI-compliant environments to enable the consolidation of multiple load balancers even in the entry-level platform. Alteon provides the same functionality, user interface, and codebase regardless of the form factor—whether physical, virtual, or in the cloud.
- With Global Elastic Licensing (GEL), Radware provides complete flexibility and cost control as you move from physical appliance to virtual or cloud. GEL allows you to reclaim the capacity from any form factor no longer in use, covers virtual, physical and cloud (AWS and Azure), and can be increased or decreased in 1GB increments. The reclaimed capacity may be redeployed in another environment. This capability enables our customers to transition to cloud from physical or virtual deployment without risk.
- Alteon provides unified security with an integrated Web Application Firewall (WAF) and API Protection (WAAP), Defense Messaging, Authentication Gateway, and Inbound and Outbound SSL Inspection. For protection of APIs, Radware WAF also includes integrations with Radware Bot Manager (available as a SaaS application) and our attack intelligence feed.
- Alteon is an application delivery controller (ADC) solution that provides local and global server load balancing for all web, cloud, and mobile applications.
- To extend and customize Alteon capability, scripting functionality using TCL, called AppShape++, may be deployed. Unlike our competition, Radware adds reusable TCL scripts as Alteon built-in functionality in major releases, which makes them easily supportable across version upgrades, maintainable, and faster than interpreted scripts.
- Radware’s integrated WAF solution, available on all Alteon platforms, can be deployed in a unique out-of-path deployment, and provides an adaptive policy generation engine that offers greater coverage and lower false positive rates versus competition.
- Alteon provides automation out of the box to streamline the entire lifecycle of the application delivery with no integration efforts required. Alteon also integrates with many leading orchestration systems including VMware VRO, Cisco ACI, OpenStack Heat, HP Network Automation, Ansible, and many others.
These capabilities leverage various load-balancing algorithms to efficiently distribute network traffic and optimize application performance. Radware's load balancers are designed for the network operator (both ADC experts and non-experts) for automation of various operational tasks, significantly reducing the operational workload throughout the ADC services’ lifecycle.