Describes load balancers, which distribute client traffic across backend servers using health probes and balancing algorithms to provide availability, scalability, and fault tolerance, and contrasts Layer 4 with Layer 7 routing.
A load balancer is a network component that automatically distributes incoming client traffic across multiple backend resources (for example, virtual machines, containers, or servers). Its primary goals are to prevent any single resource from becoming a bottleneck, remove single points of failure, and improve application availability, reliability, and performance.
In a typical single-server design, the DNS record for www.kodekloud.com resolves to one public IP bound directly to a single backend server.
All client requests hit that single server. If the server is overloaded or fails, users experience downtime or degraded performance because there is no automated traffic distribution or failover.
Instead of pointing DNS directly to one server IP, www.kodekloud.com resolves to the public (frontend) IP of a load balancer.
The load balancer sits in front of a pool of backend servers (the backend pool). It receives client requests and forwards them to backend instances using a configured algorithm (for example, round robin, least connections, or IP hash).
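The algorithms named above can be sketched in a few lines of Python. This is an illustrative model only, not a real load balancer's implementation: the class names, the in-memory connection counter, and the MD5-based IP hash are all assumptions chosen for clarity.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Cycle through backends in order, one request at a time."""

    def __init__(self, backends):
        self._cycle = cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the backend with the fewest active connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Called when a connection closes, freeing capacity on that backend.
        self.active[backend] -= 1


def ip_hash_pick(client_ip, backends):
    """Pin a client to the same backend by hashing its IP (session affinity)."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return backends[int(digest, 16) % len(backends)]
```

Round robin spreads load evenly when requests are uniform; least connections adapts when request durations vary; IP hash keeps a given client on the same backend, which matters for stateful sessions.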
Health probes periodically check backend instances. When a probe detects an unhealthy instance, the load balancer stops sending traffic to that instance and routes requests to healthy backends.
This design enables seamless maintenance, instance replacement, and resilience during transient failures.
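The probe-driven failover described above is typically implemented with consecutive-failure and consecutive-success thresholds, so one dropped probe does not eject a backend. A minimal sketch, assuming hypothetical threshold names (`unhealthy_after`, `healthy_after`) rather than any particular product's settings:

```python
class HealthChecker:
    """Track probe results per backend: mark a backend unhealthy after N
    consecutive failures, and healthy again after M consecutive successes."""

    def __init__(self, backends, unhealthy_after=3, healthy_after=2):
        self.unhealthy_after = unhealthy_after
        self.healthy_after = healthy_after
        self.state = {b: {"healthy": True, "fails": 0, "oks": 0}
                      for b in backends}

    def record(self, backend, probe_ok):
        s = self.state[backend]
        if probe_ok:
            s["fails"] = 0
            s["oks"] += 1
            if not s["healthy"] and s["oks"] >= self.healthy_after:
                s["healthy"] = True  # backend has recovered
        else:
            s["oks"] = 0
            s["fails"] += 1
            if s["healthy"] and s["fails"] >= self.unhealthy_after:
                s["healthy"] = False  # stop routing traffic here

    def healthy_backends(self):
        return [b for b, s in self.state.items() if s["healthy"]]
```

The balancer would consult `healthy_backends()` before applying its algorithm, so traffic silently shifts away from a failing instance and returns once it recovers.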
Configure health probes and monitoring when you create a load balancer. Properly tuned probes and a suitable load-balancing algorithm are essential for fast failover and stable performance.
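One way to reason about probe tuning is a back-of-envelope bound on failover time: each failed probe can take up to one probe interval plus the probe timeout, repeated until the unhealthy threshold is reached. This rough formula (an assumption for illustration, not a vendor-specified calculation) shows why aggressive intervals detect failures faster at the cost of more probe traffic:

```python
def worst_case_detection_seconds(interval_s, timeout_s, unhealthy_threshold):
    """Rough upper bound on time to mark an instance down:
    up to (interval + timeout) per failed probe, times the threshold."""
    return unhealthy_threshold * (interval_s + timeout_s)
```

For example, a 5-second interval, 2-second timeout, and a threshold of 3 consecutive failures gives roughly a 21-second worst case before traffic stops flowing to a dead backend.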