A load balancer is a network component that automatically distributes incoming client traffic across multiple backend resources (for example, virtual machines, containers, or servers). Its primary goals are to prevent any single resource from becoming a bottleneck, remove single points of failure, and improve application availability, reliability, and performance.

Traditional single-server setup

  • In a typical single-server design, the DNS record for www.kodekloud.com resolves to one public IP bound directly to a single backend server.
  • All client requests hit that single server. If the server is overloaded or fails, users experience downtime or degraded performance because there is no automated traffic distribution or failover.

Introducing a load balancer

  • Instead of pointing DNS directly to one server IP, www.kodekloud.com resolves to the public (frontend) IP of a load balancer.
  • The load balancer sits in front of a pool of backend servers (the backend pool). It receives client requests and forwards them to backend instances using a configured algorithm (for example, round robin, least connections, or IP hash).
  • Health probes periodically check backend instances. When a probe detects an unhealthy instance, the load balancer stops sending traffic to that instance and routes requests to healthy backends.
  • This design enables seamless maintenance, instance replacement, and resilience during transient failures.
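The routing and failover behavior described above can be sketched in a few lines. This is a minimal, hypothetical round-robin selector with a health-aware backend pool (all class names and IP addresses are illustrative, not a real load balancer API):

```python
from itertools import cycle

class LoadBalancer:
    """Minimal round-robin load balancer over a health-aware backend pool."""

    def __init__(self, backends):
        self.backends = backends      # backend pool, e.g. ["10.0.0.1", "10.0.0.2"]
        self.healthy = set(backends)  # kept up to date by health probes
        self._rr = cycle(backends)    # round-robin iterator over the pool

    def mark_unhealthy(self, backend):
        self.healthy.discard(backend)  # probe failed: stop routing here

    def mark_healthy(self, backend):
        self.healthy.add(backend)      # probe recovered: resume routing

    def route(self):
        """Return the next healthy backend, skipping unhealthy instances."""
        for _ in range(len(self.backends)):
            backend = next(self._rr)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")

lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lb.mark_unhealthy("10.0.0.2")
print([lb.route() for _ in range(4)])  # 10.0.0.2 is skipped until it recovers
```

Note how maintenance falls out of the same mechanism: draining an instance is just marking it unhealthy, and clients never see its address change because they only ever connect to the frontend IP.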

How load balancing works

  • Frontend receives the client connection on its public IP and port.
  • Load-balancing algorithm selects an appropriate backend instance.
  • Health probes (TCP, HTTP, or HTTPS) verify backend health before the instance receives traffic.
  • Session persistence (sticky sessions) optionally pins a client to a backend for stateful sessions.
  • If an instance fails health checks, traffic is rerouted automatically to healthy backends until recovery.
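Session persistence from the steps above can be sketched as source-IP affinity: hash the client address to a backend index so the same client keeps landing on the same instance. This is a simplified illustration (real load balancers also handle pool membership changes, e.g. via consistent hashing):

```python
import hashlib

def pick_backend(client_ip: str, backends: list[str]) -> str:
    """Source-IP affinity: hash the client IP to a stable backend index."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# The same client IP always maps to the same backend:
assert pick_backend("203.0.113.7", backends) == pick_backend("203.0.113.7", backends)
```

The trade-off: affinity makes load distribution less even than pure round robin, which is why the best practices below recommend stateless services where possible.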

Key components and concepts

Component                | Purpose                                                            | Example / Notes
Frontend IP              | Public IP address clients resolve and connect to                   | www.kodekloud.com → load balancer public IP
Backend pool             | Group of servers/VMs/containers that handle requests               | VM Scale Set, container group, or Kubernetes service backends
Health probes            | Periodic checks (TCP/HTTP/HTTPS) to determine instance health      | HTTP probe to /healthz every 10s
Load-balancing algorithm | Method used to select the target backend                           | Round robin, least connections, IP hash
Session persistence      | Keeps a client bound to a specific backend                         | Cookie-based or source-IP affinity
Layer (L4 vs L7)         | Transport- vs application-layer operation and routing capabilities | L4: TCP/UDP; L7: HTTP/HTTPS with path/host routing
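An HTTP health probe like the “/healthz every 10s” example in the table can be sketched with the standard library. The URL, timeout, and failure handling here are illustrative assumptions, not a specific product's probe configuration:

```python
import urllib.request
import urllib.error

def probe(url: str, timeout: float = 2.0) -> bool:
    """One HTTP health check: healthy iff the endpoint answers 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, 4xx/5xx, ...

# A real probe loop would run this every 10s per backend and only mark an
# instance unhealthy after N consecutive failures, to avoid flapping, e.g.:
# healthy = probe("http://10.0.0.1/healthz")
```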

Layer 4 vs Layer 7 — when to use which

Layer                 | What it handles                   | Use case
Layer 4 (Transport)   | TCP/UDP connection forwarding     | Low-latency, protocol-agnostic forwarding (e.g., generic TCP services)
Layer 7 (Application) | HTTP/HTTPS inspection and routing | Web apps that need path- or host-based routing, header rules, SSL termination
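The Layer 7 row can be illustrated with a tiny path-based routing rule: the kind of matching an L7 load balancer applies after inspecting the HTTP request, which an L4 balancer cannot do because it never parses the payload. The routing table and pools below are hypothetical:

```python
# Hypothetical L7 routing table: the longest matching path prefix wins.
ROUTES = {
    "/api/": ["10.0.1.1", "10.0.1.2"],  # API backend pool
    "/static/": ["10.0.2.1"],           # static-content pool
    "/": ["10.0.0.1", "10.0.0.2"],      # default web pool
}

def route_l7(path: str) -> list:
    """Pick the backend pool whose path prefix best matches the request."""
    best = max((p for p in ROUTES if path.startswith(p)), key=len)
    return ROUTES[best]

print(route_l7("/api/users"))   # API pool
print(route_l7("/index.html"))  # default web pool
```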

Benefits

  • High availability: traffic automatically reroutes away from unhealthy instances.
  • Fault tolerance: eliminates single points of failure.
  • Scalability: add or remove backend instances without changing the public-facing IP.
  • Performance: spreads load so no single server is overwhelmed.
  • Maintenance flexibility: perform updates with little to no user-visible downtime.

Best practices

  • Configure appropriate health probes that reflect real application readiness (not just TCP).
  • Choose a load-balancing algorithm that matches your workload (stateless workloads often work well with round robin).
  • Use session persistence only when required by the application; aim for stateless services when possible.
  • Monitor latency, error rates, and backend capacity so you can scale proactively.
  • Secure the frontend (TLS termination) and validate backend trust boundaries.

Summary

Configure health probes and monitoring when you create a load balancer. Properly tuned probes and a suitable load-balancing algorithm are essential for fast failover and stable performance.