CKA Certification Course - Certified Kubernetes Administrator
Application Lifecycle Management
2025 Updates: Introduction to Autoscaling
In this lesson, we explore autoscaling in Kubernetes, focusing on scenarios relevant to the CKA exam. We cover Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA), along with the fundamental concepts needed for a deeper understanding.
For an in-depth look at Kubernetes autoscaling, consider enrolling in the Kubernetes Autoscaling course.
Note
Before diving into autoscaling in Kubernetes, it's beneficial to understand the basic concepts of scaling using traditional physical servers.
Traditional Scaling Concepts
Historically, applications were deployed on physical servers with fixed CPU and memory capacities. When demand grew and the server's resources were exhausted, the only option was to add more capacity to that server. This involved:
- Shutting down the application.
- Upgrading the CPU or memory.
- Restarting the server.
This process is referred to as vertical scaling since it focuses on enhancing the capacity of an existing server.
Conversely, if an application supported multiple instances, additional servers could be added to handle increased loads without any downtime. This method, known as horizontal scaling, distributes the workload by creating more instances of the application.
Key Points:
- Vertical Scaling: Increases resources (CPU, memory) of an existing server.
- Horizontal Scaling: Increases server count by adding more instances.
Scaling in Kubernetes
Kubernetes is designed for hosting containerized applications and can scale them to match current demand. It supports two main types of scaling:
- Workload Scaling: Adjusting the number of containers or Pods running in the cluster.
- Cluster (Infrastructure) Scaling: Adding or removing nodes (servers) from the cluster.
When scaling in a Kubernetes cluster, consider the following:
Cluster Infrastructure Scaling:
- Horizontal Scaling: Add more nodes.
- Vertical Scaling: Enhance the resources (CPU, memory) of existing nodes.
Workload Scaling:
- Horizontal Scaling: Create additional Pods.
- Vertical Scaling: Modify the resource limits and requests for existing Pods.
Approaches to Scaling in Kubernetes
Kubernetes supports both manual and automated scaling methods.
Manual Scaling
Manual scaling requires intervention from the user:
Infrastructure Scaling (Horizontal): Provision new nodes and join them to the cluster (on kubeadm-based clusters) using:
kubeadm join ...
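A complete join command on a kubeadm cluster looks like the following; the endpoint, token, and certificate hash are placeholders that kubeadm init prints on the control plane node:
kubeadm join <control-plane-endpoint>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>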
Workload Scaling (Horizontal): Adjust the number of Pods manually with:
kubectl scale ...
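For example, to scale a Deployment to five replicas (my-app is a placeholder name):
kubectl scale deployment my-app --replicas=5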
Pod Resource Adjustment (Vertical): Edit the Deployment, StatefulSet, or ReplicaSet to modify the resource requests and limits in its Pod template:
kubectl edit ...
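For example, you can open the object in your editor, or make the change imperatively with kubectl set resources (my-app and the resource values shown are placeholders):
kubectl edit deployment my-app
kubectl set resources deployment my-app --requests=cpu=250m,memory=256Mi --limits=cpu=500m,memory=512Mi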
Automated Scaling
Automation in Kubernetes simplifies scaling and ensures efficient resource management:
- Kubernetes Cluster Autoscaler: Automatically adds or removes nodes in the cluster as demand changes.
- Horizontal Pod Autoscaler (HPA): Monitors metrics and dynamically adjusts the number of Pods (see the example after this list).
- Vertical Pod Autoscaler (VPA): Automatically changes resource allocations for running Pods based on observed usage.
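As a preview of what's ahead, the imperative kubectl autoscale command creates a HorizontalPodAutoscaler for an existing Deployment. This is a minimal sketch: my-app is a placeholder name, and it assumes the Metrics Server is installed so the HPA can read CPU usage:
kubectl autoscale deployment my-app --cpu-percent=50 --min=2 --max=10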
Key Benefit
Automated scaling mechanisms in Kubernetes allow your applications and infrastructure to adapt quickly to changing loads, reducing manual effort and ensuring optimal performance.
This lesson provided an overview of the fundamental concepts of autoscaling in Kubernetes. In the upcoming sections, we will explore each autoscaling component in greater detail.
I'll see you in the next part of this lesson.