In this guide, you’ll learn how to configure Kubernetes Horizontal Pod Autoscaler (HPA) scaling policies. HPA scaling policies define rules that tell the control plane how to adjust pod replicas based on metrics such as CPU, memory, or custom/external metrics. Think of these rules as the “brain” that scales your application in sync with demand, adding pods when needed and removing them when load subsides.
Defining Scaling Policies
Under the HPA manifest’s `behavior` section, the `policies` field (set separately under `scaleUp` and `scaleDown`) specifies how quickly the replica count may change. Each policy entry includes:
- `type`: `Pods` for a fixed number of pods or `Percent` for a percentage of current replicas.
- `value`: A numeric value corresponding to the type.
- `periodSeconds`: The time window, in seconds, over which the `value` limit applies.
| Type | Value | periodSeconds | Effect |
|---|---|---|---|
| Pods | 4 | 60 | Adds up to 4 pods every 60 seconds |
| Percent | 10 | 60 | Increases replicas by up to 10% every 60 seconds |
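Each policy’s limit applies per window: once the limit is reached, further scaling under that policy waits until `periodSeconds` has elapsed. Combining fixed and percentage-based policies provides precise control over autoscaling behavior.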
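As a minimal sketch, the two example rows in the table above could be written as `scaleUp` policies like this. The fragment is assumed to sit under `spec` in an `autoscaling/v2` HorizontalPodAutoscaler. `selectPolicy: Max` is the Kubernetes default and is spelled out here only to show how the two policies combine.

```yaml
# Fragment only: place under `spec:` of an autoscaling/v2 HorizontalPodAutoscaler.
behavior:
  scaleUp:
    policies:
      # Allow at most 4 additional pods in any 60-second window.
      - type: Pods
        value: 4
        periodSeconds: 60
      # Allow growth of at most 10% of current replicas in any 60-second window.
      - type: Percent
        value: 10
        periodSeconds: 60
    # Max (the default) applies whichever policy permits the larger change.
    selectPolicy: Max
```

With `Max`, a workload at 100 replicas could grow by up to 10 pods per minute (10% exceeds 4 pods), while a workload at 20 replicas could grow by up to 4.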
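By default, the HPA controller re-evaluates metrics every 15 seconds (the kube-controller-manager `--horizontal-pod-autoscaler-sync-period` flag). The scraping interval itself is configured in your metrics-server or external metrics adapter.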
Stabilization Window
To prevent rapid fluctuations (thrashing) in pod counts, HPA uses a stabilization window. This buffer retains recent scaling recommendations and ensures you scale up quickly while scaling down more conservatively.

| Window Type | Default Interval | Purpose |
|---|---|---|
| scale-up | 0 seconds | Immediate scaling up based on current metrics |
| scale-down | 300 seconds (5 min) | Delays scale-down by using the highest recommendation in the last 5 min |
Customizing the Scale-Down Window
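Adjust your HPA manifest to modify the scale-down stabilization window by setting `behavior.scaleDown.stabilizationWindowSeconds`. For example, with a 5-minute (300-second) window:
- At minute 1, metrics recommend scaling up to 40 pods.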
- At minute 2, metrics suggest scaling down to 30 pods, but the window holds at 40 pods.
- From minutes 3–5, even if load decreases further, HPA maintains 40 pods until the buffer expires.
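The walk-through above corresponds to a manifest along the following lines. This is a sketch under stated assumptions: the names `web-hpa` and `web`, the replica bounds, and the 70% CPU target are hypothetical placeholders. Only `stabilizationWindowSeconds` relates directly to the behavior described.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical target Deployment
  minReplicas: 2           # assumed bounds; maxReplicas must allow the 40 pods in the example
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target
  behavior:
    scaleDown:
      # Use the highest recommendation from the last 300 seconds before scaling down.
      stabilizationWindowSeconds: 300
```

Lowering the value makes scale-down react faster at the cost of more churn. Kubernetes accepts values from 0 to 3600 seconds.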
