Resource Requirements for Pods
A fundamental aspect of Kubernetes is the proper allocation of resources to pods. In a pod specification, you define two critical parameters for each container:
- Resource Requests: Guarantee that the container will receive a specific amount of resources. For instance, if a container requests 64 megabytes of memory and 250 millicores of CPU (250m, a quarter of one core), Kubernetes schedules it only on nodes that can provide these resources.
- Resource Limits: Ensure that a container does not use more resources than allowed. In our example, the container has a limit of 128 megabytes of memory and 500 millicores of CPU (500m), preventing it from exceeding this threshold.
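As a minimal sketch, the requests and limits from this example could be written in a pod manifest as follows. The pod name, container name, and image are placeholders rather than values from the article, and the memory figures assume the binary `Mi` suffix (use `M` for decimal megabytes).

```yaml
# Hypothetical pod; name, container name, and image are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:          # used by the scheduler to pick a node
          memory: "64Mi"
          cpu: "250m"
        limits:            # upper bound enforced at runtime
          memory: "128Mi"
          cpu: "500m"
```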
Introduction to the Vertical Pod Autoscaler
The Vertical Pod Autoscaler (VPA) is designed to adjust CPU and memory resources for pods based on real-time workload demands. It continuously monitors resource utilization and recommends or enacts changes to maintain optimal performance. Imagine a scenario where the VPA’s recommender component notices that a pod is frequently pushing its resource limits, such as using 190Mi out of 192Mi of memory or 700m out of 750m CPU. In such cases, the recommender might suggest increasing both the memory and CPU allocations to ensure stability and performance.
VPA Components Explained
- Recommender: The recommender monitors resource usage and provides suggestions to adjust requests and limits based on observed behavior.
- Updater: The updater compares the current pod configuration with the recommended settings. If differences are detected, it initiates actions (such as evicting the pod) to apply the new resource allocations. Note that Kubernetes traditionally does not allow the direct modification of running pods’ resource requests. In-place resizing of pod resources, which first appeared as an alpha feature in Kubernetes 1.27, can reduce or eliminate the need for evictions on clusters where it is available.
- Admission Controller: This component intercepts new pod creation requests. When a pod is restarted or created (for example, after an eviction), the admission controller updates the pod’s resource configuration based on the recommender’s suggestions.
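To make the recommender's role more concrete, the sketch below shows roughly what a recommendation can look like in the `status` of a VPA object. The field names follow the `autoscaling.k8s.io/v1` API; the container name and every quantity are invented for illustration and do not come from a real cluster.

```yaml
# Illustrative status fragment; all values are hypothetical.
status:
  recommendation:
    containerRecommendations:
      - containerName: app
        lowerBound:        # minimum the recommender considers adequate
          cpu: 750m
          memory: 200Mi
        target:            # value that would be applied as the request
          cpu: "1"
          memory: 256Mi
        upperBound:        # requests above this are considered wasteful
          cpu: "2"
          memory: 512Mi
```

The updater and the admission controller both act on the `target` values: the former decides whether a running pod deviates enough to be replaced, and the latter writes the values into newly created pods.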
Update Policy Modes
VPA supports different update policy modes to control how resource recommendations are applied:
- Off: The VPA provides recommendations but does not automatically update the pod’s resource requests. Manual intervention is required to apply any suggested changes.
- Initial: Automatically sets the recommended resource requests at pod creation but does not modify running pods.
- Auto: Automatically applies recommended resource adjustments both during pod creation and throughout the pod’s lifecycle. On clusters without in-place resource updates, this mode requires pod evictions to apply changes. Although more aggressive, the Auto mode keeps pods running with a configuration that tracks their observed usage patterns.
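The mode is selected through the `updatePolicy.updateMode` field of the VPA object. The fragment below is a sketch showing only that field, with the three modes described above listed as comments; the chosen value is arbitrary.

```yaml
# Fragment of a VerticalPodAutoscaler spec; only the update policy is shown.
spec:
  updatePolicy:
    # "Off":     recommendations only, nothing is applied
    # "Initial": applied when a pod is created, never afterwards
    # "Auto":    applied at creation and during the pod's lifetime
    updateMode: "Initial"
```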
Deploying the VPA Components
The VPA is not enabled by default in Kubernetes clusters. To deploy the VPA components, follow these steps:
- Clone the Kubernetes autoscaler repository (`https://github.com/kubernetes/autoscaler`).
- Navigate to the `vertical-pod-autoscaler` directory.
- Execute the provided hack script, `./hack/vpa-up.sh`, to deploy the necessary components.
Configuring a Vertical Pod Autoscaler Object
After deploying the VPA components, the next step is to create a VPA object that references your target pod and defines its resource parameters. In this example, a VPA is configured for a pod named “simple-webapp-color” with the update mode set to “Off”, meaning the VPA will only provide recommendations without automatically modifying the pod. First, here is the pod specification for the web application:
For information on working with Kubernetes resource allocations and autoscaling, refer to Kubernetes Basics.
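A VPA object of the kind described in this section might look like the sketch below. Several details are assumptions: the VPA name is made up, the web application is assumed to be managed by a Deployment called `simple-webapp-color` (a VPA's `targetRef` points at a workload controller rather than a bare pod), and the `minAllowed` and `maxAllowed` bounds are placeholder values.

```yaml
# Sketch of a recommendation-only VPA; names and bounds are assumptions.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: simple-webapp-color-vpa      # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment                 # assumed controller type
    name: simple-webapp-color
  updatePolicy:
    updateMode: "Off"                # recommend only, never modify pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"           # apply to every container in the pod
        controlledResources: ["cpu", "memory"]
        minAllowed:                  # placeholder lower bounds
          cpu: 100m
          memory: 64Mi
        maxAllowed:                  # placeholder upper bounds
          cpu: "1"
          memory: 512Mi
```

With the mode set to "Off", the recommendations can be read with `kubectl describe vpa simple-webapp-color-vpa` and applied by hand if they look reasonable.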