Welcome to the VPA Memory Lab! In this tutorial, you’ll deploy a sample Flask application on Kubernetes, monitor its memory usage, configure a Vertical Pod Autoscaler (VPA), generate load, and review VPA memory recommendations. By the end, you’ll understand how VPA adjusts resource requests to match real-world demand.

Step 1: Deploy the Sample Flask Application
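As a rough sketch of what a deployment and service manifest for this step could look like, the snippet below writes one to a file. The file name `flask-app.yaml`, the `flask-app` names and labels, the image reference, and the resource values are all placeholders chosen for illustration, not the lab's actual manifest.

```shell
# Write a hypothetical Deployment + Service manifest for a Flask app.
# Every name, the image, and the resource values are illustrative assumptions.
cat > flask-app.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask-app
  template:
    metadata:
      labels:
        app: flask-app
    spec:
      containers:
        - name: flask-app
          image: registry.example.com/flask-app:latest  # placeholder image
          ports:
            - containerPort: 5000   # Flask's default development port
          resources:
            requests:
              memory: "64Mi"        # starting point that VPA will later evaluate
              cpu: "100m"
            limits:
              memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: flask-app
spec:
  selector:
    app: flask-app
  ports:
    - port: 80
      targetPort: 5000
EOF
```

Setting an explicit memory request gives the later VPA recommendation something concrete to be compared against.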
Apply the deployment and service manifest with kubectl apply -f <manifest-file>.

Step 2: Check Current Resource Usage
Before load testing, inspect CPU and memory usage:

| Metric | Description | Command |
|---|---|---|
| CPU Usage | Current CPU consumption per pod | kubectl top pod |
| Memory Usage | Current memory consumption per pod | kubectl top pod |
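The per-pod figures from the table can be totalled with a small helper. This is only a sketch: it assumes metrics-server is installed (kubectl top depends on it), that the output has the usual three columns, and that memory values end in `Mi`. The `sum_memory_mi` name is made up for this example.

```shell
# Sum the MEMORY column of `kubectl top pod --no-headers` output read from stdin.
# Assumes three columns (NAME CPU MEMORY) and memory values that end in "Mi".
sum_memory_mi() {
  awk '{ sub(/Mi$/, "", $3); total += $3 } END { printf "%dMi\n", total }'
}
```

With a hypothetical `app=flask-app` label on the pods, it could be used as `kubectl top pod -l app=flask-app --no-headers | sum_memory_mi`. Recording this baseline before the load test makes the later comparison with the VPA target easier.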
Step 3: Apply the VPA Configuration
Create a VPA manifest (vpa-memory.yaml) to manage memory requests for your Flask deployment:
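A minimal sketch of such a manifest follows, written to vpa-memory.yaml. It assumes the deployment is named `flask-app`; the VPA name `flask-app-vpa` and the `minAllowed`/`maxAllowed` values are illustrative choices, not values from the lab.

```shell
# Write a hypothetical VPA manifest that only produces memory recommendations.
# The target name, VPA name, and bounds are assumptions for illustration.
cat > vpa-memory.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: flask-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-app
  updatePolicy:
    updateMode: "Off"          # quoted so YAML does not read it as a boolean
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]
        minAllowed:
          memory: 64Mi
        maxAllowed:
          memory: 512Mi
EOF
```

The file can then be applied with `kubectl apply -f vpa-memory.yaml`, provided the VPA custom resource definitions and components are installed in the cluster.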
The updateMode: Off setting prevents VPA from automatically updating pods. You’ll receive recommendations only.

Initial VPA Recommendation
Check the initial memory recommendation:

| Field | Meaning |
|---|---|
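| lowerBound | Lowest request the recommender considers sufficient; lower values risk under-provisioning |
| target | Recommended request after the policy's minAllowed and maxAllowed bounds are applied |
| uncappedTarget | Recommendation before minAllowed and maxAllowed are applied |
| upperBound | Highest request the recommender considers useful; higher values are likely wasted |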
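These fields live under `.status.recommendation.containerRecommendations` on the VPA object. Memory values there may appear as plain bytes or with a suffix such as `k`, which makes them awkward to compare with `kubectl top` output. The helper below is a sketch that converts a quantity to MiB; the `to_mib` name is hypothetical, and only the suffixes listed in the code are handled.

```shell
# Convert a Kubernetes memory quantity (e.g. 104857600, 262144k, 128Mi) to MiB.
# Only the suffixes listed below are supported; anything else is rejected.
to_mib() {
  awk -v q="$1" 'BEGIN {
    n = q; sub(/[A-Za-z]+$/, "", n)   # numeric part
    u = q; sub(/^[0-9.]+/, "", u)     # unit suffix, empty for plain bytes
    m[""] = 1
    m["k"] = 1000;  m["M"] = 1000^2;  m["G"] = 1000^3
    m["Ki"] = 1024; m["Mi"] = 1024^2; m["Gi"] = 1024^3
    if (!(u in m)) { print "unsupported unit: " u > "/dev/stderr"; exit 1 }
    printf "%.1f\n", n * m[u] / 1024^2
  }'
}
```

Assuming a VPA named `flask-app-vpa`, the target could be read and converted with `to_mib "$(kubectl get vpa flask-app-vpa -o jsonpath='{.status.recommendation.containerRecommendations[0].target.memory}')"`. Running `kubectl describe vpa flask-app-vpa` shows all four fields at once.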
Step 4: Run a Load Test and Validate Recommendations
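This step needs sustained traffic against the service. One simple way to produce it is a curl loop, sketched below. The `generate_load` name, the default of 200 requests, and the URL in the usage note are assumptions; a dedicated load-testing tool would give more control.

```shell
# Send a fixed number of sequential GET requests and report how many succeeded.
# Usage: generate_load URL [COUNT]   (COUNT defaults to 200)
generate_load() {
  url=$1
  total=${2:-200}
  ok=0
  i=0
  while [ "$i" -lt "$total" ]; do
    # -s silences progress output; --max-time keeps a hung request from stalling the loop
    if curl -s -o /dev/null --max-time 5 "$url"; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "sent=$total ok=$ok"
}
```

For example, after forwarding a local port to a hypothetical `flask-app` service with `kubectl port-forward svc/flask-app 8080:80`, running `generate_load http://localhost:8080/ 500` sends 500 requests. The requests are sequential, so several copies may need to run in parallel to move memory usage noticeably.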
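Generate load against your Flask application, then check the VPA recommendation again. If the new recommendation is capped by the policy (uncappedTarget is higher than target), you can:
- Adjust maxAllowed in the VPA policy
- Consider a Horizontal Pod Autoscaler for scaling out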
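After validating recommendations, you can switch updateMode to Auto or Recreate to let VPA apply changes automatically. In these modes VPA typically applies a new request by evicting pods so that they restart with the updated values, so expect brief restarts.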
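Switching the mode does not require editing the manifest by hand. The sketch below builds a JSON merge patch for the `updateMode` field. The `vpa_mode_patch` helper is hypothetical, and it accepts only the four long-standing modes (Off, Initial, Recreate, Auto); newer VPA releases may define others.

```shell
# Print a JSON merge patch that sets spec.updatePolicy.updateMode.
# Rejects values other than the four modes listed here.
vpa_mode_patch() {
  case "$1" in
    Off|Initial|Recreate|Auto)
      printf '{"spec":{"updatePolicy":{"updateMode":"%s"}}}\n' "$1"
      ;;
    *)
      echo "unknown updateMode: $1" >&2
      return 1
      ;;
  esac
}
```

Assuming a VPA named `flask-app-vpa`, the patch could be applied with `kubectl patch vpa flask-app-vpa --type merge -p "$(vpa_mode_patch Auto)"`. Validating the value first avoids sending a typo to the API server.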