Welcome to the VPA Memory Lab! In this tutorial, you’ll deploy a sample Flask application on Kubernetes, monitor its memory usage, configure a Vertical Pod Autoscaler (VPA), generate load, and review VPA memory recommendations. By the end, you’ll understand how VPA adjusts resource requests to match real-world demand.
Figure: the four steps of the VPA Memory Lab (deploy a sample application, monitor resource usage, apply the VPA configuration, run a load test). Image copyrighted by KodeKloud.

Step 1: Deploy the Sample Flask Application

Apply the deployment and service manifest:
kubectl apply -f vpa-testing.yml
Expected output:
deployment.apps/flask-app created
service/flask-app-service created
Confirm the pods are running:
kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
flask-app-b85fc57d4-kmsdl   1/1     Running   0          2m
flask-app-b85fc57d4-zn77q   1/1     Running   0          2m

Step 2: Check Current Resource Usage

Before load testing, record baseline CPU and memory usage (kubectl top requires the Metrics Server to be running in the cluster):
kubectl top pod
NAME                        CPU(cores)   MEMORY(bytes)
flask-app-b85fc57d4-g2n7n   1m           19Mi
flask-app-b85fc57d4-mq6kb   1m           19Mi
| Metric | Description | Command |
| --- | --- | --- |
| CPU Usage | Current CPU consumption per pod | kubectl top pod |
| Memory Usage | Current memory consumption per pod | kubectl top pod |
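To reduce the baseline to a single number you can compare against later, the MEMORY column can be averaged with awk. This is a minimal sketch: the function name avg_mem_mi is hypothetical, and it assumes every value in the column ends in Mi, as in the output above.

```shell
# avg_mem_mi: read `kubectl top pod --no-headers` rows on stdin and print the
# average of the third column (MEMORY) in Mi.
# Assumption: every memory value carries the "Mi" suffix.
avg_mem_mi() {
  awk '{ sub(/Mi$/, "", $3); sum += $3; n++ }
       END { if (n > 0) printf "%.1f\n", sum / n }'
}

# Sample rows copied from the output above.
printf '%s\n' \
  'flask-app-b85fc57d4-g2n7n   1m   19Mi' \
  'flask-app-b85fc57d4-mq6kb   1m   19Mi' | avg_mem_mi   # prints 19.0

# On a live cluster you would instead run:
#   kubectl top pod --no-headers | avg_mem_mi
```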

Step 3: Apply the VPA Configuration

Create a VPA manifest (vpa-memory.yaml) to manage memory requests for your Flask deployment:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: flask-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-app
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["memory"]
        minAllowed:
          memory: 150Mi
        maxAllowed:
          memory: 1000Mi
The updateMode: "Off" setting tells VPA to compute recommendations without evicting or updating pods, so running pods keep their current requests.
Apply the VPA manifest:
kubectl apply -f vpa-memory.yaml
verticalpodautoscaler.autoscaling.k8s.io/flask-app created

Initial VPA Recommendation

Check the initial memory recommendation:
kubectl get vpa flask-app -o yaml
status:
  conditions:
    - type: RecommendationProvided
      status: "True"
      lastTransitionTime: "2025-01-15T08:10:03Z"
  recommendation:
    containerRecommendations:
      - containerName: flask-app
        lowerBound:
          memory: 262144k
        target:
          memory: 262144k
        uncappedTarget:
          memory: 262144k
        upperBound:
          memory: 1000Mi
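| Field | Meaning |
| --- | --- |
| lowerBound | Lowest request VPA considers sufficient; anything below it is likely under-provisioned |
| target | Recommended request, kept within the minAllowed/maxAllowed policy |
| uncappedTarget | Recommendation based on observed usage only, before minAllowed/maxAllowed are applied |
| upperBound | Highest request VPA considers useful; anything above it is likely wasted. Here it is capped at maxAllowed (1000Mi) |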
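The recommendation mixes decimal suffixes (k means 1000 bytes) with binary ones (Mi means 1024 × 1024 bytes), which makes values hard to compare at a glance. The sketch below normalizes a quantity to MiB. The helper name to_mib is hypothetical, and it handles only plain byte counts and the k, M, G, Ki, Mi, and Gi suffixes.

```shell
# to_mib: convert a Kubernetes memory quantity to MiB with one decimal place.
# Assumption: only plain bytes and the k, M, G, Ki, Mi, Gi suffixes are needed.
to_mib() {
  awk -v q="$1" 'BEGIN {
    n = q; sub(/[A-Za-z]+$/, "", n)   # numeric part, e.g. 262144
    s = q; sub(/^[0-9.]+/, "", s)     # suffix part, e.g. k
    m[""] = 1
    m["k"] = 1000;  m["M"] = 1000000;  m["G"] = 1000000000
    m["Ki"] = 1024; m["Mi"] = 1048576; m["Gi"] = 1073741824
    if (!(s in m)) exit 1             # unsupported suffix
    printf "%.1f\n", n * m[s] / 1048576
  }'
}

to_mib 262144k   # prints 250.0
to_mib 1000Mi    # prints 1000.0
```

Seen this way, the initial target of 262144k is exactly 250Mi, well below the 1000Mi ceiling set in the policy.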

Step 4: Run a Load Test and Validate Recommendations

Generate load against your Flask application:
sh load.sh
Once the load test completes, retrieve the updated VPA recommendation:
kubectl get vpa flask-app -o yaml
status:
  recommendation:
    containerRecommendations:
      - containerName: flask-app
        lowerBound:
          memory: 262144k
        target:
          memory: 511772k
        uncappedTarget:
          memory: 511772k
        upperBound:
          memory: 1000Mi
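A quick way to tell whether the policy bounds changed the recommendation is to compare target with uncappedTarget. The following is a rough sketch, not a complete check: check_cap is a hypothetical helper that does a plain string comparison, so it assumes both fields are printed in the same unit format.

```shell
# check_cap TARGET UNCAPPED: report whether minAllowed/maxAllowed altered the
# recommendation. Assumption: both values use the same unit formatting, so a
# string comparison is enough.
check_cap() {
  if [ "$1" = "$2" ]; then
    echo "uncapped"
  else
    echo "capped by policy"
  fi
}

# Illustrative values only (not taken from this lab's output):
check_cap 1000Mi 1200Mi   # prints: capped by policy

# On a live cluster the two fields could be read with jsonpath:
#   base='{.status.recommendation.containerRecommendations[0]'
#   check_cap \
#     "$(kubectl get vpa flask-app -o jsonpath="${base}.target.memory}")" \
#     "$(kubectl get vpa flask-app -o jsonpath="${base}.uncappedTarget.memory}")"
```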
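Notice that target has increased to match the higher memory demand under load. If uncappedTarget exceeds maxAllowed, target is capped at maxAllowed; in that case, consider raising maxAllowed in the resourcePolicy or reducing the application's memory footprint.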
After validating recommendations, you can switch updateMode to Auto or Recreate to let VPA apply changes automatically.
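One way to change the mode without editing the manifest is a JSON merge patch. The sketch below only builds the patch body; vpa_mode_patch is a hypothetical helper, and the list of accepted modes (Off, Initial, Recreate, Auto) is an assumption that may not cover modes added in newer VPA releases. Keep in mind that in Recreate and Auto modes VPA evicts pods to apply new requests, so expect restarts.

```shell
# vpa_mode_patch MODE: print a JSON merge patch that sets spec.updatePolicy.updateMode.
# Assumption: MODE is one of Off, Initial, Recreate, Auto.
vpa_mode_patch() {
  case "$1" in
    Off|Initial|Recreate|Auto) ;;
    *) echo "unknown updateMode: $1" >&2; return 1 ;;
  esac
  printf '{"spec":{"updatePolicy":{"updateMode":"%s"}}}\n' "$1"
}

vpa_mode_patch Auto   # prints {"spec":{"updatePolicy":{"updateMode":"Auto"}}}

# On a live cluster the patch could be applied with:
#   kubectl patch vpa flask-app --type merge -p "$(vpa_mode_patch Auto)"
```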