VPA Memory Lab

Welcome to the VPA Memory Lab! In this tutorial, you’ll deploy a sample Flask application on Kubernetes, monitor its memory usage, configure a Vertical Pod Autoscaler (VPA), generate load, and review VPA memory recommendations. By the end, you’ll understand how VPA derives memory request recommendations from real-world demand.

Figure: The four steps of the VPA Memory Lab: deploy the sample application, monitor resource usage, apply the VPA configuration, and run a load test.

Step 1: Deploy the Sample Flask Application

Apply the deployment and service manifest:

kubectl apply -f vpa-testing.yml

Expected output:

deployment.apps/flask-app created
service/flask-app-service created

Confirm the pods are running:

kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
flask-app-b85fc57d4-kmsdl   1/1     Running   0          2m
flask-app-b85fc57d4-zn77q   1/1     Running   0          2m
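
Instead of re-running kubectl get pods until both replicas show Running, the check can be scripted. The helper below is a minimal sketch: the function name wait_for_flask_app and the 120s default timeout are illustrative choices, not part of the lab files.

```shell
# Block until every replica of the flask-app Deployment is available,
# or fail once the timeout expires (the 120s default is an arbitrary choice).
wait_for_flask_app() {
  kubectl rollout status deployment/flask-app --timeout="${1:-120s}"
}
```

Call it as wait_for_flask_app, or pass a different timeout such as wait_for_flask_app 30s.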

Step 2: Check Current Resource Usage

Before load testing, inspect CPU and memory usage:

kubectl top pod
NAME                        CPU(cores)   MEMORY(bytes)
flask-app-b85fc57d4-g2n7n   1m           19Mi
flask-app-b85fc57d4-mq6kb   1m           19Mi

Metric         Description                          Command
CPU Usage      Current CPU consumption per pod      kubectl top pod
Memory Usage   Current memory consumption per pod   kubectl top pod
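
kubectl top reads from the metrics API, so it only works when Metrics Server (or an equivalent provider) is running in the cluster. As a rough sketch of how to capture a baseline before the load test, the hypothetical helper below sums the memory column. It assumes every value is reported in Mi, as in the sample output above.

```shell
# Print per-pod memory and the total across all pods in the current namespace.
# Assumes kubectl reports memory with an Mi suffix, as in the lab output.
pod_memory_total() {
  kubectl top pod --no-headers | awk '
    { sub(/Mi$/, "", $3); total += $3; printf "%s %dMi\n", $1, $3 }
    END { printf "total %dMi\n", total }'
}
```

Running pod_memory_total before and after the load test gives two totals that are easy to compare.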

Step 3: Apply the VPA Configuration

Create a VPA manifest (vpa-memory.yaml) to manage memory requests for your Flask deployment:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: flask-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-app
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources: ["memory"]
        minAllowed:
          memory: 150Mi
        maxAllowed:
          memory: 1000Mi

Warning

With updateMode: "Off", VPA only computes recommendations; it never evicts or updates pods. Keep the quotes: unquoted, YAML parses Off as the boolean false.
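
The manifest can only be applied if the Vertical Pod Autoscaler is installed in the cluster. A quick way to confirm this is to look for its CustomResourceDefinition. The function name vpa_crd_present is a hypothetical helper for this sketch.

```shell
# Succeed only if the VerticalPodAutoscaler CRD is registered in the cluster.
vpa_crd_present() {
  kubectl get crd verticalpodautoscalers.autoscaling.k8s.io >/dev/null 2>&1
}
```

For example: vpa_crd_present || echo "VPA CRD not found, install VPA first".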

Apply the VPA manifest:

kubectl apply -f vpa-memory.yaml
verticalpodautoscaler.autoscaling.k8s.io/flask-app created

Initial VPA Recommendation

Check the initial memory recommendation:

kubectl get vpa flask-app -o yaml
status:
  conditions:
    - type: RecommendationProvided
      status: "True"
      lastTransitionTime: "2025-01-15T08:10:03Z"
  recommendation:
    containerRecommendations:
      - containerName: flask-app
        lowerBound:
          memory: 262144k
        target:
          memory: 262144k
        uncappedTarget:
          memory: 262144k
        upperBound:
          memory: 1000Mi
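
Field            Meaning
lowerBound       Minimum recommended request; running below it is likely to hurt performance or availability
target           Recommended request, kept within the policy bounds (minAllowed and maxAllowed)
uncappedTarget   Recommendation before minAllowed and maxAllowed are applied
upperBound       Maximum recommended request, capped by maxAllowed; requests above it are likely wasted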
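
VPA reports quantities such as 262144k, where k means 1000 bytes, while the policy uses Mi (1024 × 1024 bytes), so the two are awkward to compare by eye. The sketch below reads the target for the first container with a JSONPath query and converts it to MiB. The helper names to_mib and vpa_target_memory are illustrative, and the converter only handles the suffixes listed in its comment.

```shell
# Convert a Kubernetes memory quantity to MiB.
# Handles plain bytes and the suffixes k, M, G, Ki, Mi, Gi; any other
# suffix is treated as factor 0 and prints 0.0.
to_mib() {
  echo "$1" | awk '
    BEGIN { m["k"]=1000; m["M"]=1000^2; m["G"]=1000^3
            m["Ki"]=1024; m["Mi"]=1024^2; m["Gi"]=1024^3 }
    {
      num = $0; suffix = $0
      sub(/[A-Za-z]+$/, "", num)      # keep the numeric part
      sub(/^[0-9.]+/, "", suffix)     # keep the unit suffix, possibly empty
      factor = (suffix == "") ? 1 : m[suffix]
      printf "%.1f\n", num * factor / (1024 * 1024)
    }'
}

# Read the target memory of the first container from a VPA object.
vpa_target_memory() {
  kubectl get vpa "${1:-flask-app}" \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target.memory}'
}
```

For the recommendation above, to_mib "$(vpa_target_memory)" prints 250.0, which shows that the target of 262144k is exactly 250 MiB.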

Step 4: Run a Load Test and Validate Recommendations

Generate load against your Flask application:

sh load.sh
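
The recommender updates the VPA status periodically, so the target may change while load.sh is still running. To observe this from a second terminal, a simple polling loop is enough. This is a sketch: poll_vpa_target and its defaults (10 samples, 30 seconds apart) are assumptions for illustration.

```shell
# Poll the VPA target memory a fixed number of times.
# $1 = number of samples (default 10), $2 = seconds between samples (default 30).
poll_vpa_target() {
  samples="${1:-10}"
  interval="${2:-30}"
  i=1
  while [ "$i" -le "$samples" ]; do
    target=$(kubectl get vpa flask-app \
      -o jsonpath='{.status.recommendation.containerRecommendations[0].target.memory}')
    echo "sample $i: target memory = ${target:-<none yet>}"
    i=$((i + 1))
    # Sleep only between samples, not after the last one.
    if [ "$i" -le "$samples" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, poll_vpa_target 20 15 takes 20 samples at 15-second intervals.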

Once the load test completes, retrieve the updated VPA recommendation:

kubectl get vpa flask-app -o yaml
status:
  recommendation:
    containerRecommendations:
      - containerName: flask-app
        lowerBound:
          memory: 262144k
        target:
          memory: 511772k
        uncappedTarget:
          memory: 511772k
        upperBound:
          memory: 1000Mi

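Notice that target rose from 262144k (250 MiB) to 511772k (about 488 MiB) to meet the higher memory demand under load. If uncappedTarget exceeds the maxAllowed value in your policy (1000Mi here), target is capped at maxAllowed, which signals that the application wants more memory than the policy permits.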
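
To check for capping from the command line, a small helper can compare target with uncappedTarget. The sketch below assumes a recommendation already exists and compares the two values as strings, so it is a heuristic rather than a true quantity comparison. The name vpa_capping_status is hypothetical.

```shell
# Report whether the policy bounds changed the recommendation for the
# first container. String comparison only, so treat it as a heuristic.
vpa_capping_status() {
  base='.status.recommendation.containerRecommendations[0]'
  pair=$(kubectl get vpa "${1:-flask-app}" \
    -o jsonpath="{${base}.target.memory} {${base}.uncappedTarget.memory}")
  target=${pair% *}      # text before the space
  uncapped=${pair#* }    # text after the space
  if [ "$target" = "$uncapped" ]; then
    echo "uncapped (target=$target)"
  else
    echo "capped (target=$target, uncappedTarget=$uncapped)"
  fi
}
```

With the values from this lab, both fields are equal, so the helper reports the recommendation as uncapped.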

Next Steps

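After validating the recommendations, you can switch updateMode to Auto or Recreate to let VPA apply them automatically. VPA applies new requests by evicting pods and letting the Deployment recreate them, so expect pod restarts.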
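
One way to change the mode without editing vpa-memory.yaml is a merge patch against the live object. This is a sketch; the helper name set_vpa_update_mode is an assumption, and the change is lost if you later re-apply the original manifest.

```shell
# Set the update mode of a VPA object with a JSON merge patch.
# $1 = mode (for example Off, Initial, Recreate, Auto), $2 = VPA name.
set_vpa_update_mode() {
  kubectl patch vpa "${2:-flask-app}" --type merge \
    -p "{\"spec\":{\"updatePolicy\":{\"updateMode\":\"$1\"}}}"
}
```

For example, set_vpa_update_mode Auto enables automatic updates, and set_vpa_update_mode Off returns to recommendation-only mode.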
