Kubernetes Troubleshooting for Application Developers
Troubleshooting Scenarios
Pending Pods
In this lesson, we explore a common Kubernetes issue—pods remaining in the pending state. This guide covers three scenarios that cause pods to get stuck during scheduling and provides actionable solutions.
When a pod is in the pending state, Kubernetes has accepted the request to run it, but no available node meets the pod's scheduling requirements. Let's start by checking the pods in our cluster: in the staging namespace, three pods (the data processor, ML API, and web app) are currently pending.
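One quick way to list just the pending pods with kubectl is to filter on the pod phase (the staging namespace is the one used throughout this lesson):

kubectl get pods -n staging --field-selector=status.phase=Pending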
Below, we examine three examples that highlight different causes for pending pods.
Example 1: Data Processor Pod – Insufficient CPU
The first scenario involves the data processor pod. Running the describe command reveals that our cluster has two nodes. The scheduler indicates failures due to insufficient CPU on one node and an untolerated taint on the control plane node. The output is as follows:
Describe(staging/data-processor-64596d6fbfb-b7csp)
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-htgcx (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-htgcx:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations:
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
workload=machine-learning:NoSchedule
Events:
Type Reason Age From Message
------ ----- --- ---- -------
Warning FailedScheduling 27s (x2 over 5m49s) default-scheduler 0/2 nodes are available: 1 Insufficient cpu, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
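The same scheduling events can also be pulled directly with kubectl by filtering on the pod name shown above:

kubectl get events -n staging --field-selector involvedObject.name=data-processor-64596d6fbfb-b7csp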
The event logs indicate that the first node flagged "Insufficient cpu", so the next step is to inspect the pod's CPU requests.
The pod requires two CPUs. However, the node (ignoring the control plane) has a total of two CPUs and is already running three pods, leaving no capacity to satisfy the new request:
Describe(staging/data-processor-64596d6fbfb-b7csp)
Port: <none>
Host Port: <none>
Limits:
cpu: 2
Requests:
cpu: 2
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-htgcx (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-htgcx:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
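To see how much CPU the worker node can actually offer, compare its allocatable CPU with what existing pods have already requested. A quick check, assuming the worker node is node01 (the same node labeled later in this lesson):

kubectl describe node node01

The Allocatable section and the "Allocated resources" table at the end of the output show the node's CPU capacity and how much of it is already requested.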
Solution Tip
To resolve the issue, you can reduce the CPU request (for example, from "2" to "1") so that the pod fits into the available node capacity.
The pod specification defines both the CPU request and the CPU limit, as shown in this snippet from the data processor deployment:
metadata:
  generation: 1
  name: data-processor
  namespace: staging
  resourceVersion: "1550"
  uid: 5456ffa2-213b-4bb1-a02b-7cab5327388
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: data-processor
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
      - image: vish/stress
        imagePullPolicy: Always
        name: data-processor
        resources:
          limits:
            cpu: "2"
          requests:
            cpu: "2"
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
The Kubernetes scheduler only considers the CPU requests, not the limits. If a node has sufficient free resources based on requests, the pod is scheduled; otherwise, it remains pending. After updating the CPU request, the pod transitions from pending to running. If your workload truly demands the original allocation, consider scaling your cluster by adding a new node.
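One way to apply the reduced request without editing the manifest by hand is kubectl set resources; a minimal example, assuming you want to drop both the request and the limit from two CPUs to one:

kubectl -n staging set resources deployment/data-processor --requests=cpu=1 --limits=cpu=1

The Deployment then rolls out a new pod with the smaller request, which the scheduler can place on the worker node.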
Example 2: ML API Pod – Node Selector Mismatch
The second example focuses on the ML API pod. The scheduler logs indicate that one node did not meet the pod's node affinity and that the control plane node carries an untolerated taint:
Describe(staging/ml-api-6b9bb6c9f4-n2rbm)
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4wb5b (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-4wb5b:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
type: gpu
Tolerations:
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
workload=machine-learning:NoSchedule
Events:
Type Reason Age From Message
---- ------ --- ---- -------
Warning FailedScheduling 3m55s (x2 over 9m17s) default-scheduler 0/2 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
The pod specifies a node selector with the label "type=gpu", but none of the nodes have this label. To confirm, running:
kubectl describe pod ml-api-6b9bb6c9f4-n2rbm -n staging
shows that the pod requires a node with "type: gpu". The solution is to label the appropriate node:
kubectl label nodes node01 type=gpu
After adding the label, the ML API pod moves to the ContainerCreating state, indicating that it has been successfully scheduled.
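You can confirm the label and watch the pod's progress with standard kubectl queries, for example:

kubectl get nodes -l type=gpu
kubectl get pods -n staging -w

The first command lists only nodes carrying the type=gpu label; the second watches the ML API pod progress once it has been scheduled.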
Example 3: Web App Pod – Untolerated Taint
The final scenario involves the web app pod, which remains pending due to untolerated taints related to the control plane and a taint from "workload: machine-learning". The pod description highlights this condition:
Name: web-app-564cb8d898-d521w
Namespace: staging
Priority: 0
Service Account: default
Node: <none>
Labels: app=webapp-color
pod-template-hash=564cb8d898
Annotations: <none>
Status: Pending
IP: <none>
IPs: <none>
Controlled By: ReplicaSet/web-app-564cb8d898
Containers:
webapp-color:
Image: kodekloud/webapp-color
Port: 8080/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s6w5r (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-s6w5r:
A closer look at the pod details confirms the absence of a toleration for the key "workload" with value "machine-learning":
Describe(staging/web-app-564cb8d898-d521w)
Host Port: 0/TCP
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s6w5r (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-s6w5r:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations:
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ --- ---- -------
Warning FailedScheduling 265s (x3 over 10m) default-scheduler 0/2 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had untolerated taint {workload: machine-learning}. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
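To confirm which node carries this taint, inspect the node directly; for example, assuming the worker node is node01:

kubectl describe node node01 | grep -i taints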
A taint prevents pods from being scheduled on a node unless they have an appropriate toleration. To fix this issue, modify the web app deployment to include the toleration for the taint "workload: machine-learning". Below is the deployment YAML snippet before the change:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: staging
  annotations: {}
  resourceVersion: "4365"
  uid: 662b25b6-6953-4ab0-8b74-eaa4c00bc150
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: webapp-color
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: webapp-color
    spec:
      containers:
      - name: webapp-color
        image: kodekloud/webapp-color
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
Include the following toleration in the pod specification to resolve the issue:
spec:
  tolerations:
  - key: workload
    operator: Equal
    value: machine-learning
  containers:
  - name: webapp-color
    image: kodekloud/webapp-color
    imagePullPolicy: Always
    ports:
    - containerPort: 8080
      protocol: TCP
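One way to make this change on the live Deployment is to edit it in place (or update your manifest file and re-apply it with kubectl apply -f):

kubectl -n staging edit deployment web-app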
After applying these changes, the updated deployment allows the pod to tolerate the taint, and it is scheduled successfully. A sample of the updated deployment is shown below:
metadata:
  creationTimestamp: "2024-06-07T23:08:07Z"
  generation: 18
  name: web-app
  namespace: staging
  resourceVersion: "4365"
  uid: 662b25b6-6953-4ab0-8b74-eaa4c00bc150
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: webapp-color
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: webapp-color
    spec:
      tolerations:
      - key: workload
        operator: Equal
        value: machine-learning
      containers:
      - name: webapp-color
        image: kodekloud/webapp-color
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
Summary
In this lesson, we reviewed three common scenarios that cause pods to remain in the pending state:
- The data processor pod was pending due to insufficient CPU availability. The issue was resolved by reducing the CPU request (or by scaling the cluster).
- The ML API pod was pending because it required a node that matched a specific node selector ("type: gpu"). Labeling the node correctly allowed it to schedule.
- The web app pod was pending due to an untolerated taint. Adding an appropriate toleration in the deployment enabled the pod to be scheduled.
By understanding these challenges and solutions, you can ensure your pods are scheduled successfully in Kubernetes. Happy troubleshooting!