Kubernetes Networking Deep Dive

Troubleshooting Internal Networking

When Kubernetes networking breaks, identifying the root cause quickly is crucial. This guide walks through four common troubleshooting scenarios: CNI issues, network policies, DNS and service discovery, and service-to-endpoint-to-pod connectivity. Follow the structured steps below to restore cluster networking.

Networking in Kubernetes depends on four areas: CNIs, Network Policies, Service Discovery and DNS, and Services, Endpoints, and Pods. The table below summarizes where to focus in each scenario.

| Scenario | Focus | Key Commands |
| --- | --- | --- |
| CNI | Pod network agents & connectivity | kubectl get pods -n kube-system, cilium status |
| Network Policies | Ingress/egress filters | kubectl get networkpolicies, ping, nc, curl |
| Service Discovery & DNS | CoreDNS health & resolution | kubectl logs <coredns-pod>, nslookup, dig |
| Services & Endpoints | Service definitions & backends | kubectl describe svc, kubectl get endpoints |

1. Troubleshooting CNIs

Most CNI plugins run their agents as pods, typically as a DaemonSet in the kube-system namespace. Start by validating their status; a consolidated command sketch follows this list:

  1. Check CNI pod status

    • Run kubectl get pods -n kube-system and look for restarts or CrashLoopBackOff.
    • Inspect events: kubectl describe pod <cni-pod> -n kube-system.
    • Review logs: kubectl logs <cni-pod> -n kube-system.

  2. Verify node health

    • Confirm the kubelet and the container runtime (e.g., containerd or CRI-O) are healthy on each node, for example with systemctl status kubelet.
    • For Cilium users, cilium status reports agent and datapath health, including kernel support and BPF state.

  3. Use CNI-specific tools
    Many CNIs include CLIs and connectivity tests:

    • Cilium CLI: cilium status, cilium connectivity test
    • Hubble: Visualize flows and policy enforcement

    Note

    Deploy automated connectivity tests to validate pod-to-pod networking before diving deeper.

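A minimal command sketch for these checks (the pod name cilium-xxxxx is a placeholder, and the k8s-app=cilium label assumes Cilium; adjust for your CNI):

    # List CNI agent pods; watch for restarts or CrashLoopBackOff
    kubectl get pods -n kube-system -l k8s-app=cilium

    # Inspect events and logs for a failing agent pod
    kubectl describe pod cilium-xxxxx -n kube-system
    kubectl logs cilium-xxxxx -n kube-system

    # Cilium CLI health and end-to-end connectivity checks
    cilium status
    cilium connectivity test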


2. Troubleshooting Network Policies

Misconfigured, overly restrictive, or accidentally deleted NetworkPolicies can silently block traffic:

  1. Locate policies

    kubectl get networkpolicies --all-namespaces
    

    If no policies exist, skip to other troubleshooting areas.

  2. Review selectors and intent

    • Ensure podSelector and namespaceSelector match the intended workloads.
    • An overly broad selector may apply the policy to unintended pods; one that is too narrow may match nothing at all.
  3. Verify ingress/egress rules
    For the pods a policy selects, an empty rule list denies all traffic of that type. Confirm each rule explicitly allows the necessary ports and protocols (see the example policy after this list).

    Warning

    A NetworkPolicy that selects pods but defines no ingress or egress rules blocks all such traffic to those pods. Unless you intend a default-deny posture, define at least one rule.

    The image illustrates network policies in a Kubernetes environment, showing a pod's communication being blocked by network policies, with potential issues like misconfiguration, deployment errors, and accidental policy deletion.
    The image is a diagram titled "Network Policies" with three steps: "Review Policy Purpose," "Check Policy Selectors," and "Verify Policy Rules," accompanied by an icon of a magnifying glass over a gear.
    The image outlines steps for network policies, including reviewing policy purpose, checking policy selectors, and verifying policy rules, with a note on ensuring ingress and egress rules are defined.

  4. Test connectivity
    Launch pods in both allowed and denied namespaces and validate traffic flows:

    • ping <pod-IP>
    • nc -zv <pod-IP> <port>
    • curl http://<service>

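For reference, a sketch of a policy that admits only TCP 8080 into pods labeled app=web from a single namespace; the names (allow-web-ingress, demo, frontend) and the port are hypothetical placeholders:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-web-ingress          # hypothetical policy name
      namespace: demo                  # hypothetical namespace
    spec:
      podSelector:
        matchLabels:
          app: web                     # must match the target pods' labels
      policyTypes:
        - Ingress
      ingress:
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: frontend
          ports:
            - protocol: TCP
              port: 8080               # the port the application listens on

To exercise a policy like this, launch throwaway pods in an allowed and a denied namespace, for example kubectl run test --rm -it --image=nicolaka/netshoot -- bash (an image that ships ping, nc, dig, and curl), and run the connectivity commands above from inside each.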


3. Troubleshooting Service Discovery & DNS

CoreDNS manages internal name resolution. Follow these steps:

  1. Check CoreDNS pods

    kubectl get pods -n kube-system -l k8s-app=kube-dns
    

    Ensure the pods are Running, then check kubectl logs <coredns-pod> -n kube-system for errors.

  2. Inspect ConfigMap

    kubectl get configmap coredns -n kube-system -o yaml
    

    Look for syntax errors, accidental edits, or missing zones; a typical default Corefile is shown after this list for comparison.

  3. Validate pod DNS settings
    Inside a test pod, check that /etc/resolv.conf points at your cluster DNS service IP (the ClusterIP of the kube-dns Service, often 10.96.0.10 on kubeadm defaults).

  4. Test DNS resolution

    nslookup kubernetes.default
    dig @<coredns-ip> my-service.my-namespace.svc.cluster.local
    

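For comparison, a typical default Corefile (as deployed by kubeadm; other distributions vary) looks roughly like this:

    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        # Serve cluster.local and the reverse zones from the Kubernetes API
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        # Forward everything else to the node's upstream resolvers
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }

If the kubernetes block or the cluster.local zone is missing, in-cluster names will not resolve.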


4. Troubleshooting Services, Endpoints & Pods

Connectivity issues here often stem from selector or port mismatches:

  1. Check pod health

    • Pods should be Running without restarts.
    • Look for CrashLoopBackOff in kubectl get pods and for warning events in kubectl describe pod.
    • Review logs for errors or resource exhaustion.

  2. Validate services

    • Confirm service type suits your use case (ClusterIP, NodePort, LoadBalancer).
    • Check spec.selector labels match pod labels (a matching Service sketch follows this list).
    • Verify service ports map to container ports.
    • Ensure the application listens on the advertised port.

  3. Compare Services and Endpoints
    Each Service with a selector should have a corresponding Endpoints object (and, on current clusters, EndpointSlices):

    kubectl get endpoints <service-name>
    

    Verify the listed IPs match the target pods; an empty Endpoints object usually means the selector matches no Running pods.

  4. Port-forward as needed

    kubectl port-forward svc/<service> 8080:<port>
    

    This lets you reach the Service's backing pods directly, bypassing external load balancers and Ingress.
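
A sketch of a correctly wired Service, with hypothetical names (web, demo) and ports; the essentials are that spec.selector equals the backing pods' labels and targetPort equals the port the container actually listens on:

    apiVersion: v1
    kind: Service
    metadata:
      name: web              # hypothetical Service name
      namespace: demo        # hypothetical namespace
    spec:
      type: ClusterIP
      selector:
        app: web             # must equal the backing pods' label
      ports:
        - port: 80           # port clients use to reach the Service
          targetPort: 8080   # containerPort the application listens on

A quick way to cross-check selector, pods, and endpoints:

    kubectl get svc web -n demo -o jsonpath='{.spec.selector}'
    kubectl get pods -n demo -l app=web -o wide
    kubectl get endpoints web -n demo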


Next, apply these techniques on a live cluster to reinforce your troubleshooting skills.
