Overview of Cilium Cluster Mesh features, setup steps, prerequisites, and KVStoreMesh scalability
In this lesson we cover Cluster Mesh fundamentals: what Cluster Mesh enables, the required cluster prerequisites, how to configure and enable it in Cilium, how to connect clusters into a full mesh, and why KVStoreMesh improves Cluster Mesh scalability.

Cluster Mesh lets multiple Kubernetes clusters behave as a single multi-cluster network fabric by providing:
Cross-cluster network connectivity (pods can talk across clusters).
Cross-cluster load balancing (services can balance across backends in other clusters).
Shared security controls (apply Kubernetes NetworkPolicies across clusters).
Example: with cluster1, cluster2, and cluster3 joined into a Cluster Mesh, pods in different clusters can communicate directly, subject to mesh-wide connectivity and network policies.
Cross-cluster load balancing lets a frontend pod in one cluster send requests that are distributed among backends across multiple clusters — useful for global-scale services and failover.
You can also enforce fine-grained cross-cluster access using Kubernetes NetworkPolicies. For example, allow frontend pods from cluster1 and cluster2 to reach a backend in cluster2 while blocking requests originating from cluster3.
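As a sketch of what such a policy can look like, the CiliumNetworkPolicy below allows frontend pods from cluster1 and cluster2 to reach backend pods while leaving cluster3 unmatched (and therefore denied once the policy selects the backend). The `io.cilium.k8s.policy.cluster` label is how Cilium exposes the source cluster in policy; the `app` labels and policy name are illustrative assumptions:

```yaml
# Illustrative policy, applied in the cluster hosting the backend (cluster2).
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-cross-cluster-frontends   # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: backend                      # assumed workload label
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
            io.cilium.k8s.policy.cluster: cluster1
        - matchLabels:
            app: frontend
            io.cilium.k8s.policy.cluster: cluster2
```

Because no rule matches `io.cilium.k8s.policy.cluster: cluster3`, requests originating from cluster3 are dropped.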
The remainder of this lesson covers: prerequisites, per-cluster configuration, enablement, cluster connections, and the KVStoreMesh design.
Before joining clusters into a Cluster Mesh, verify the following requirements across all clusters:
Matching datapath mode: All clusters should use the same datapath mode (e.g., encapsulation/tunnel or native routing) to avoid connectivity and routing mismatches.

Non-overlapping Pod CIDRs: Pods in different clusters must use unique IP ranges to prevent address conflicts.

Full node-to-node IP connectivity: Nodes across clusters must be able to reach each other, directly or through a suitable networking fabric, for cross-cluster traffic and service access.

Unique cluster identifiers: Each cluster needs a unique cluster name and integer cluster ID in its Cilium configuration to avoid collisions.
When configuring Cilium per-cluster, ensure the Cilium config includes unique Pod CIDR pool entries and a unique cluster name and ID. A representative YAML fragment for two clusters:
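For illustration, assuming Cilium is installed via Helm with cluster-pool IPAM, values fragments along these lines give each cluster a unique name, ID, and Pod CIDR pool (the CIDR ranges are placeholders):

```yaml
# cluster1 Helm values (illustrative)
cluster:
  name: cluster1
  id: 1
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - 10.1.0.0/16
---
# cluster2 Helm values (illustrative) — note the distinct name, ID, and CIDR
cluster:
  name: cluster2
  id: 2
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - 10.2.0.0/16
```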
After deploying Cilium on each cluster, enable Cluster Mesh with the cilium clustermesh enable command. If your environment cannot automatically provision an appropriate LoadBalancer service, you can supply the service type explicitly.

Example: enable Cluster Mesh on two clusters using the LoadBalancer service type:
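A sketch of the enable commands, assuming kubeconfig contexts named cluster1 and cluster2:

```shell
# Enable Cluster Mesh on each cluster, exposing the clustermesh-apiserver
# via a LoadBalancer service.
cilium clustermesh enable --context cluster1 --service-type LoadBalancer
cilium clustermesh enable --context cluster2 --service-type LoadBalancer
```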
After enabling Cluster Mesh on each cluster, establish mesh connections using cilium clustermesh connect. Provide the source and destination kubeconfig contexts:
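For example, assuming the same context names as above:

```shell
# A single invocation connects the two clusters in both directions.
cilium clustermesh connect --context cluster1 --destination-context cluster2
```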
You only need to run the connect command once per cluster pair; the connection is established in both directions. For a mesh of three clusters, ensure all pairwise connections exist (for a full mesh topology: cluster1→cluster2, cluster1→cluster3, cluster2→cluster3), or use scripts/automation to configure full-mesh connectivity.

Check the Cluster Mesh status with:
cilium clustermesh status
Example status output:
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
✅ Cluster access information is available:
  - 172.19.255.46:2379
✅ Deployment clustermesh-apiserver is ready
ℹ️ KVStoreMesh is enabled
✅ All 3 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
✅ All 1 KVStoreMesh replicas are connected to all clusters [min:1 / avg:1.0 / max:1]
🪄 Cluster Connections:
  - cluster1: 3 configured, 3/3 connected
    KVStoreMesh: 1/1 configured, 1/1 connected
🔁 Global services: [ min:1 / avg:1.0 / max:1 ]
Once connections are established, pods across clusters can communicate according to configured services and network policies.
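For example, cross-cluster load balancing is opted into per service with the `service.cilium.io/global` annotation: a Service with the same name and namespace in each cluster is then treated as one global service, and traffic is balanced across backends in all connected clusters. The service and label names below are illustrative:

```yaml
# Deploy an identically named Service in each cluster (illustrative).
apiVersion: v1
kind: Service
metadata:
  name: backend            # assumed name; must match across clusters
  annotations:
    service.cilium.io/global: "true"
spec:
  selector:
    app: backend
  ports:
    - port: 80
```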
KVStoreMesh is a design improvement introduced to scale Cluster Mesh. Understanding its role helps when planning and troubleshooting multi-cluster environments.

Original design challenges:
Each Cilium agent/operator wrote resources (Services, CiliumNodes, identities, endpoints) into its cluster's Kubernetes API.
The Cluster Mesh API server would watch the Kubernetes API and sync that data into a central etcd.
Remote agents had to watch many remote etcd instances; at scale this caused high synchronization load, increased latency, and heavy etcd pressure.
KVStoreMesh design:
Each cluster runs a local KVStoreMesh binary and a local etcd instance containing mesh-wide state.
Cilium agents sync only with the local KV store.
KVStoreMesh instances synchronize state between clusters, reducing the number of remote endpoints that each agent must watch.
Benefits:
Reduced overall etcd usage and pressure.
More balanced load across cluster KV stores.
Lower impact from agent restarts, workload churn, or new clusters joining the mesh.
KVStoreMesh is enabled by default in modern Cilium Cluster Mesh deployments. If you must disable it for compatibility reasons, pass --enable-kvstoremesh=false when enabling Cluster Mesh.