Design Considerations for ExpressRoute Deployments
Guidance for designing resilient, high-performance Azure ExpressRoute deployments covering gateway SKU, circuit sizing and billing, BGP peering, redundancy, connection models, BFD and operational readiness
When planning an Azure ExpressRoute deployment, evaluate the key design factors that affect availability, performance, and cost. This guide walks through the considerations in the same logical sequence used in operational design reviews so you can make consistent, production-ready choices.
Choose an ExpressRoute gateway SKU that matches your required throughput, resiliency, and feature set.
Considerations:
Throughput: select a SKU capable of your sustained and peak bandwidth needs (include headroom for growth).
Resiliency: check whether the SKU supports availability zones, active/active configurations, or other HA capabilities.
Features and connections: ensure the SKU supports the number and type of BGP sessions, VPN fallback, and any cross-region connectivity required.
Tip: For business-critical applications, choose higher-capacity SKUs to avoid mid-life migrations. Validate the gateway SKU’s documented connection limits, throughput, and HA behavior in the official Azure docs.
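The throughput consideration above can be sketched as a small sizing check. This is only an illustration: the SKU names and capacities in `catalog` are hypothetical placeholders, not Azure figures, and the 30% default headroom is an assumed planning margin. Take real limits from the official Azure documentation.

```python
from typing import Mapping, Optional


def required_gbps(peak_gbps: float, growth_headroom: float = 0.3) -> float:
    """Peak bandwidth plus a fractional growth margin (0.3 means 30%)."""
    return peak_gbps * (1.0 + growth_headroom)


def smallest_sufficient_sku(
    catalog: Mapping[str, float],
    peak_gbps: float,
    growth_headroom: float = 0.3,
) -> Optional[str]:
    """Return the lowest-capacity SKU that covers peak plus headroom, or None."""
    target = required_gbps(peak_gbps, growth_headroom)
    # Keep only SKUs that meet the target, then pick the smallest capacity.
    candidates = [(capacity, name) for name, capacity in catalog.items() if capacity >= target]
    return min(candidates)[1] if candidates else None


# Hypothetical catalog: placeholder names and capacities in Gbps.
catalog = {"sku-small": 1.0, "sku-medium": 2.0, "sku-large": 10.0}
print(smallest_sufficient_sku(catalog, peak_gbps=1.0))   # sku-medium
print(smallest_sufficient_sku(catalog, peak_gbps=20.0))  # None
```

A result of `None` signals that no single SKU in the catalog covers the requirement, which is worth surfacing early in a design review.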
BFD provides fast detection of path failure and complements BGP for rapid failover on ExpressRoute.
What BFD provides:
Sub-second detection of path failures (the exact detection time depends on the negotiated interval and detection multiplier).
Quicker switchovers to backup paths compared to relying on BGP timer convergence alone.
Reduced application disruption by accelerating failover.
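To make the timer dependence concrete, the sketch below computes the detection time for BFD asynchronous mode as described in RFC 5880: the remote peer's multiplier times the larger of the local receive interval and the remote transmit interval. The `BfdTimers` type and the 300 ms / multiplier 3 values are illustrative assumptions; confirm the timers Microsoft's edge actually uses in the Azure documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdTimers:
    tx_ms: int       # desired minimum transmit interval, in milliseconds
    rx_ms: int       # required minimum receive interval, in milliseconds
    multiplier: int  # detection multiplier


def detection_time_ms(local: BfdTimers, remote: BfdTimers) -> int:
    """Time after which `local` declares the session down (asynchronous mode)."""
    # The remote peer sends no faster than the local side is willing to receive,
    # so the effective interval is the slower of the two values.
    negotiated_interval = max(local.rx_ms, remote.tx_ms)
    return remote.multiplier * negotiated_interval


# Illustrative values only.
on_prem = BfdTimers(tx_ms=300, rx_ms=300, multiplier=3)
edge = BfdTimers(tx_ms=300, rx_ms=300, multiplier=3)
print(detection_time_ms(on_prem, edge))  # 900
```

Because the slower side sets the pace, configuring a shorter interval on only one router does not shorten detection below what the other side allows.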
How BFD is used with ExpressRoute:
Microsoft’s edge routers have BFD enabled for supported peering types; you must enable and configure BFD on your on-premises routers and bind it to the specific BGP sessions.
Apply consistent BFD timers on both primary and secondary routers and validate behavior under failure conditions.
BFD is enabled by default on Microsoft’s edge routers for supported peering types. You must explicitly configure BFD on your on-premises devices and bind it to the corresponding BGP sessions to obtain the rapid failover benefits.
Implementation notes and examples:
Align BFD interval and detection multiplier to balance rapid detection and false positives.
Test different failure scenarios (link down, interface flap, route withdrawal) to confirm expected behavior.
Monitor both BFD and BGP session states and alert on state changes.
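The monitoring note can be sketched as a comparison of two polling snapshots. How the states are collected (SNMP, streaming telemetry, CLI scraping) is left out; the snapshot layout, the `state_changes` helper, and the alert wording are all hypothetical. The state names follow the BFD and BGP specifications ("Up", "Established").

```python
from typing import Dict, List, Tuple

# (protocol, peer address), for example ("bfd", "192.0.2.1")
SessionKey = Tuple[str, str]

# State that counts as healthy for each protocol.
HEALTHY = {"bfd": "Up", "bgp": "Established"}


def state_changes(
    previous: Dict[SessionKey, str],
    current: Dict[SessionKey, str],
) -> List[str]:
    """Return one message per session whose state differs between snapshots."""
    alerts = []
    for key in sorted(current):
        old, new = previous.get(key), current[key]
        if old is not None and old != new:
            protocol, peer = key
            # Returning to the healthy state is a recovery; anything else is an alert.
            severity = "RECOVERED" if new == HEALTHY.get(protocol) else "ALERT"
            alerts.append(f"{severity} {protocol} {peer}: {old} -> {new}")
    return alerts


before = {("bfd", "192.0.2.1"): "Up", ("bgp", "192.0.2.1"): "Established"}
after = {("bfd", "192.0.2.1"): "Down", ("bgp", "192.0.2.1"): "Established"}
print(state_changes(before, after))  # ['ALERT bfd 192.0.2.1: Up -> Down']
```

Tracking BFD and BGP separately helps distinguish a path failure that BFD caught from a BGP session reset with a different cause.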
Example BFD configuration templates (replace placeholders):
Cisco IOS example:
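A minimal sketch of such a template, assuming a dot1Q subinterface toward the provider and a single private-peering BGP session. Every `<PLACEHOLDER>` is hypothetical, the 300 ms / multiplier 3 timers are illustrative values, and exact syntax varies by IOS release and platform. AS 12076 is the ASN Microsoft uses for ExpressRoute peering.

```
! Sketch only: replace every <PLACEHOLDER> with values from your environment.
interface <INTERFACE>.<SUBINTERFACE_ID>
 description ExpressRoute private peering (primary link)
 encapsulation dot1Q <VLAN_ID>
 ip address <ON_PREM_PEER_IP> <SUBNET_MASK>
 ! BFD timers in milliseconds; 300/300/3 are illustrative values
 bfd interval 300 min_rx 300 multiplier 3
!
router bgp <ON_PREM_ASN>
 neighbor <MICROSOFT_PEER_IP> remote-as 12076
 ! Bind BFD to this BGP session so a BFD failure brings the session down
 neighbor <MICROSOFT_PEER_IP> fall-over bfd
```

Repeat the same pattern on the secondary link with its own peer addresses so both paths fail over with the same timing.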
A resilient ExpressRoute deployment balances the right gateway SKU, circuit bandwidth and billing model, peering choices, diverse connectivity, and rapid failover via BFD. Plan BGP and route policies carefully, validate redundancy end-to-end, and document operational procedures. These steps help deliver a highly available, high-performance private connection between your on-premises infrastructure and Azure.