An Introduction to Azure ExpressRoute Architecture

This post will give you an overview of Azure ExpressRoute architecture. This is not a “how to” post; instead, the purpose of this post is to document the options for architecting connectivity with Microsoft Azure in one concise (as much as possible) document.

Introduction to ExpressRoute

Azure ExpressRoute is a form of private Layer-2 or Layer-3 network connectivity between a customer’s on-premises network(s) and a virtual network hosted in Microsoft Azure. ExpressRoute is one of two Azure-offered solutions for achieving a private network connection; the other is site-to-site VPN.

There are 2 vendor types that can connect you to Azure using ExpressRoute:

  • Exchange provider: Has an ExpressRoute circuit in their data centre. Either you run your “on-premises” in their data centre or you connect to their data centre.
  • Network service provider: You get a connection to an ISP and they relay you to a Microsoft edge data centre or POP.

The locations of ExpressRoute and Azure are often confused. A connection using ExpressRoute, at a very high level and from your perspective, has three pieces:

  • Circuit: A connection to a Microsoft edge data centre or POP. This can be one of many global locations that often have nothing to do with Azure regions; they are connected to the same Microsoft WAN as Azure (and Microsoft 365) and are a means to relay you to Azure (or Microsoft 365) using Azure ExpressRoute.
  • Connection: Connecting an Azure Virtual Network (ExpressRoute Gateway) in an Azure region to a circuit that terminates at the edge data centre or POP.
  • Peering: Configuring the routing across the circuit and connection.

For example, a customer in Eindhoven, Netherlands might have an ExpressRoute circuit that connects to “Amsterdam”; this POP or edge data centre is probably in Amsterdam, Netherlands, or the suburbs. The customer might use that circuit to connect to Azure West Europe, which is colloquially called “Amsterdam” but is actually in Middenmeer, approximately 60 km north of Amsterdam.

ExpressRoute Versus VPN

The choice between ExpressRoute and site-to-site VPN isn’t always as clear-cut as one might think: “big organisations go with ExpressRoute and small/mid go with VPN”. Very often, organisations are choosing to access Azure services over the Internet using HTTPS, with small amounts of legacy traffic traversing a private connection. In this case, VPN is perfect. But when you want an SLA or low latency, ExpressRoute is your choice.

| | Site-to-Site VPN | ExpressRoute |
|---|---|---|
| Microsoft SLA | Microsoft: Azure; Internet: No one | Microsoft: Azure; Service Provider: Circuit |
| Max bandwidth | Aggregate of 10 Gbps | 100 Gbps |
| Routing | BGP (even if you don’t use/enable it) | BGP |
| Latency | Internet | Low |
| Multi-Site | See SD-WAN (Azure Virtual WAN) | Global Reach; also see Azure Virtual WAN |
| Connections | Azure Virtual Networks | Azure Virtual Networks; Other Azure Services; Microsoft 365; Dynamics 365; Other clouds, depending on service provider |
| Payment | Outbound data transfer and your regular Internet connection | Payment to service provider for the circuit; payment to Microsoft for either a metered (outbound data + circuit) or unlimited data (circuit) plan |

Terminology

  • Customer premises equipment (CPE) or Customer edge routers (CEs): Ideally two edge devices, connected in a highly available way to 2 lines that link your network(s) to the service provider.
  • Provider edge routers (PEs), CE facing: Routers or switches operated by the service provider that the customer is connected to.
  • Provider edge routers (PEs), MSEE facing: Routers or switches operated by the service provider that connect to Microsoft’s MSEEs.
  • Microsoft Enterprise Edge (MSEE) routers: Routers in the Microsoft POP or edge data centre that the service provider has connected to.

The MSEE:

  • Is what your ExpressRoute virtual network gateway connects to.
  • Propagates BGP routes to your virtual network.
  • Can connect two virtual networks together (with BGP propagation) if they both connect to the same circuit (MSEE).
  • Can relay you to other Azure services or other Microsoft cloud services.

It is very strongly recommended that the customer deploys two highly available pieces of hardware for the CEs. The ExpressRoute virtual network gateway is also HA, but if the Azure region supports it, spread the two nodes across different availability zones for a higher level of availability.

FYI, these POPs or edge data centres also host other Azure edge services.

Peering

Quite often, the primary use case for Azure ExpressRoute is to connect to Azure virtual networks, and resources connected to those virtual networks such as:

  • Virtual machines
  • VNet integrated SKUs such as App Service Environment, API Management, and SQL Managed Instance
  • Platform services supporting Private Endpoint

That connectivity is provided by Azure Private Peering. However, you can also connect to other Microsoft services, such as Microsoft 365 and Dynamics 365, using Microsoft Peering.

To use Microsoft Peering you will need to configure NAT to convert connections from private IP addresses to public IP addresses before they enter the Microsoft network.
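As a rough illustration of the NAT requirement, the sketch below flags which source addresses would need translating before they could cross Microsoft Peering. The function name `needs_nat` is hypothetical, and treating "private" as the three RFC 1918 IPv4 ranges is my assumption; it is not an Azure API.

```python
import ipaddress

# RFC 1918 private IPv4 ranges (assumption: these are the addresses
# that must be translated to public IPs before entering Microsoft Peering).
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def needs_nat(source_ip: str) -> bool:
    """Return True if the source address falls inside an RFC 1918 range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in RFC1918_NETWORKS)


if __name__ == "__main__":
    for ip in ("10.1.2.3", "192.168.10.20", "8.8.8.8"):
        print(ip, "needs NAT" if needs_nat(ip) else "already public")
```

The NAT itself is done on your edge devices or by the service provider; this only shows the classification step.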

ExpressRoute And VPN

There are two scenarios where ExpressRoute and site-to-site VPN can coexist to connect the same on-premises network and virtual network.

The first is for failover. If you deploy a /27 or larger GatewaySubnet then that subnet can contain an ExpressRoute Virtual Network Gateway and a VPN Virtual Network Gateway. You can then configure ExpressRoute and VPN to connect the same on-premises and Azure networks. The scenario here is that the VPN tunnel will be an automated failover connection for the ExpressRoute circuit – failover will happen automatically with fewer than 10 packets being lost. Two things immediately come to mind:

  • Use a different ISP for Internet/VPN connection than used for ExpressRoute
  • Both connections must propagate the same on-premises networks.
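The "/27 or larger" rule is easy to get backwards, because a larger subnet has a smaller prefix length. Here is a minimal sketch of the check using Python's standard `ipaddress` module; the function name is hypothetical and the example address ranges are made up.

```python
import ipaddress


def gateway_subnet_supports_coexistence(cidr: str) -> bool:
    """True if the IPv4 subnet is a /27 or larger (prefix length 27 or less)."""
    subnet = ipaddress.IPv4Network(cidr)
    return subnet.prefixlen <= 27


if __name__ == "__main__":
    for cidr in ("10.0.255.0/26", "10.0.255.0/27", "10.0.255.0/28"):
        verdict = "OK" if gateway_subnet_supports_coexistence(cidr) else "too small"
        print(cidr, verdict)
```

A /28 fails the check, so a GatewaySubnet of that size could not host both gateways.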

An interesting new twist was announced recently for Virtual Network Gateway and Azure Virtual WAN. By default, there is no encryption on your ExpressRoute circuit (more on this later). You will be able to initiate a site-to-site VPN connection across the ExpressRoute circuit to a VPN Virtual Network Gateway that is in the same GatewaySubnet as the ExpressRoute Virtual Network Gateway, encrypting your traffic.

ExpressRoute Tiers

There are two tiers of ExpressRoute circuit that you can deploy in Microsoft Azure. I have not found a good comparison table, so the below will not be complete:

| | Standard | Premium |
|---|---|---|
| Price | Normal | More expensive |
| Azure Virtual WAN support | Announced, not GA | GA |
| Azure Global Reach | Limited to same geo-zone | All regions |
| Max connections per circuit | 10 | Up to 100, depending on the circuit size (20 for 50 Mbps, 100 for 10 Gbps+) |
| Connections from different subscriptions | No | Yes |
| Max routes advertised | Private peering: 4,000; Microsoft peering: 200 | Private peering: up to 10,000; Microsoft peering: 200 |

I said “two tiers”, right? But there is also a third tier called Local, which is very lightly documented. ExpressRoute Local is a subset of ExpressRoute Standard where:

  • The circuit can only connect to 1 or 2 Azure regions in the same metro as the POP or edge data centre. Therefore it is available in fewer locations than ExpressRoute Standard.
  • ExpressRoute Global Reach is not available.
  • It requires an unlimited data plan with at least 1 Gbps, coming in at ~25% of the price of a 1 Gbps Standard tier unlimited data plan.

Service Provider Types

There are three ways that a service provider can connect you to Azure using ExpressRoute, with two of them being:

  • Layer-2: A VLAN is stretched from your on-premises network to Azure
  • Layer-3: You connect to Azure over IP VPN or MPLS VPN. Your on-premises network connects either by BGP or a static default route.

There is a third option, called ExpressRoute Direct.

ExpressRoute Direct

A subset of the Microsoft POPs or edge data centres offer a third kind of connection for Azure ExpressRoute called ExpressRoute Direct. The features of this include:

  • Larger sizes: You can have sizes from 1 Gbps to 100 Gbps for massive data ingestion, for things like Cosmos DB or storage (HPC).
  • Physical Isolation: Some organisations will have a compliance reason to avoid connections to shared network equipment (the PEs and MSEE).
  • Granular control of circuit distribution: Based on business unit

This is a very specialised SKU that you must apply to use.

ExpressRoute FastPath

The normal flow of packets routing into Azure over ExpressRoute is:

  1. Enter Microsoft at the MSEE
  2. Travel via the ExpressRoute Virtual Network Gateway.
  3. If a route table exists, follow that route, for example, to a hub-based firewall.
  4. Route to the NIC of the virtual machine

There is a tiny latency penalty by routing through the Virtual Network Gateway. For a tiny percentage of customers, this latency may cause issues.

The concept of ExpressRoute FastPath is that you can skip the hop of the virtual network gateway and route directly to the NICs of the virtual machines (in the same virtual network as the gateway).

To use this feature you must be using one of these gateway sizes:

  • Ultra Performance
  • ErGw3AZ

The following are not supported and will force traffic to route via the ExpressRoute Virtual Network Gateway:

  • There is a UDR on the GatewaySubnet
  • Virtual Network Peering is used. An alternative is to connect the otherwise-peered VNets directly to the circuit with their own VNet Gateway.
  • You use a Basic Load Balancer in front of the VMs; use a Standard tier Load Balancer.
  • You are attempting to connect to Private Endpoint.
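The FastPath conditions above can be summarised as a simple checklist. The sketch below is only a model of the rules as listed in this post: the `TrafficPath` class, its field names, and the helper functions are hypothetical and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import List

# Gateway sizes listed in this post as supporting FastPath.
FASTPATH_GATEWAY_SKUS = {"Ultra Performance", "ErGw3AZ"}


@dataclass
class TrafficPath:
    """Hypothetical description of one on-premises-to-VM traffic path."""
    gateway_sku: str
    udr_on_gateway_subnet: bool = False
    crosses_vnet_peering: bool = False
    behind_basic_load_balancer: bool = False
    targets_private_endpoint: bool = False


def fastpath_blockers(path: TrafficPath) -> List[str]:
    """Return the reasons this path would still route via the gateway."""
    blockers = []
    if path.gateway_sku not in FASTPATH_GATEWAY_SKUS:
        blockers.append("gateway size does not support FastPath")
    if path.udr_on_gateway_subnet:
        blockers.append("UDR on the GatewaySubnet")
    if path.crosses_vnet_peering:
        blockers.append("virtual network peering is used")
    if path.behind_basic_load_balancer:
        blockers.append("Basic Load Balancer in front of the VMs")
    if path.targets_private_endpoint:
        blockers.append("destination is a Private Endpoint")
    return blockers


def bypasses_gateway(path: TrafficPath) -> bool:
    """True only when no listed condition forces traffic via the gateway."""
    return not fastpath_blockers(path)


if __name__ == "__main__":
    spoke = TrafficPath(gateway_sku="ErGw3AZ", crosses_vnet_peering=True)
    print(bypasses_gateway(spoke), fastpath_blockers(spoke))
```

Returning the list of blockers, rather than a bare yes/no, makes it easier to see which design decision is costing you the FastPath benefit.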

ExpressRoute Global Reach

I think that ExpressRoute Global Reach is one of the more interesting features in ExpressRoute. You can have two or more offices, each with their own ExpressRoute (not Local tier) circuit to a local POP/edge data centre, and enable Global Reach to allow the offices to:

  • Connect to Azure/Microsoft cloud resources
  • Connect to each other over the Microsoft WAN instead of deploying a WAN

Note that ExpressRoute Standard will support connecting locations in the same geo-zone, and ExpressRoute Premium will support all geo-zones. Supported POPs are limited to a small subset of locations.
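To make the tier rules concrete, here is a small sketch of a Global Reach eligibility check between two circuits. The function name and the geo-zone labels are hypothetical. I am also assuming that linking circuits in different geo-zones needs both circuits to be Premium; the post only states the per-tier rule, so verify that against the current documentation. The check ignores the separate limit on which POPs support Global Reach.

```python
VALID_TIERS = {"Local", "Standard", "Premium"}


def global_reach_allowed(tier_a: str, zone_a: str, tier_b: str, zone_b: str) -> bool:
    """Model of the Global Reach tier rules described in this post.

    Assumption: cross-geo-zone links require BOTH circuits to be Premium.
    """
    for tier in (tier_a, tier_b):
        if tier not in VALID_TIERS:
            raise ValueError(f"unknown ExpressRoute tier: {tier}")
    tiers = {tier_a, tier_b}
    if "Local" in tiers:
        return False  # Global Reach is not available on the Local tier
    if zone_a == zone_b:
        return True  # Standard or Premium both work within one geo-zone
    return tiers == {"Premium"}  # different geo-zones: Premium only


if __name__ == "__main__":
    # Geo-zone labels below are illustrative placeholders.
    print(global_reach_allowed("Standard", "zone-1", "Standard", "zone-1"))
    print(global_reach_allowed("Standard", "zone-1", "Premium", "zone-2"))
    print(global_reach_allowed("Premium", "zone-1", "Premium", "zone-2"))
```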

Encryption

Traffic over ExpressRoute is not encrypted and, as Edward Snowden informed us, various countries are doing things to sniff traffic. If you wish to protect your traffic you will have to “bring your own key”. We have a few options:

  • The aforementioned VPN over ExpressRoute, which is available now for Virtual Network Gateway and Azure Virtual WAN.
  • Implement a site-to-site VPN across ExpressRoute using a third-party virtual appliance hosted in the Azure VNet.
  • IPsec configured on each guest OS, which limits encryption to traffic to/from virtual machines.
  • MACsec, a Layer-2 feature where you can implement your own encryption from your CE to the MSEE, encrypting all traffic, not just to/from VMs.

The MACsec key is stored securely in Azure Key Vault. From what I can see, MACsec is only available on ExpressRoute Direct. Microsoft claims that it does not cause a performance issue on their routers, but they do warn you to check your CE vendor guidance.

Multi-Cloud

Now you’ll see why I talked about Layer-2 and Layer-3. Depending on your service provider type and their connectivity to non-Microsoft clouds, if you have a circuit with the service provider (from your CEs to their CE-facing PEs), that same circuit can be used to connect to Azure over ExpressRoute and to other clouds such as AWS. With BGP propagation, you could route from on-premises to/from either cloud, and your deployments in those clouds could route to each other.

Bidirectional Forwarding Detection (BFD)

The circuit is deployed as two connections, ideally connected to 2 CEs in your edge network. Failover is automated, but some will want failover to be as quick as possible. You can reduce the BGP keepalive and hold-time but this will be processor intensive on the network equipment.

A feature called BFD can detect link failure in under a second with low overhead. BFD is enabled on “newly” created ExpressRoute private peering interfaces on the MSEEs – you can reset the peering if required. If you want this feature then you need to enable it on your CEs – the service provider must also enable it on their PEs.
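The difference between the two detection methods is simple arithmetic. BGP declares a neighbour down once the hold-time expires without a keepalive, while BFD declares the link down after a set number of consecutive missed packets. The timer values in this sketch (a 180-second hold-time, and a 300 ms BFD interval with a multiplier of 3) are illustrative assumptions, not figures from this post; check what your CEs and service provider are configured with.

```python
def bgp_detection_seconds(hold_time_s: float) -> float:
    """Worst-case failure detection with BGP timers alone: the hold-time."""
    return hold_time_s


def bfd_detection_seconds(interval_ms: int, multiplier: int) -> float:
    """BFD detection time: the packet interval times the detect multiplier."""
    return interval_ms * multiplier / 1000


if __name__ == "__main__":
    # Illustrative timer values only (assumptions, see lead-in).
    print("BGP only:", bgp_detection_seconds(180), "seconds")
    print("With BFD:", bfd_detection_seconds(300, 3), "seconds")
```

With those assumed values, BFD notices the failure in under a second without needing aggressive BGP timers, which is the trade-off described above.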

Monitoring

Azure Monitor provides a bunch of metrics for ExpressRoute that you can visualise or create alerts on.

Azure’s Connection Monitor is the Microsoft-offered solution for monitoring an ExpressRoute connection. The idea is that a Log Analytics agent (Windows or Linux) is deployed onto one or more always-on on-premises machines. A test is configured to run across the circuit measuring availability and performance.

 

12 thoughts on “An Introduction to Azure ExpressRoute Architecture”

    1. That’s a good question and link. I was led to believe that this “we’re announcing at Ignite” feature was still in private preview. But that article (in the vWAN docs) makes me wonder if it is publicly available. I dislike “we’re announcing at Ignite” verbal statements because they don’t tell us the release status or ETA. I would have expected that doc to be in the ExpressRoute section, not in vWAN.

  1. Great Article.
    One small comment, while true that VPN can’t connect directly to Azure services (outside of a Vnet) you can now implement private endpoints as a way of accessing Azure services via the Vnet, hence allowing you to also access them via VPN.

    1. Correct, but there are differences. Private Link does provide private connections to resources, but with quite a bit of work per resource. Microsoft Peering just works across all instances, albeit without some of the micro-segmented firewalls you can deploy with Private Endpoint. I would prefer Private Endpoint from a security perspective.

    1. No, but I haven’t looked/asked. I suspect that would come in the form of private endpoint, which every service must eventually have.

  2. I have set up a route-based S2S VPN as a failover path for my existing ExpressRoute. However, once everything was configured for the S2S VPN, ICMP packets from on-prem to the Azure subnet do not come back to on-prem.

    When On-prem PC sends icmp packet to Azure VM, packets are forwarded to the ISP MPLS network.
    When Azure VM sends icmp packet to On-prem PC, packets are forwarded over S2SVPN.
    When I remove the on-prem subnet from the local network gateway, everything works over MPLS.

    I suspect either on-prem originated packets are dropped somewhere in the ISP MPLS network or asymmetric routing messes up communication between on-prem and Azure subnet…

    I really do not see any documentation that explains how Azure prefers ExpressRoute over S2S VPN 🙁

    1. Azure just prefers ExpressRoute. It is documented – I just don’t have the page handy. You must be sure that route propagation is identical on both lines if you want up/down to be exclusively on ExpressRoute. If you need some on-prem networks not to use ExpressRoute then you need to do static on-prem routing to Azure for the VPN and not propagate those networks to Azure via ExpressRoute BGP.
