In this post, I will explain how you can connect multiple Azure hub-and-spoke (virtual data centre) deployments together using Azure networking, even across different Azure regions.
There is a lot to know here so here is some recommended reading that I previously published:
If you are using Azure Virtual WAN Hub then some things will be different, and that scenario is not fully covered here – Azure Virtual WAN Hub has a feature for Any-to-Any routing that is in preview at the time of writing.
The Scenario
In this case, there are two hub-and-spoke deployments:
- Blue: Multiple virtual networks covered by the CIDR of 10.1.0.0/16
- Green: Another set of multiple virtual networks covered by the CIDR of 10.2.0.0/16
I’m being strategic with the addressing of each hub-and-spoke deployment, ensuring that a single CIDR will include the hub and all spokes of a single deployment – this will come in handy when we look at User-Defined Routes.
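As a rough illustration of that addressing plan, the sketch below uses Python's standard ipaddress module to check which deployment-wide CIDR a virtual network falls under. The two /16 ranges come from the scenario above; the individual VNet names and most of the prefixes are assumptions made up for the example (only 10.2.2.0/24 for Spoke 4 and the hub /24 ranges appear later in this post).

```python
import ipaddress

# Deployment-wide CIDRs from the scenario.
BLUE = ipaddress.ip_network("10.1.0.0/16")
GREEN = ipaddress.ip_network("10.2.0.0/16")

# Hypothetical VNet names and prefixes, used only for illustration.
vnets = {
    "blue-hub": "10.1.0.0/24",
    "blue-spoke1": "10.1.1.0/24",
    "green-hub": "10.2.0.0/24",
    "green-spoke4": "10.2.2.0/24",
}


def deployment_of(prefix: str) -> str:
    """Return which deployment-wide CIDR contains the given prefix."""
    net = ipaddress.ip_network(prefix)
    if net.subnet_of(BLUE):
        return "blue"
    if net.subnet_of(GREEN):
        return "green"
    return "unknown"


placement = {name: deployment_of(prefix) for name, prefix in vnets.items()}
print(placement)
```

Because every hub and spoke in a deployment sits inside one /16, a single route per remote deployment is enough to describe all of it.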
These two hub-and-spoke deployments could be in the same Azure region or in different Azure regions. The requirements are:
- Any spoke wishes to talk to another spoke it will route through the local firewall in the local hub.
- All traffic coming into a spoke from an outside source, such as the other hub-and-spoke, must route through the local firewall in the local hub.
That would mean that Spoke 1 must route through Hub 1 and then Hub 2 to talk to Spoke 4. The firewall can be a third-party appliance or the Azure Firewall.
Core Routing
Each subnet in each spoke needs a route to the outside world (0.0.0.0/0) via the local firewall. For example:
- The Blue firewall backend/private IP address is 10.1.0.132
- A Route Table for each subnet is created in the Blue deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.1.0.132
- The Green firewall backend/private IP address is 10.2.0.132
- A Route Table for each subnet is created in the Green deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.2.0.132
Note: Some network-connected PaaS services, e.g. API Management or SQL Managed Instance, require additional routes to the “control plane” that will bypass the local firewall.
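To make the core routing concrete, here is a minimal sketch that builds the default route each spoke subnet needs as a plain Python dictionary. The firewall IP addresses are the ones listed above; the dictionary keys and route names are hypothetical and are not meant to match any Azure SDK or template schema, although VirtualAppliance is the next hop type that Azure uses for this kind of route.

```python
import ipaddress

# Firewall backend/private IP addresses from the post.
FIREWALLS = {"blue": "10.1.0.132", "green": "10.2.0.132"}


def spoke_default_route(deployment: str) -> dict:
    """Describe the 0.0.0.0/0 route that a spoke subnet in a deployment needs."""
    next_hop_ip = FIREWALLS[deployment]
    ipaddress.ip_address(next_hop_ip)  # raises ValueError if the address is malformed
    return {
        "name": f"{deployment}-default-via-firewall",  # hypothetical route name
        "address_prefix": "0.0.0.0/0",
        "next_hop_type": "VirtualAppliance",
        "next_hop_ip": next_hop_ip,
    }


blue_route = spoke_default_route("blue")
green_route = spoke_default_route("green")
print(blue_route)
print(green_route)
```

The extra control plane routes mentioned in the note would be additional entries alongside this one and are not modelled here.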
Site-to-Site VPN
In this scenario, the organisation is connecting on-premises networks to one or more of the hub-and-spoke deployments with a site-to-site VPN connection. That connection goes to the Blue hub and to the Green hub.
To connect Blue and Green you will need to configure VNet Peering, which can work inside a region or across regions (using Microsoft’s low latency WAN, the second-largest private WAN on the planet). Each end of peering needs the following settings (the names of the settings change so I’m not checking their exact naming):
- Enabled: Yes
- Allow Transit: Yes
- Use Remote Gateway: No
- Allow Gateway Sharing: No
Let’s go back and do some routing theory!
That peering connection will add a hidden Default (“system”) route to each subnet in each hub:
- Blue hub subnets: A route to 10.2.0.0/24
- Green hub subnets: A route to 10.1.0.0/24
Now imagine you are a packet in Spoke 1 trying to get to Spoke 4. You’re sent to the firewall in Blue Hub 1. The firewall lets the traffic out (if a rule allows it) and now the packet sits in the egress/frontend/firewall subnet and is trying to find a route to 10.2.2.0/24. The peering-created Default route covers 10.2.0.0/24 but not the subnet for Spoke 4. So that means the default route to 0.0.0.0/0 (Internet) will be used and the packet is lost.
To fix this you will need to add a Route Table to the egress/frontend/firewall subnet in each hub:
- Blue firewall subnet Route Table: 10.2.0.0/16 via virtual appliance 10.2.0.132
- Green firewall subnet Route Table: 10.1.0.0/16 via virtual appliance 10.1.0.132
Thanks to my clever addressing of each hub-and-spoke, a single route will cover all packets leaving Blue and trying to get to any spoke in Green and vice-versa.
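The lost-packet problem and its fix come down to longest-prefix matching, which can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it only picks the most specific matching prefix and ignores how Azure chooses between route sources when prefixes are identical. The destination address 10.2.2.4 is a hypothetical host in Spoke 4, and the route list is a cut-down view of the Blue hub's firewall subnet.

```python
import ipaddress


def next_hop(routes, destination):
    """Return the next hop of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The route with the longest prefix length wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Simplified routes seen by the Blue hub firewall subnet after peering.
system_routes = [
    ("10.1.0.0/24", "local VNet"),
    ("10.2.0.0/24", "VNet peering"),
    ("0.0.0.0/0", "Internet"),
]

spoke4_host = "10.2.2.4"  # hypothetical address inside Spoke 4 (10.2.2.0/24)

before = next_hop(system_routes, spoke4_host)
fixed_routes = system_routes + [("10.2.0.0/16", "virtual appliance 10.2.0.132")]
after = next_hop(fixed_routes, spoke4_host)

print(before)
print(after)
```

Before the Route Table is added, the only match for the Spoke 4 address is 0.0.0.0/0. Once the /16 route exists it is more specific than the default route, so the packet is handed to the remote firewall, while traffic to the remote hub itself still follows the more specific /24 peering route.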
ExpressRoute
Now the customer has decided to use ExpressRoute to connect to Azure – Sweet! But guess what – you don’t need 1 expensive circuit to each hub-and-spoke.
You can share a single circuit across multiple ExpressRoute gateways:
- ExpressRoute Standard: Up to 10 simultaneous connections to Virtual Network Gateways in 1+ regions in the same geopolitical region.
- ExpressRoute Premium: Up to 100 simultaneous connections to Virtual Network Gateways in 1+ regions in any geopolitical region.
FYI, ExpressRoute connections to the Azure Virtual WAN Hub must be of the Premium SKU.
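The SKU rules above can be expressed as a small decision function. This is only a sketch of the limits as quoted in this post, which may have changed since; the geopolitical region labels are hypothetical strings, and the function name is made up for the example.

```python
def required_sku(gateway_geos: list[str], circuit_geo: str) -> str:
    """Pick the circuit SKU needed for a set of Virtual Network Gateway connections.

    gateway_geos holds the geopolitical region of each gateway that connects
    to the circuit; circuit_geo is the geopolitical region of the circuit.
    """
    count = len(gateway_geos)
    if count > 100:
        raise ValueError("more than 100 connections exceeds the Premium limit quoted in the post")
    crosses_geo = any(geo != circuit_geo for geo in gateway_geos)
    if count > 10 or crosses_geo:
        return "Premium"
    return "Standard"


# Blue in West Europe and Green in North Europe share one European circuit.
same_geo = required_sku(["europe", "europe"], "europe")
# Moving Green to East US 2 crosses a geopolitical boundary.
cross_geo = required_sku(["europe", "north-america"], "europe")
print(same_geo, cross_geo)
```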
ExpressRoute is powered by BGP. All the on-premises routes that are advertised propagate through the ISP to the Microsoft edge router (“meet-me”) in the edge data centre. For example, if I want an ExpressRoute circuit to Azure West Europe (Middenmeer, Netherlands – not Amsterdam) I will probably (not always) get a circuit to the POP or edge data centre in Amsterdam. That gets me a physical low-latency connection onto the Microsoft WAN – and my BGP routes get to the meet-me router in Amsterdam. Now I can route to locations on that WAN. If I connect a VNet Gateway to that circuit to Blue in Azure West Europe, then my BGP routes will propagate from the meet-me router to the GatewaySubnet in the Blue hub, and then on to my firewall subnet.
BGP propagation is disabled in the spoke Route Tables to ensure all outbound flows go through the local firewall.
But that is not the extent of things! The hub-and-spoke peering connections allow Gateway Sharing from the hub and Use Remote Gateway from the spoke. With that configuration, BGP routes to the spoke get propagated to the GatewaySubnet in the hub, then to the meet-me router, through the ISP and then to the on-premises network. This is what our solution is based on.
Let’s imagine that the Green deployment is in North Europe (Dublin, Ireland). I could get a second ExpressRoute connection but:
- That will add cost
- That will not give me the clever solution that I want – but I could work around that with ExpressRoute Global Reach
I’m going to keep this simple – by the way, if I wanted Green to be in a different geopolitical region such as East US 2 then I could use ExpressRoute Premium to make this work.
In the Green hub, the Virtual Network Gateway will connect to the existing ExpressRoute circuit – no more money to the ISP! That means Green will connect to the same meet-me router as Blue. The on-premises routes will get into Green the exact same way as with Blue. And the routes to the Green spokes will also propagate down to on-premises via the meet-me router. That meet-me router knows all about the subnets in Blue and Green. And guess what BGP routers do? They propagate – so, the routes to all of the Blue subnets propagate to Green and vice-versa with the next hop (after the Virtual Network Gateway) being the meet-me router. There are no Route Tables or peering required in the hubs – it just works!
Now the path from Blue Spoke 1 to Green Spoke 4 is Blue Hub Firewall, Blue Virtual Network Gateway, <the Microsoft WAN>, Microsoft (meet-me) Router, <the Microsoft WAN>, Green Virtual Network Gateway, Green Hub Firewall, Green Spoke 4.
There are ways to make this scenario more interesting. Let’s say I have an office in London and I want to use Microsoft Azure. Some stuff will reside in UK South for compliance or performance reasons. But UK South is not a “hero region” as Microsoft calls them. There might be more advanced features that I want to use that are only in West Europe. I could use two ExpressRoute circuits, one to UK South and one to West Europe. Or I could set up a single circuit to London to get me onto the Microsoft WAN and connect this circuit to both of my deployments in UK South and West Europe. I have a quicker route going Office > ISP > London edge data center > Azure West Europe than from Office > ISP > Amsterdam edge data center > Azure West Europe because I have reduced the latency between me and West Europe by reducing the length of the ISP circuit and using the more-direct Microsoft WAN. Just like with Azure Front Door, you want to get onto the Microsoft WAN as quickly as possible and let it get you to your destination as quickly as possible.
Great post as usual! You’ve mixed up the colors in the site-to-site section though, you’ve written red instead of green in two places.
Thank you Aidan, great blog article ! I would be interested to follow up with you on the inter-region connectivity design for hubs residing in different geopolitical regions (let’s say Europe and Asia). You say ExpressRoute Premium will allow you to hook up a VNet with the circuit in that other region and through that connect to on-premise and also to the other hub. Are there any benefits/downsides using this and not Global VNet peering ? Thanks, Philipp
Thanks Philipp. ExpressRoute is the winner really. If you are using ExpressRoute at all, I’m told (by Microsoft) that peering by sharing the circuit will be cheaper than Global VNet Peering. Also, there is no real configuration to manage. If you use site-to-site VPN, then you have no choice – you have to use Global VNet Peering and control the UDRs. Of course, site-to-site VPN is cheaper than ExpressRoute but you might want to monitor the overall networking costs over time to see if there is a tipping point where a shared ExpressRoute circuit might be cheaper. In all cases, site-to-site networking will have lower latency with ExpressRoute than with VPN.
I understand that you can connect the express route circuit to the hubs in each region and have all routes propagated. Traffic flows from on premise to either region, as well as between regions through the express route connection. We have this setup and it works well.
However, when we introduced an AZFW in each region and wanted to force traffic from spoke 1 in region A to spoke 2 in region B through both FWs, we ultimately ended up implementing global VNet peering between the hubs and UDRs on the firewall subnets (with next hops defined as the cross-region FW IP) to get it to work. (As of this time we do not have a UDR on the gateway subnet forcing traffic through the AZFW)
“There are no Route Tables or peering required in the hubs – it just works!”
In your above setup would you not at least need to apply a UDR to the gateway subnets to force traffic egressing from the gateway to go through the firewall?
Perhaps I’m missing something but the only ways I could get this to work were to globally peer the hubs and implement UDRs on the firewall subnets, or to rely on ExpressRoute by leaving next hops as ExpressRoute gateway IPs and then applying UDRs on gateway subnets to force traffic through AZFW (Perhaps that is simply implied but not discussed in your post?)
Thanks for any clarification, another great post. Look forward to your online training, though this will be an early morning for myself (MST, 2am start time).
Yes, you have to place a route table on the gateway subnet and create a UDR for each spoke CIDR to point at the firewall private IP address. The “it just works” part is just getting the flows from one hub & spoke to another hub & spoke. I’ve been meaning to blog about the gateway route table topic, something that I am covering in my class on July 30th http://cloudmechanix.com/training-courses/july2020-azure-network-security/ 🙂
Aidan, with Global VNet Peering now available, in the Scenario drawing at the top of the page, where the red dotted line desired connection is, can that now be made with Global VNet Peering between the Blue and Green hubs?
Correct. I’ve done it. You need to be VERY careful with user-defined routes in the hubs to force traffic through any hub-based firewalls. Under the covers, global VNet Peering is what is probably happening in Azure vWAN too.
How about connecting office site 1 to office site 2 through Azure? I understand that kind of transit is not allowed in ExpressRoute.
Actually, ExpressRoute Global Reach allows you to use the Microsoft WAN as your WAN. If you use Azure Virtual WAN you can do any-any (including branch-branch) connectivity through a software-defined WAN. And people have used site-to-site VPN with routing appliances to transit the planet using the Microsoft WAN.
I am trying to connect a US East hub to the Tokyo region.
Do I need ExpressRoute Premium? Can I not do this if we have only site-to-site VPN at both locations?
Is it mandatory to use ExpressRoute premium to be able to peer Hubs in different regions?
Yes.
Hi, thanks for a good post. I am trying to wrap my head around hub-and-spoke and to figure something out:
Is there a way to make Spoke 4 find the way to Spoke 1 via the hubs without an NVA/firewall in the hub network?
No. You must add a router (firewall) in the hub unless you create a mess … err … mesh of VNet peering.
What about using VNet-to-VNet VPN connections with BGP enabled in the hubs instead of VNet peering? Would that remove the need for a router in the hubs and add transitivity?
That will result in slower inter-hub routing, limited to the speeds of the gateways.
Hi Aidan, great article as per usual.
In the hub-hub connection using ExpressRoute scenario, is the inter-region traffic limited to the bandwidth of the ExpressRoute circuit?
A client I work with asked this question recently as they want SQL replication traffic between regions, which requires high bandwidth. Their ExpressRoute is 100Mbps, so would their hub-hub traffic over the ExpressRoute be limited by this, or is the bottleneck for this 100Mbps at the on-premises end?
Cheers,
Archie
Good question – and I do not have that answer. If they want to push a maximum throughput, then peering the hubs might be the way to go. Keep in mind that the firewall/hub may end up being the choke point, especially if they use a third-party firewall NVA.
Aidan I can now answer this question as we implemented this topology and have thoroughly tested.
Despite having a 100Mbps ExpressRoute circuit, we were able to get 3Gbps+ bandwidth between VMs in spokes in different regions. From digging a bit deeper I believe the 3Gbps limit is actually at the VM NIC level. Microsoft want customers to purchase higher SKU VMs for higher bandwidth, but fortunately 3Gbps is more than enough for SQL replication.
Good to know that Microsoft don’t restrict your bandwidth when using ExpressRoute for transit.
Thanks, Archie – it looks like the limit of the circuit is implemented only at the MSEE router “external interface”, which is good. Therefore, we get expected speeds from resources through inter-hub connections.
Archie and Aidan, I tested this recently for some of my use-cases and based on my observations (https://azure.cybergav.in/azure-cross-region-network-testing), cross-region connectivity via the ExpressRoute gateways and MSEE routers seems to be limited by the bandwidth of the ExpressRoute circuit. I wouldn’t expect Microsoft to allow us to pay for and use a specific ExpressRoute Circuit bandwidth for on-premises Azure traffic but remove the bandwidth cap (on their MSEE routers) for ER transit between regions. Also, note that the ExpressRoute gateway (depending on SKU) has its own throughput limitations (https://docs.microsoft.com/en-us/azure/expressroute/expressroute-about-virtual-network-gateways) . So, circuit bandwidth aside, Archie would have had to use the Ultra-Performance SKU for the ExpressRoute gateway to push 3 Gbps.
Hi Aidan,
I have an existing setup with two ExpressRoute circuits in the Europe and US regions. We are using the hub-and-spoke model and a firewall is deployed in both of the hub VNets. The two hub VNets are connected with each other using the meet-me router as the next hop.
Now I have to deploy a new hub VNet in a third region with an S2S VPN gateway. How can I peer the new hub VNet (S2S) with the other two hub VNets?
You’ll have to peer the new hub with the old ones and enable route tables in the AzureFirewallSubnet of each of the three hubs to allow routing between old and new.
Very good article. Still, I want to clarify something.
I have 2 hub-and-spoke networks like you show in your picture.
I want to connect from a VM in a Blue spoke to a VM in a Green spoke.
The 2 hub networks are connected via peering. They are in the same region but in different subscriptions.
Do I need a router or firewall as a default gateway in both hub networks to route the traffic based on the routing table?
Or is a routing table sufficient?
Or am I totally thinking the wrong way ? 🙂
Best regards,
Ron
If you want 2 spokes to talk to each other, whether connected to the same hub or different hubs, you need some form of router. Azure Virtual Network Manager manages this in a single hub scenario but it is very limited at the moment – not ready for GA usage IMO. You need a next hop, some kind of appliance in each hub, with a UDR in the spoke saying “to get to X you must take a next hop on local appliance A”.