Connecting Azure Hub-And-Spoke Architectures Together

In this post, I will explain how you can connect multiple Azure hub-and-spoke (virtual data centre) deployments together using Azure networking, even across different Azure regions.

There is a lot to know here so here is some recommended reading that I previously published:

If you are using Azure Virtual WAN Hub then some stuff will be different and that scenario is not covered fully here – Azure Virtual WAN Hub has a preview (today) feature for Any-to-Any routing.

The Scenario

In this case, there are two hub-and-spoke deployments:

  • Blue: Multiple virtual networks covered by the CIDR of 10.1.0.0/16
  • Green: Another set of multiple virtual networks covered by the CIDR of 10.2.0.0/16

I’m being strategic with the addressing of each hub-and-spoke deployment, ensuring that a single CIDR will include the hub and all spokes of a single deployment – this will come in handy when we look at User-Defined Routes.
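The addressing rule above can be checked with a few lines of Python. This is only a sketch: the two /16 ranges come from this post, the hub /24s and Green Spoke 4 (10.2.2.0/24) are mentioned later in the post, and the other spoke prefixes and the `uncovered_vnets` helper are assumptions for illustration.

```python
import ipaddress

# Deployment-wide CIDRs from the post.
DEPLOYMENTS = {
    "Blue": ipaddress.ip_network("10.1.0.0/16"),
    "Green": ipaddress.ip_network("10.2.0.0/16"),
}

# Virtual network prefixes per deployment. The hub /24s and 10.2.2.0/24
# (Green Spoke 4) appear in the post; the remaining prefixes are assumed.
VNETS = {
    "Blue": ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24"],
    "Green": ["10.2.0.0/24", "10.2.1.0/24", "10.2.2.0/24"],
}


def uncovered_vnets(deployments, vnets):
    """Return (deployment, prefix) pairs that fall outside their deployment CIDR."""
    problems = []
    for name, prefixes in vnets.items():
        supernet = deployments[name]
        for prefix in prefixes:
            if not ipaddress.ip_network(prefix).subnet_of(supernet):
                problems.append((name, prefix))
    return problems


# An empty list means one CIDR per deployment covers the hub and every spoke.
print(uncovered_vnets(DEPLOYMENTS, VNETS))
```

If a spoke were accidentally addressed outside its deployment's range, it would show up in the returned list, and the single-route trick described later would no longer cover it.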

These hub-and-spoke deployments could be in the same Azure region or in different Azure regions. The requirements are:

  • Any spoke wishes to talk to another spoke it will route through the local firewall in the local hub.
  • All traffic coming into a spoke from an outside source, such as the other hub-and-spoke, must route through the local firewall in the local hub.

That would mean that Spoke 1 must route through Hub 1 and then Hub 2 to talk to Spoke 4. The firewall can be a third-party appliance or the Azure Firewall.

Core Routing

Each subnet in each spoke needs a route to the outside world (0.0.0.0/0) via the local firewall. For example:

  • The Blue firewall backend/private IP address is 10.1.0.132
  • A Route Table for each subnet is created in the Blue deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.1.0.132
  • The Green firewall backend/private IP address is 10.2.0.132
  • A Route Table for each subnet is created in the Green deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.2.0.132

Note: Some network-connected PaaS services, e.g. API Management or SQL Managed Instance, require additional routes to the “control plane” that will bypass the local firewall.

Site-to-Site VPN

In this scenario, the organisation is connecting on-premises networks to 1 or more of the hub-and-spoke deployments with site-to-site VPN connections. One connection goes to the Blue hub and another goes to the Green hub.

To connect Blue and Green you will need to configure VNet Peering, which can work inside a region or across regions (using Microsoft’s low latency WAN, the second-largest private WAN on the planet). Each end of peering needs the following settings (the names of the settings change so I’m not checking their exact naming):

  • Enabled: Yes
  • Allow Transit: Yes
  • Use Remote Gateway: No
  • Allow Gateway Sharing: No

Let’s go back and do some routing theory!

That peering connection will add a hidden Default (“system”) route to each subnet in each hub:

  • Blue hub subnets: A route to 10.2.0.0/24
  • Green hub subnets: A route to 10.1.0.0/24

Now imagine you are a packet in Spoke 1 trying to get to Spoke 4. You’re sent to the firewall in Blue Hub 1. The firewall lets the traffic out (if a rule allows it) and now the packet sits in the egress/frontend/firewall subnet and is trying to find a route to 10.2.2.0/24. The peering-created Default route covers 10.2.0.0/24 but not the subnet for Spoke 4. So that means the default route to 0.0.0.0/0 (Internet) will be used and the packet is lost.
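The packet's fate can be reproduced with a small longest-prefix-match sketch in Python. It is a simplified model, not Azure's implementation: only three effective routes are listed for the Blue firewall subnet, the `next_hop` helper is hypothetical, and 10.2.2.4 is an assumed address for a VM in Spoke 4.

```python
import ipaddress


def next_hop(effective_routes, destination):
    """Return the matching route with the longest prefix (longest-prefix match only)."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in effective_routes
        if dest in ipaddress.ip_network(prefix)
    ]
    network, hop = max(matches, key=lambda match: match[0].prefixlen)
    return str(network), hop


# Simplified effective routes on the Blue firewall subnet, before any fix.
blue_firewall_subnet = [
    ("10.1.0.0/24", "VNet local"),    # the Blue hub's own address space
    ("10.2.0.0/24", "VNet peering"),  # added by the hub-to-hub peering
    ("0.0.0.0/0", "Internet"),        # default system route
]

# The Green firewall sits inside the peered hub range, so it is reachable.
print(next_hop(blue_firewall_subnet, "10.2.0.132"))
# A VM in Spoke 4 (assumed address) only matches 0.0.0.0/0.
print(next_hop(blue_firewall_subnet, "10.2.2.4"))
```

The second lookup lands on the Internet route because nothing more specific than 0.0.0.0/0 contains 10.2.2.0/24.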

To fix this you will need to add a Route Table to the egress/frontend/firewall subnet in each hub:

  • Blue firewall subnet Route Table: 10.2.0.0/16 via virtual appliance 10.2.0.132
  • Green firewall subnet Route Table: 10.1.0.0/16 via virtual appliance 10.1.0.132

Thanks to my clever addressing of each hub-and-spoke, a single route will cover all packets leaving Blue and trying to get to any spoke in Green and vice-versa.
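Because each deployment has one CIDR and one firewall IP, the firewall subnet routes can be generated rather than typed by hand. The sketch below uses the Blue and Green values from the scenario; the dictionary keys and the `firewall_subnet_routes` helper are hypothetical names, and the output is plain data rather than a call to any Azure API.

```python
import ipaddress

# Deployment CIDRs and firewall private IPs as given in the post.
DEPLOYMENTS = {
    "Blue": {"cidr": "10.1.0.0/16", "firewall_ip": "10.1.0.132"},
    "Green": {"cidr": "10.2.0.0/16", "firewall_ip": "10.2.0.132"},
}


def firewall_subnet_routes(deployments):
    """For each hub, build one route per remote deployment via the remote firewall."""
    tables = {}
    for local in deployments:
        routes = []
        for remote, info in deployments.items():
            if remote == local:
                continue
            cidr = ipaddress.ip_network(info["cidr"])
            hop = ipaddress.ip_address(info["firewall_ip"])
            # Sanity check: the remote firewall must live inside the remote CIDR.
            if hop not in cidr:
                raise ValueError(f"{remote} firewall {hop} is outside {cidr}")
            routes.append({
                "address_prefix": str(cidr),
                "next_hop_type": "VirtualAppliance",
                "next_hop_ip": str(hop),
            })
        tables[local] = routes
    return tables


for hub, routes in firewall_subnet_routes(DEPLOYMENTS).items():
    print(hub, routes)
```

Adding a third deployment to `DEPLOYMENTS` would give every hub one extra route, which is the mesh of hub-to-hub routes you end up with when more than two hubs are peered.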

ExpressRoute

Now the customer has decided to use ExpressRoute to connect to Azure – Sweet! But guess what – you don’t need 1 expensive circuit to each hub-and-spoke.

You can share a single circuit across multiple ExpressRoute gateways:

  • ExpressRoute Standard: Up to 10 simultaneous connections to Virtual Network Gateways in 1+ regions in the same geopolitical region.
  • ExpressRoute Premium: Up to 100 simultaneous connections to Virtual Network Gateways in 1+ regions in any geopolitical region.
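As a quick planning aid, the two rules above can be written as a small check. This is a sketch only: the limits are the ones quoted in this post (verify them against current Azure documentation before relying on them), and the geopolitical region labels and the `sku_supports` helper are assumptions for illustration.

```python
# Limits as quoted in the post; treat them as assumptions, not current Azure limits.
SKU_RULES = {
    "Standard": {"max_connections": 10, "cross_geopolitical": False},
    "Premium": {"max_connections": 100, "cross_geopolitical": True},
}


def sku_supports(sku, circuit_geo, gateway_geos):
    """Check whether one circuit of this SKU can serve gateways in the given geopolitical regions."""
    rules = SKU_RULES[sku]
    if len(gateway_geos) > rules["max_connections"]:
        return False
    if not rules["cross_geopolitical"]:
        # Standard: every gateway must be in the circuit's own geopolitical region.
        return all(geo == circuit_geo for geo in gateway_geos)
    return True


# Two hubs in the same geopolitical region: Standard is enough.
print(sku_supports("Standard", "Europe", ["Europe", "Europe"]))
# One hub in another geopolitical region: Standard fails, Premium works.
print(sku_supports("Standard", "Europe", ["Europe", "US"]))
print(sku_supports("Premium", "Europe", ["Europe", "US"]))
```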

FYI, ExpressRoute connections to the Azure Virtual WAN Hub must be of the Premium SKU.

ExpressRoute is powered by BGP. All the on-premises routes that are advertised propagate through the ISP to the Microsoft edge router (“meet-me”) in the edge data centre. For example, if I want an ExpressRoute circuit to Azure West Europe (Middenmeer, Netherlands – not Amsterdam) I will probably (not always) get a circuit to the POP or edge data centre in Amsterdam. That gets me a physical low-latency connection onto the Microsoft WAN – and my BGP routes get to the meet-me router in Amsterdam. Now I can route to locations on that WAN. If I connect a Virtual Network Gateway in the Blue hub, in Azure West Europe, to that circuit, then my BGP routes will propagate from the meet-me router to the GatewaySubnet in the Blue hub, and then on to my firewall subnet.

BGP propagation is disabled in the spoke Route Tables to ensure all outbound flows go through the local firewall.

But that is not the extent of things! The hub-and-spoke peering connections allow Gateway Sharing from the hub and Use Remote Gateway from the spoke. With that configuration, BGP routes to the spoke get propagated to the GatewaySubnet in the hub, then to the meet-me router, through the ISP and then to the on-premises network. This is what our solution is based on.

Let’s imagine that the Green deployment is in North Europe (Dublin, Ireland). I could get a second ExpressRoute connection but:

  • That will add cost
  • That will not give me the clever solution that I want – but I could work around that with ExpressRoute Global Reach

I’m going to keep this simple – by the way, if I wanted Green to be in a different geopolitical region such as East US 2 then I could use ExpressRoute Premium to make this work.

In the Green hub, the Virtual Network Gateway will connect to the existing ExpressRoute circuit – no more money to the ISP! That means Green will connect to the same meet-me router as Blue. The on-premises routes will get into Green the exact same way as with Blue. And the routes to the Green spokes will also propagate down to on-premises via the meet-me router. That meet-me router knows all about the subnets in Blue and Green. And guess what BGP routers do? They propagate – so, the routes to all of the Blue subnets propagate to Green and vice-versa with the next hop (after the Virtual Network Gateway) being the meet-me router. There are no Route Tables or peering required in the hubs – it just works!
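The propagation described above can be pictured with a toy model in Python. It is not real BGP (no AS paths, filtering or route selection), the `learned_routes` helper is hypothetical, and the on-premises prefix and the Blue spoke prefix are assumed; it only shows that every peer of the shared meet-me router ends up learning the prefixes of all the others.

```python
def learned_routes(advertised):
    """Toy model: the shared router re-advertises each peer's prefixes to all other peers."""
    return {
        peer: sorted(
            prefix
            for other, prefixes in advertised.items()
            if other != peer
            for prefix in prefixes
        )
        for peer in advertised
    }


# Prefixes each peer advertises to the meet-me router.
advertised = {
    "on-premises": ["192.168.0.0/16"],                # assumed on-premises range
    "Blue gateway": ["10.1.0.0/24", "10.1.1.0/24"],   # hub plus an assumed spoke
    "Green gateway": ["10.2.0.0/24", "10.2.2.0/24"],  # hub plus Spoke 4
}

for peer, prefixes in learned_routes(advertised).items():
    print(peer, "learns", prefixes)
```

In this model the Green gateway learns the Blue prefixes without any peering or Route Table between the hubs, which is the behaviour the paragraph above relies on.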

Now the path from Blue Spoke 1 to Green Spoke 4 is Blue Hub Firewall, Blue Virtual Network Gateway, <the Microsoft WAN>, Microsoft (meet-me) Router, <the Microsoft WAN>, Green Virtual Network Gateway, Green Hub Firewall, Green Spoke 4.

There are ways to make this scenario more interesting. Let’s say I have an office in London and I want to use Microsoft Azure. Some stuff will reside in UK South for compliance or performance reasons. But UK South is not a “hero region” as Microsoft calls them. There might be more advanced features that I want to use that are only in West Europe. I could use two ExpressRoute circuits, one to UK South and one to West Europe. Or I could set up a single circuit to London to get me onto the Microsoft WAN and connect this circuit to both of my deployments in UK South and West Europe. I have a quicker route going Office > ISP > London edge data center > Azure West Europe than from Office > ISP > Amsterdam edge data center > Azure West Europe because I have reduced the latency between me and West Europe by reducing the length of the ISP circuit and using the more-direct Microsoft WAN. Just like with Azure Front Door, you want to get onto the Microsoft WAN as quickly as possible and let it get you to your destination as quickly as possible.

31 thoughts on “Connecting Azure Hub-And-Spoke Architectures Together”

  1. Thank you Aidan, great blog article ! I would be interested to follow up with you on the inter-region connectivity design for hubs residing in different geopolitical regions (let’s say Europe and Asia). You say ExpressRoute Premium will allow you to hook up a VNet with the circuit in that other region and through that connect to on-premise and also to the other hub. Are there any benefits/downsides using this and not Global VNet peering ? Thanks, Philipp

    1. Thanks Philip. ExpressRoute is the winner really. If you are using ExpressRoute at all, I’m told (by Microsoft) that peering by sharing the circuit will be cheaper than Global VNet Peering. Also, there is no real configuration to manage. If you use site-to-site VPN, then you have no choice – you have to use Global VNet Peering and control the UDRs. Of course, site-to-site VPN is cheaper than ExpressRoute but you might want to monitor the overall networking costs over time to see if there is a tipping point where a shared ExpressRoute circuit might be cheaper. In all cases, site-to-site networking will have lower latency with ExpressRoute than with VPN.

  2. I understand that you can connect the express route circuit to the hubs in each region and have all routes propagated. Traffic flows from on premise to either region, as well as between regions through the express route connection. We have this setup and it works well.

    However when we introduced an AZFW in each region and wanted to force traffic from spoke 1 in region A to spoke 2 in region B through both FW’s, we ultimately resulted in implementing global vNet peering between the Hubs and UDRs on the firewall subnets (with next hops defined as the cross regions FW IP) to get it to work. (As of this time we do not have a UDR on the gateway subnet forcing traffic through the AZFW)

    “There are no Route Tables or peering required in the hubs – it just works!”
    In your above setup would you not at-least need to apply a UDR to the gateway subnets to force traffic egressing from the gateway to go through the firewall?

    Perhaps I’m missing something but the only ways I could get this to work was to globally peer the hubs and implement UDRs on the firewall subnets, or to rely on express route by leaving next hops as express route gateway IPs and then applying UDR’s on gateway subnets for force traffic through AZFW (Perhaps that is simply implied but not discussed in your post?)

    Thanks for any clarification, another great post. Look forward to your online training, though this will be an early morning for myself (MST, 2am start time).

      1. Aidan, with Global vnet peering now available, in the Scenario drawing at the top of the page, where the red dotted line desired connection is, can that now be made with a Global vnet peering between the blue and green hubs?

        1. Correct. I’ve done it. You need to be VERY careful with user-defined routes in the hubs to force traffic through any hub-based firewalls. Under the covers, global VNet Peering is what is probably happening in Azure vWAN too.

  3. How about connecting office site1 to office site2 through azure. I understand that kind of transit is not allowed in express route.

    1. Actually, ExpressRoute Global Reach allows you to use the Microsoft WAN as your WAN. If you use Azure Virtual WAN you can do any-any (including branch-branch) connectivity through a software-defined WAN. And people have used site-to-site VPN with routing appliances to transit the planet using the Microsoft WAN.

  4. I am trying to connect a US East hub to the Tokyo region.
    Do I need ExpressRoute Premium? Can I not do this if we have only site-to-site VPN at both locations?

    Is it mandatory to use ExpressRoute premium to be able to peer Hubs in different regions?

  5. Hi, thanks for a good post. Trying to wrap my head around Hub-Spoke and are trying to figure something out:

    Is there a way to make spoke4 find the way to spoke1 via the hubs without an NVA/firewall in the hub network?

      1. what about using vnet-to-vnet VPN connections with BGP enabled in the hubs instead of vnet peering, would that remove the need of a router in the hubs and add transitivity?

  6. Hi Aidan, great article as per usual.

    In the hub-hub connection using ExpressRoute scenario, is the inter-region traffic limited to the bandwidth of the ExpressRoute circuit?

    A client I work with asked this question recently as they want SQL replication traffic between regions, which requires high bandwidth. Their ExpressRoute is 100Mbps, so would their hub-hub traffic over the ExpressRoute be limited by this, or is the bottleneck for this 100Mbps at the on-premises end?

    Cheers,

    Archie

    1. Good question – and I do not have that answer. If they want to push a maximum throughput, then peering the hubs might be the way to go. Keep in mind that the firewall/hub may end up being the choke point, especially if they use a third-party firewall NVA.

      1. Aidan I can now answer this question as we implemented this topology and have thoroughly tested.
        Despite having a 100Mbps ExpressRoute circuit, we were able to get 3Gbps+ bandwidth between VMs in spokes in different regions. From digging a bit deeper I believe the 3Gbps limit is actually at the VM NIC level. Microsoft want customers to purchase higher SKU VMs for higher bandwidth, but fortunately 3Gbps is more than enough for SQL replication.

        Good to know that Microsoft don’t restrict your bandwidth when using ExpressRoute for transit.

        1. Thanks, Archie – it looks like the limit of the circuit is implemented only at the MSEE router “external interface”, which is good. Therefore, we get expected speeds from resources through inter-hub connections.

          1. Archie and Aidan, I tested this recently for some of my use-cases and based on my observations (https://azure.cybergav.in/azure-cross-region-network-testing), cross-region connectivity via the ExpressRoute gateways and MSEE routers seems to be limited by the bandwidth of the ExpressRoute circuit. I wouldn’t expect Microsoft to allow us to pay for and use a specific ExpressRoute Circuit bandwidth for on-premises Azure traffic but remove the bandwidth cap (on their MSEE routers) for ER transit between regions. Also, note that the ExpressRoute gateway (depending on SKU) has its own throughput limitations (https://docs.microsoft.com/en-us/azure/expressroute/expressroute-about-virtual-network-gateways) . So, circuit bandwidth aside, Archie would have had to use the Ultra-Performance SKU for the ExpressRoute gateway to push 3 Gbps.

  7. Hi Aidan,

    I have an existing setup with two ExpressRoute circuits in the Europe & US regions. We are using the hub-and-spoke model and a firewall is deployed in both hub VNets. Both hub VNets are connected with each other using the meet-me router as the next hop.
    Now, I have to deploy a new hub VNet in a third region with an S2S VPN gateway. How can I peer the new hub VNet (S2S) with the other two hub VNets?

    1. You’ll have to peer the new hub with the old ones and enable route tables in the AzureFirewallSubnet of each of the three hubs to allow routing between old and new.

  8. Very good article. Still, I want to clarify something.

    I have 2 hub-and-spoke networks like you show in your picture.
    I want to connect from a VM in Spoke Blue to a VM in Spoke Green.

    The 2 hub networks are connected via peering. They are in the same region but in different subscriptions.

    Do I need a router or firewall as a default gateway in both hub networks to route the traffic based on the routing table?
    Or is a routing table sufficient?

    Or am I totally thinking the wrong way? 🙂

    Best regards,

    Ron

    1. If you want 2 spokes to talk to each other, whether connected to the same hub or different hubs, you need some form of router. Azure Virtual Network Manager manages this in a single hub scenario but it is very limited at the moment – not ready for GA usage IMO. You need a next hop, some kind of appliance in each hub, with a UDR in the spoke saying “to get to X you must take a next hop on local appliance A”.

  9. Great article about connectivity! I would like to clarify and ask about the site-to-site scenario between multiple hub.

    From my understanding it’s not possible to use VNet peering for a transitive connection where there are VPN gateways on both sides.

    https://learn.microsoft.com/en-US/azure/virtual-network/virtual-network-troubleshoot-peering-issues#both-the-hub-virtual-network-and-the-spoke-virtual-network-have-a-vpn-gateway

    Do you mean we should use vnet-to-vnet connections between the hub? or do I miss something on the setup?

    1. With multiple hubs, you will be creating a mesh of peering between the hubs. As long as you follow my advice on compute then the routing configuration will be simple.

      In theory, you could configure transitive routing by using a intermediate firewall/router as a next hop. But I’d ask, why do that and make one regional footprint a dependency for reaching a third regional footprint? Plus that is no less complex – one could argue that it’s more complex.

      Follow my advice, and keep it simple and reliable.

      1. Hi Aidan,

        Thanks for the reply, really appreciate the discussion. I am still not clear on the how though, maybe I wasn’t exactly being thorough and detailed on my question/scenario.

        So here’s the scenario:
        We have multiple hub and spoke (for simplicity let’s just say 2): HS1 & HS2, In each hub there’s only Gateway + Firewall (no compute resource)

        Also, each hub is connected to different on-prem environment through site-to-site VPN
        HS1 -> On-prem01
        HS2 -> On-prem02

        So in general, usually spoke from HS1 would talk to On-prem01 and spoke from HS2 would talk to On-prem02

        Occasionally there’s a need for a spoke from HS1 to talk to Onprem02 (not often, but frequent enough, and the same with a spoke from HS2)

        I opted for vnet-to-vnet to facilitate the connection between the spokes of HS1 and Onprem02 as I was under the impression that with meshed hubs (vnet peering between HS1 and HS2) it would not work since both hubs have a VPN gateway.

        Would you do it in another way?
        Thanks again

        1. 2 solutions I can think of without putting much time into it:
          1. Add connections, such as S2S VPN, from onprem1 to HS2 and onprem2 to HS1. If the need for comms is there, then match it with the connections. That’s the logical approach.
          2. Add routes. Create UDRs in the firewall subnet in HS2 hub to on-prem1 with a next hop of the firewall in HS1, and vice versa. That’s more messy.

  10. Thanks for writing this article Aidan, very informative about an architecture not well documented!

    Although there is something I am still not clear about… the ExpressRoute ‘resource’ created using Azure portal.

    I understand to complete the ExpressRoute connectivity you need to create an ExpressRoute resource in Azure which has SKU (Local, Standard or Premium) AND a ‘region’ associated with it. Let’s say you created that in Green region as Standard so effectively your On-Prem is connected to Green region then, no?

    Now you have a Vnet Gateway in Blue region to connect, won’t that connect to the ExpressRoute resource created in Green region and from there it will direct the traffic to Microsoft (meet-me) –> on-prem?

    Or I am missing something?

    1. Yup, you’re missing something. The “location” of an ExpressRoute circuit is not a resource location. It is not in an Azure region. It is a peering location – the place where the circuit terminates in the Microsoft WAN – https://learn.microsoft.com/en-us/azure/expressroute/expressroute-locations-providers. It’s actually one of the 170-ish edge data centers. For example, North Europe is in west Dublin in an area called Grange Castle (it’s not secret – it’s on Google Maps). But there are several Dublin locations available for ExpressRoute.

      When you route from on-prem to Azure via ExpressRoute, the ISP brings you to the peering location. The connection from your ER Gateway brings you from that peering location to the ER Gateway.

      In your scenario, Green Location (lets say West Europe in Middenmeer, NL) connects to a Standard ExpressRoute circuit terminating in Amsterdam. But we could also have North Europe (Dublin, IE) connect to the same ER circuit. For an office in the lowlands, this would be ideal because they would have a short hop via the circuit to the Amsterdam location, and then accelerated connectivity to the Azure regions over the MS WAN.
