Azure Virtual WAN – Connectivity

In this post, I’ll explain how Azure Virtual WAN offers its core service: connections.

SD-WAN

Some of you might be thinking – this is just for large corporations and I’m outta here. Don’t run just yet. Azure Virtual WAN is a rethinking of how to:

  • Connect users to Azure services and on-premises at the same time
  • Connect sites to Azure and (optionally) other sites
  • Replace the legacy hardware-defined WAN
  • Connect Azure virtual networks together.

That first point is quite timely – connecting users to services. Work-from-home (WFH) has forced enterprises to find ways to connect users to services no matter where they are; previously, that connectivity was often limited to a privileged few. The pandemic forced organisations, small and large, to re-think productivity connectivity and to scale out. Before COVID-19 struck, I was starting to encounter businesses that were considering (some even starting) to replace their legacy MPLS WAN with a software-defined WAN (SD-WAN), where media of different types, suited to different kinds of sites/users/services, are aggregated via appliances. An SD-WAN is lower cost and more flexible, and by leveraging local connectivity, it enables smaller locations, such as offices or retail outlets, to have an affordable direct connection to the cloud for better performance. How the on-premises part of the SD-WAN is managed is completely up to you; some will take direct control and some will outsource it to a network service provider.

Connections

Azure Virtual WAN is all about connections. When you start to read about the new Custom Routing model in Azure Virtual WAN, you’ll see how route tables are associated with connections. In summary, a connection is a link between a Hub and either an on-premises location (referred to as a branch, even if it’s HQ) or a spoke virtual network. And now we need to talk about some Azure resources.

Azure Resources

I’ve provided lots more depth on this topic elsewhere so I will keep this to the basics. There are two core resources in Azure Virtual WAN:

  • A Virtual WAN
  • A Hub

The Virtual WAN is a logical resource that provides a global service, although it is actually located in one Azure region. Any Hubs that are connected to this Virtual WAN resource can automatically talk to each other, route their connections to another Hub’s connections, and share resources.

A Virtual WAN Hub is similar to the hub in an Azure hub & spoke architecture. It is a central routing point (with a hidden virtual router) that is the meeting point for any connections to that hub. An Azure region can have one Hub per Virtual WAN. That means I can have one Hub in West Europe and one Hub in East US. The Hubs must be connected to a Virtual WAN resource; if they share a Virtual WAN resource, then their connections can talk to each other. I might have all my branches in Europe connect to the Hub in West Europe, and I will connect all my spoke virtual networks in West Europe to the Hub in West Europe too; this means that by default (and I can control this – see the sketch after this list):

  • The virtual networks can route to each other
  • The virtual networks can route to the branches
  • The branches can route to the virtual networks
  • The branches can route to other branches
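
To make that concrete, here’s a minimal sketch of the resource model using the Python azure-mgmt-network SDK. Every name, prefix, and ID below is a hypothetical placeholder, so treat it as an illustration of how the pieces relate rather than a production recipe.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    HubVirtualNetworkConnection, SubResource, VirtualHub, VirtualWAN,
)

# Hypothetical subscription, resource group, and resource names throughout.
client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg = "rg-wan"

# 1. The Virtual WAN: a logical, global resource that lives in one region.
wan = client.virtual_wans.begin_create_or_update(
    rg, "contoso-vwan",
    VirtualWAN(location="westeurope", allow_branch_to_branch_traffic=True),
).result()

# 2. A Hub (one per region, per Virtual WAN) with its own address prefix.
hub = client.virtual_hubs.begin_create_or_update(
    rg, "hub-westeurope",
    VirtualHub(
        location="westeurope",
        virtual_wan=SubResource(id=wan.id),
        address_prefix="10.100.0.0/23",
    ),
).result()

# 3. A connection: the link between the Hub and a spoke virtual network.
#    This is the object that Custom Routing associates route tables with.
client.hub_virtual_network_connections.begin_create_or_update(
    rg, hub.name, "spoke1-connection",
    HubVirtualNetworkConnection(
        remote_virtual_network=SubResource(id="<spoke1-vnet-resource-id>"),
    ),
).result()
```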

We can extend this routing by connecting the branches in North America to the East US Hub and the spoke virtual networks in East US to the East US Hub. Yes; all those North American locations can route to each other. Because the Hubs are connected to a common Virtual WAN, the routing now extends across the Microsoft WAN. That means a retail outlet in the further reaches of northwest rural Ireland can connect to services hosted in East US, via a connection to the Hub in West Europe, and then hop across the Atlantic Ocean using Microsoft’s low-latency WAN. Nice, right? Even better – it routes just like that automatically if you are using SD-WAN appliances in the branches.

Note that a managed WAN might wire up that retail outlet differently, but still provide a fairly low-latency connection to the local Hub.

Branch Connections

If you have done any Azure networking then you are probably familiar with:

  • Site-to-site VPN: Connecting a location with a cost-effective, but no-SLA, VPN tunnel to Azure.
  • ExpressRoute: A circuit rented from an ISP for a low-latency, high-bandwidth, SLA-supported private connection to Azure.
  • Point-to-Site VPN: Enabling end users to create a private VPN tunnel to Azure from their devices while on the move or working from home.

Each of the above is enabled in Azure using a Virtual Network Gateway, and each gateway runs independently. Routing from branch to branch is not an intended purpose. Routing from a user to a branch is not an intended purpose. The Virtual Network Gateway’s job is to connect a site or user to Azure virtual networks.

The Azure Virtual WAN Hub supports gateways – as hidden resources that must be enabled and configured. All three of the above connection types are supported as three different types of gateway, sized on a billing concept called scale units – more scale units means more bandwidth and more cost, with a maximum hub throughput of 40 Gbps (including traffic to/from/between spokes).
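
For example, here’s a hedged sketch (Python azure-mgmt-network SDK, hypothetical names and IDs) of enabling a site-to-site VPN gateway in a Hub with two scale units; as far as I know, one S2S scale unit corresponds to 500 Mbps of aggregate throughput.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SubResource, VpnGateway

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# S2S VPN gateway in the Hub; scale units set the bandwidth (and the bill).
client.vpn_gateways.begin_create_or_update(
    "rg-wan", "hub-westeurope-vpngw",
    VpnGateway(
        location="westeurope",
        virtual_hub=SubResource(id="<hub-resource-id>"),
        vpn_gateway_scale_unit=2,  # more scale units = more bandwidth, more cost
    ),
).result()
```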

Note that a Secured Virtual Hub, featuring the Azure Firewall, has a limit of 30 Gbps if all traffic is routed through that firewall.

You can be flexible with the branch connections. Some locations might be small and have a VPN connection to the Hub. Other locations might require an SLA and use ExpressRoute. Some might require lower latency or greater bandwidth and use higher SKUs of ExpressRoute. And of course, some users will be on the move or at home and use P2S VPN. A combination of all three connection types can be used at once, providing each location and user with the connection and cost that suits them best.

ExpressRoute

You will be using ExpressRoute Premium for Azure Virtual WAN; this is a requirement. I don’t think there’s really too much more to say here – the tech just works once the circuit is up, and a combination of Global Reach and the any-to-any connections/routing of Azure Virtual WAN means that things will just work.

Site-to-Site VPN

The VPN gateway is deployed in an active/active cluster configuration with two public IP addresses. A branch using VPN for connectivity can have:

  • A single VPN connection over a single ISP connection.
  • Resilient VPN connections over two ISP connections, ideally with different physical providers or even media types.

An on-premises SD-WAN appliance is strongly recommended for Azure Virtual WAN, but you can use any VPN appliance that Microsoft Azure supports for route-based VPN; if you go the generic-appliance route, you can use BGP or static address prefixes (the Azure Virtual WAN alternative to Local Network Gateway-provided prefixes) for routing to on-premises.
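
A sketch of what that looks like with the Python SDK – a VPN site describing the branch, then a connection from the Hub’s VPN gateway to that site. All names and addresses are made up, and both BGP settings and static prefixes are shown purely for illustration.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    AddressSpace, BgpSettings, SubResource, VpnConnection, VpnSite,
)

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg = "rg-wan"

# The VPN site describes the branch: its public VPN endpoint, plus either
# BGP settings or static address prefixes (the Virtual WAN stand-in for a
# Local Network Gateway's prefixes).
site = client.vpn_sites.begin_create_or_update(
    rg, "branch-galway",
    VpnSite(
        location="westeurope",
        virtual_wan=SubResource(id="<vwan-resource-id>"),
        ip_address="203.0.113.10",  # branch appliance public IP (placeholder)
        address_space=AddressSpace(address_prefixes=["192.168.10.0/24"]),
        bgp_properties=BgpSettings(asn=65010, bgp_peering_address="192.168.10.1"),
    ),
).result()

# The connection links the Hub's VPN gateway to the site; with BGP enabled,
# learned routes are used instead of the static prefixes.
client.vpn_connections.begin_create_or_update(
    rg, "hub-westeurope-vpngw", "branch-galway-conn",
    VpnConnection(remote_vpn_site=SubResource(id=site.id), enable_bgp=True),
).result()
```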

Point-to-Site (P2S) VPN

The P2S gateway offers a superior service to what you might have observed with the traditional Virtual Network Gateway for VPN. Connectivity from the user device is to a Hub with a routing appliance. Any-to-any connectivity treats the user device as a branch, albeit in a dedicated network address space. Once the user has established the VPN tunnel, they can route to (by default):

  • Any spoke virtual network connected to the Hub
  • Any spoke virtual network connected to another Hub on the same Virtual WAN
  • Any branch office connected to any Hub on the Virtual WAN

In summary, the user is connected to the WAN as a result of being connected to the Hub and is subject to the routing and firewall configurations of that Hub. That’s a pretty nice WFH connectivity solution.

Note that you have support for certificate and RADIUS authentication in P2S VPN, as well as both the OpenVPN and Microsoft clients.
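
A hedged sketch of a P2S gateway in a Hub with the Python SDK (hypothetical IDs; it assumes a VpnServerConfiguration with your chosen tunnel type and certificate/RADIUS authentication settings already exists):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    AddressSpace, P2SConnectionConfiguration, P2SVpnGateway, SubResource,
)

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Assumes an existing VpnServerConfiguration holding the tunnel type and the
# certificate/RADIUS authentication settings; the gateway just references it.
client.p2s_vpn_gateways.begin_create_or_update(
    "rg-wan", "hub-westeurope-p2sgw",
    P2SVpnGateway(
        location="westeurope",
        virtual_hub=SubResource(id="<hub-resource-id>"),
        vpn_server_configuration=SubResource(id="<vpn-server-config-resource-id>"),
        vpn_gateway_scale_unit=1,
        p2_s_connection_configurations=[
            P2SConnectionConfiguration(
                name="default",
                # A dedicated address space for user devices; once connected,
                # the device is effectively treated as a branch.
                vpn_client_address_pool=AddressSpace(
                    address_prefixes=["172.16.100.0/24"]
                ),
            )
        ],
    ),
).result()
```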

The Connectivity Experience

Imagine we’re back in normal times again with common business travel. A user in Amsterdam could sit down at their desk in the office and connect to services in West Europe via VPN. They could travel to a small office in Luxembourg and connect to the same services via VPN with no discernible difference. That user could travel to a conference in London and use P2S VPN from their hotel room to connect via the Amsterdam Hub. Now that user might get a jet to Philadelphia, and use their mobile hotspot to offer connectivity to the Azure Virtual WAN Hub in East US via P2S VPN – and the experience is no different!

One concept I would like to try out, and get a support statement on, is abstracting the IP addresses and locations of the P2S gateways using Azure Traffic Manager, so the user only needs to VPN to a single FQDN and is directed (using the performance profile) to the closest (lowest-latency) Hub in the Virtual WAN with a P2S gateway.
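
To be clear, I haven’t validated this, but the Traffic Manager side of the idea would look something like the following sketch (Python azure-mgmt-trafficmanager SDK; names and FQDNs are hypothetical). Whether the P2S clients behave well behind a Traffic Manager FQDN is exactly the support statement I’d want first.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.trafficmanager import TrafficManagerManagementClient
from azure.mgmt.trafficmanager.models import (
    DnsConfig, Endpoint, MonitorConfig, Profile,
)

client = TrafficManagerManagementClient(DefaultAzureCredential(), "<subscription-id>")

# One FQDN (vpn-contoso.trafficmanager.net) fronting both regional P2S
# gateways; Performance routing answers with the lowest-latency endpoint.
client.profiles.create_or_update(
    "rg-wan", "vpn-contoso",
    Profile(
        location="global",
        traffic_routing_method="Performance",
        dns_config=DnsConfig(relative_name="vpn-contoso", ttl=30),
        monitor_config=MonitorConfig(protocol="TCP", port=443),
        endpoints=[
            Endpoint(
                name="p2s-westeurope",
                type="Microsoft.Network/trafficManagerProfiles/externalEndpoints",
                target="<westeurope-p2s-gateway-fqdn>",
                endpoint_location="westeurope",
            ),
            Endpoint(
                name="p2s-eastus",
                type="Microsoft.Network/trafficManagerProfiles/externalEndpoints",
                target="<eastus-p2s-gateway-fqdn>",
                endpoint_location="eastus",
            ),
        ],
    ),
)
```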

Simplicity

So much is done for you with Azure Virtual WAN. If you like to click in the Azure Portal, it’s a pretty simple setup to get things going, although security engineering looks to have a steep learning curve with Custom Routing. By default, everything is connected to everything; that’s what a network should do. You shouldn’t have to figure out how to route from A to B. I believe that Azure Virtual WAN will offer a superior connectivity solution, even for a single-location organisation. That’s why I’ve been spending time figuring this tech out over the last few weeks.

Connecting Azure Hub-And-Spoke Architectures Together

In this post, I will explain how you can connect multiple Azure hub-and-spoke (virtual data centre) deployments together using Azure networking, even across different Azure regions.

If you are using the Azure Virtual WAN Hub then some things will be different, and that scenario is not covered fully here – at the time of writing, the Azure Virtual WAN Hub has a preview feature for any-to-any routing.

The Scenario

In this case, there are two hub-and-spoke deployments:

  • Blue: Multiple virtual networks covered by the CIDR of 10.1.0.0/16
  • Green: Another set of multiple virtual networks covered by the CIDR of 10.2.0.0/16

I’m being strategic with the addressing of each hub-and-spoke deployment, ensuring that a single CIDR will include the hub and all spokes of a single deployment – this will come in handy when we look at User-Defined Routes.

Either of these hub-and-spoke deployments could be in the same Azure region as the other, or they could be in different Azure regions. The desired routing is:

  • If any spoke wishes to talk to another spoke, it will route through the local firewall in the local hub.
  • All traffic coming into a spoke from an outside source, such as the other hub-and-spoke, must route through the local firewall in the local hub.

That would mean that Spoke 1 must route through Hub 1 and then Hub 2 to talk to Spoke 4. The firewall can be a third-party appliance or the Azure Firewall.

Core Routing

Each subnet in each spoke needs a route to the outside world (0.0.0.0/0) via the local firewall. For example:

  • The Blue firewall backend/private IP address is 10.1.0.132.
  • A Route Table for each subnet is created in the Blue deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.1.0.132.
  • The Green firewall backend/private IP address is 10.2.0.132.
  • A Route Table for each subnet is created in the Green deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.2.0.132.

Note: Some network-connected PaaS services, e.g. API Management or SQL Managed Instance, require additional routes to the “control plane” that will bypass the local firewall.
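
Here’s what one of those spoke Route Tables looks like as a sketch in the Python azure-mgmt-network SDK (hypothetical names; the firewall IP is the Blue one from above). Note that it also disables BGP route propagation, which will matter in the ExpressRoute section below.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route, RouteTable

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Route table for the Blue spoke subnets: everything leaves via the firewall.
rt = client.route_tables.begin_create_or_update(
    "rg-blue", "rt-blue-spokes",
    RouteTable(
        location="westeurope",
        # Keep BGP-learned routes out of the spokes so 0.0.0.0/0 always wins
        # (see the note in the ExpressRoute section below).
        disable_bgp_route_propagation=True,
        routes=[
            Route(
                name="everywhere-via-firewall",
                address_prefix="0.0.0.0/0",
                next_hop_type="VirtualAppliance",
                next_hop_ip_address="10.1.0.132",  # Blue firewall private IP
            )
        ],
    ),
).result()
# The route table is then associated with each spoke subnet, e.g. by setting
# subnet.route_table = SubResource(id=rt.id) in subnets.begin_create_or_update.
```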

Site-to-Site VPN

In this scenario, the organisation is connecting on-premises networks to one or more of the hub-and-spoke deployments with a site-to-site VPN connection. That connection goes to both the Blue hub and the Green hub.

To connect Blue and Green, you will need to configure VNet Peering, which can work inside a region or across regions (using Microsoft’s low-latency WAN, the second-largest private WAN on the planet). Each end of the peering needs the following settings (the names of the settings change over time, so I’m not quoting their exact wording; see the sketch after this list):

  • Enabled: Yes
  • Allow Transit: Yes
  • Use Remote Gateway: No
  • Allow Gateway Sharing: No
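
Here’s the sketch I promised – one side of the hub-to-hub peering in the Python SDK, with the settings above mapped to their (current) API names. IDs are placeholders, and the mirrored peering must be created from the Green side too.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SubResource, VirtualNetworkPeering

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The Blue side of the hub-to-hub peering.
client.virtual_network_peerings.begin_create_or_update(
    "rg-blue", "vnet-blue-hub", "peer-to-green-hub",
    VirtualNetworkPeering(
        remote_virtual_network=SubResource(id="<green-hub-vnet-resource-id>"),
        allow_virtual_network_access=True,  # Enabled: Yes
        allow_forwarded_traffic=True,       # Allow Transit: Yes
        allow_gateway_transit=False,        # Allow Gateway Sharing: No
        use_remote_gateways=False,          # Use Remote Gateway: No
    ),
).result()
```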

Let’s go back and do some routing theory!

That peering connection will add a hidden Default (“system”) route to each hub subnet:

  • Blue hub subnets: A route to 10.2.0.0/24
  • Green hub subnets: A route to 10.1.0.0/24

Now imagine you are a packet in Spoke 1 trying to get to Spoke 4. You’re sent to the firewall in the Blue hub. The firewall lets the traffic out (if a rule allows it), and now the packet sits in the egress/frontend/firewall subnet, trying to find a route to 10.2.2.0/24 (Spoke 4). The peering-created Default route covers 10.2.0.0/24, the Green hub, but not the subnet for Spoke 4. That means the default route to 0.0.0.0/0 (Internet) will be used and the packet is lost.

To fix this you will need to add a Route Table to the egress/frontend/firewall subnet in each hub:

  • Blue firewall subnet Route Table: 10.2.0.0/16 via virtual appliance 10.2.0.132
  • Green firewall subnet Route Table: 10.1.0.0/16 via virtual appliance 10.1.0.132

Thanks to my clever addressing of each hub-and-spoke, a single route will cover all packets leaving Blue and trying to get to any spoke in Green, and vice versa.
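
As a sketch, the Blue side of that looks like this (Python SDK, hypothetical names); the Green side mirrors it with 10.1.0.0/16 via 10.1.0.132.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route, RouteTable

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# One summary route on the Blue firewall subnet covers every Green subnet,
# because all of Green fits inside 10.2.0.0/16.
client.route_tables.begin_create_or_update(
    "rg-blue", "rt-blue-firewall-subnet",
    RouteTable(
        location="westeurope",
        routes=[
            Route(
                name="green-via-green-firewall",
                address_prefix="10.2.0.0/16",
                next_hop_type="VirtualAppliance",
                next_hop_ip_address="10.2.0.132",  # Green firewall private IP
            )
        ],
    ),
).result()
```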

ExpressRoute

Now the customer has decided to use ExpressRoute to connect to Azure – sweet! But guess what – you don’t need one expensive circuit to each hub-and-spoke.

You can share a single circuit across multiple ExpressRoute gateways:

  • ExpressRoute Standard: Up to 10 simultaneous connections to Virtual Network Gateways in 1+ regions in the same geopolitical region.
  • ExpressRoute Premium: Up to 100 simultaneous connections to Virtual Network Gateways in 1+ regions in any geopolitical region.

FYI, ExpressRoute connections to the Azure Virtual WAN Hub must be of the Premium SKU.
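
Connecting a second Virtual Network Gateway to the existing circuit is just another connection resource. A hedged sketch with the Python SDK, passing the resource IDs as placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Connect the Green hub's ExpressRoute gateway to the SAME circuit that Blue
# already uses - no second circuit required. IDs are placeholders.
client.virtual_network_gateway_connections.begin_create_or_update(
    "rg-green", "green-hub-er-connection",
    {
        "location": "northeurope",
        "connection_type": "ExpressRoute",
        "virtual_network_gateway1": {"id": "<green-hub-er-gateway-resource-id>"},
        "peer": {"id": "<existing-expressroute-circuit-resource-id>"},
    },
).result()
```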

ExpressRoute is powered by BGP. All the on-premises routes that are advertised propagate through the ISP to the Microsoft edge router (“meet-me”) in the edge data centre. For example, if I want an ExpressRoute circuit to Azure West Europe (Middenmeer, Netherlands – not Amsterdam) I will probably (not always) get a circuit to the POP or edge data centre in Amsterdam. That gets me a physical low-latency connection onto the Microsoft WAN – and my BGP routes get to the meet-me router in Amsterdam. Now I can route to locations on that WAN. If I connect a VNet Gateway to that circuit to Blue in Azure West Europe, then my BGP routes will propagate from the meet-me router to the GatewaySubnet in the Blue hub, and then on to my firewall subnet.

BGP propagation is disabled in the spoke Route Tables to ensure all outbound flows go through the local firewall.

But that is not the extent of things! The hub-and-spoke peering connections allow Gateway Sharing from the hub and Use Remote Gateway from the spoke. With that configuration, BGP routes to the spoke get propagated to the GatewaySubnet in the hub, then to the meet-me router, through the ISP and then to the on-premises network. This is what our solution is based on.

Let’s imagine that the Green deployment is in North Europe (Dublin, Ireland). I could get a second ExpressRoute circuit, but:

  • That would add cost.
  • That would not give me the clever solution that I want – though I could work around that with ExpressRoute Global Reach.

I’m going to keep this simple – by the way, if I wanted Green to be in a different geopolitical region such as East US 2 then I could use ExpressRoute Premium to make this work.

In the Green hub, the Virtual Network Gateway will connect to the existing ExpressRoute circuit – no more money to the ISP! That means Green will connect to the same meet-me router as Blue. The on-premises routes will get into Green the exact same way as with Blue. And the routes to the Green spokes will also propagate down to on-premises via the meet-me router. That meet-me router knows all about the subnets in Blue and Green. And guess what BGP routers do? They propagate – so, the routes to all of the Blue subnets propagate to Green and vice-versa with the next hop (after the Virtual Network Gateway) being the meet-me router. There are no Route Tables or peering required in the hubs – it just works!

Now the path from Blue Spoke 1 to Green Spoke 4 is Blue Hub Firewall, Blue Virtual Network Gateway, <the Microsoft WAN>, Microsoft (meet-me) Router, <the Microsoft WAN>, Green Virtual Network Gateway, Green Hub Firewall, Green Spoke 4.

There are ways to make this scenario more interesting. Let’s say I have an office in London and I want to use Microsoft Azure. Some stuff will reside in UK South for compliance or performance reasons. But UK South is not a “hero region”, as Microsoft calls them; there might be more advanced features that I want to use that are only in West Europe. I could use two ExpressRoute circuits, one to UK South and one to West Europe. Or I could set up a single circuit to London to get me onto the Microsoft WAN and connect this circuit to both of my deployments in UK South and West Europe. I get a quicker route going Office > ISP > London edge data centre > Azure West Europe than Office > ISP > Amsterdam edge data centre > Azure West Europe, because I have reduced the latency between me and West Europe by shortening the ISP circuit and using the more-direct Microsoft WAN. Just like with Azure Front Door, you want to get onto the Microsoft WAN as quickly as possible and let it get you to your destination as quickly as possible.

Private Connections to Azure PaaS Services

In this post, I’d like to explain a few options you have to get secure/private connections to Azure’s platform-as-a-service offerings.

ExpressRoute – Microsoft Peering

ExpressRoute comes in a few forms, but at a basic level, it’s a “WAN” connection to Azure virtual networks via one or more virtual network gateways; customers use this private peering to connect on-premises networks to Azure virtual networks over an SLA-protected private circuit. However, there is another form of peering that you can do over an ExpressRoute circuit: Microsoft peering. This is where you can use your private circuit to connect to Microsoft cloud services that are normally reached over the public Internet. What you get:

  • Private access to PaaS services from your on-premises networks.
  • Access to an entire service, such as Azure SQL.
  • A wide array of Azure and non-Azure Microsoft cloud services.

FYI, Office 365 is often mentioned here. In theory, you can access Office 365 over Microsoft peering/ExpressRoute. However, the Office 365 group must first grant you permission to do this – the last I checked, you had to have legal proof of a regulatory need for private access to cloud services.

Service Endpoint

Imagine that you are running some resources in Azure, such as virtual machines or an App Service Environment (ASE); these are virtual network-integrated services. Now consider that these services might need to connect to other services such as storage accounts, Azure SQL, or others. Normally, when a VNet-connected resource is communicating with, say, Azure SQL, the packets will be routed to “Internet” via the 0.0.0.0/0 default route for the subnet – “Internet” is everywhere outside the virtual network, not necessarily The Internet. The flow will hit the “public” Azure backbone and route to the Azure SQL compute cluster. There are two things to note about that flow:

  • It is indirect and introduces latency.
  • It traverses a shared network space.

A growing number of services, including storage accounts, Azure SQL, Cosmos DB, and Key Vault, have service endpoints available to them. You can enable a service endpoint anywhere in the route from the VM (or whatever) to “Internet”, and the packets will “drop” through the service endpoint to the required Azure service – just make sure that any firewall in the service accepts packets from the private subnet IP address of the source (VM or whatever). Now you have a more direct and more private connection to the platform service in Azure from your VNet (there’s a sketch after the list below). What you get:

  • Private access to PaaS services from your Azure virtual networks.
  • Access to an entire service, such as Azure SQL, but you can limit this to a region.
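
Enabling a service endpoint is a property of the subnet. A minimal sketch with the Python SDK (hypothetical names; note that updating a subnet this way replaces its settings, so the address prefix must be re-supplied):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import ServiceEndpointPropertiesFormat, Subnet

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Enable the Azure SQL service endpoint on a workload subnet. The same call
# against a firewall's frontend subnet is the basis of "Trick #1" below.
client.subnets.begin_create_or_update(
    "rg-blue", "vnet-spoke1", "workload",
    Subnet(
        address_prefix="10.1.1.0/24",  # re-supplied: this call replaces settings
        service_endpoints=[
            ServiceEndpointPropertiesFormat(
                service="Microsoft.Sql", locations=["westeurope"]
            )
        ],
    ),
).result()
```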

Service Endpoint Trick #1

Did you notice in the previous section on service endpoints that I said:

You can enable a service endpoint anywhere in the route from the VM (or whatever) to “Internet”

Imagine you have a complex network where not everyone enables service endpoints the way that they should. But you manage the firewall, the public IPs, and the routing. Well, my friend, you can force traffic to Azure platform services via service endpoints. If you have a firewall, then your routes to “Internet” should direct outbound traffic through the firewall. In the firewall (frontend) subnet, you can enable all the Azure service endpoints. Now when packets egress the firewall, they will “drop” through the service endpoints to the desired Azure platform service, without ever reaching “Internet”.

Service Endpoint Trick #2

You might know that I like Azure Firewall. Here’s a trick that the Azure networking teams shared with me – it’s similar to the above one but is for on-premises clients trying to access Azure platform services.

You’ve got a VPN connection to a complex virtual network architecture in Azure. And at the frontend of this architecture is Azure Firewall, sitting in the AzureFirewallSubnet; in this subnet, you have enabled all the available service endpoints. Let’s say that someone wants to connect to Azure SQL using Power BI on their on-premises desktop. Normally that traffic will go over the Internet. What you can do is configure name resolution on your network (or PC) for the database to point at the private IP address of the Azure Firewall. Now Power BI will forward traffic to Azure Firewall, which will relay the traffic to Azure SQL via the service endpoint. What you get:

  • Private access to PaaS services from your on-premises or Azure networks.
  • Access to individual instances of a service, such as an Azure SQL server.
  • A growing number of Azure-only services that support service endpoints.

Private Link

In this post, I’m focusing on only one of the three current scenarios for Private Link, which is currently in unsupported preview in limited US regions only, and for limited platform services – in other words, it’s early days.

This approach aims to give a similar solution to the above “Service Endpoint Trick #2”, without the use of trickery. You can connect an instance of an Azure platform service to a virtual network using Private Link. That instance will now have a private IP address on the VNet subnet, making it fully routable on your virtual network. The private link gets a globally unique record in the Microsoft-managed privatelink.database.windows.net DNS zone. For example, your Azure SQL Server would now be resolvable to the private IP address of the private link as yourazuresqlsvr.privatelink.database.windows.net. Now your clients, be they in Azure or on-premises, can connect to this DNS name/IP address to connect to this Azure SQL instance (see the sketch after the list below). What you get:

  • Private access to PaaS services from your on-premises or Azure networks.
  • Access to individual instances of a service, such as an Azure SQL server.
  • (PREVIEW LIMITATIONS) A limited number of platform services in limited US-only regions.
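
For completeness, here’s a hedged sketch of creating a Private Link private endpoint for an Azure SQL server with the Python SDK (hypothetical IDs; given the preview status described above, the exact API surface may change):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    PrivateEndpoint, PrivateLinkServiceConnection, Subnet,
)

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Give an Azure SQL logical server a private IP address on a subnet.
client.private_endpoints.begin_create_or_update(
    "rg-blue", "pe-yourazuresqlsvr",
    PrivateEndpoint(
        location="westeurope",
        subnet=Subnet(id="<subnet-resource-id>"),
        private_link_service_connections=[
            PrivateLinkServiceConnection(
                name="sql",
                private_link_service_id="<azure-sql-server-resource-id>",
                group_ids=["sqlServer"],  # the sub-resource being exposed
            )
        ],
    ),
).result()
# Pair this with DNS so that yourazuresqlsvr.privatelink.database.windows.net
# resolves to the private endpoint's IP address.
```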