Azure Virtual WAN – Connectivity

In this post, I’ll explain how Azure Virtual WAN offers its core service: connections.

SD-WAN

Some of you might be thinking – this is just for large corporations and I’m outta here. Don’t run just yet. Azure Virtual WAN is a rethinking of how to:

  • Connect users to Azure services and on-premises at the same time
  • Connect sites to Azure and (optionally) other sites
  • Replace the legacy hardware-defined WAN
  • Connect Azure virtual networks together.

That first point is quite timely – connecting users to services. Work-from-home (WFH) has forced enterprises to find ways to connect users to services no matter where they are. That connectivity was often limited to a privileged few. The pandemic forced organisations, small and large, to rethink connectivity for productivity and to scale out. Before COVID-19 struck, I was starting to encounter businesses that were considering (and some were even starting) replacing their legacy MPLS WAN with a software-defined WAN (SD-WAN), where media of different types, suitable to different kinds of sites/users/services, are aggregated via appliances. This SD-WAN is lower cost, more flexible, and, by leveraging local connectivity, enables smaller locations, such as offices or retail outlets, to have an affordable direct connection to the cloud for better performance. How the on-premises part of the SD-WAN is managed is completely up to you; some will take direct control and some will outsource it to a network service provider.

Connections

Azure Virtual WAN is all about connections. When you start to read about the new Custom Routing model in Azure Virtual WAN, you’ll see how route tables are associated with connections. In summary, a connection is a link between an on-premises location (referred to as a branch, even if it’s HQ) or a spoke virtual network with a Hub. And now we need to talk about some Azure resources.

Azure Resources

I’ve provided lots more depth on this topic elsewhere so I will keep this to the basics. There are two core resources in Azure Virtual WAN:

  • A Virtual WAN
  • A Hub

The Virtual WAN is a logical resource that provides a global service, although it is actually located in one Azure region. Any hubs that are connected to this Virtual WAN resource can talk to each other (automatically), route their connections to another hub’s connections, and share resources.

A Virtual WAN Hub is similar to a hub in an Azure hub & spoke architecture. It is a central routing point (with a hidden virtual router) that is the meeting point for any connections to that hub. An Azure region can have 1 hub in your tenant. That means I can have 1 Hub in West Europe and 1 Hub in East US. The Hubs must be connected to an Azure WAN resource; if they share a WAN resource then their connections can talk to each other. I might have all my branches in Europe connect to the Hub in West Europe, and I will connect all my spoke virtual networks in West Europe to the Hub in West Europe too; this means that by default (and I can control this):

  • The virtual networks can route to each other
  • The virtual networks can route to the branches
  • The branches can route to the virtual networks
  • The branches can route to other branches

We can extend this routing by connecting my branches in North America to the East US Hub and the spoke virtual networks in East US to the East US Hub. Yes; all those North American locations can route to each other. Because the Hubs are connected to a common Virtual WAN, the routing now extends across the Microsoft WAN. That means a retail outlet in the farthest reaches of northwest rural Ireland can connect to services hosted in East US, via a connection to the Hub in West Europe, and then hopping across the Atlantic Ocean using Microsoft’s low-latency WAN. Nice, right? Even better – it routes just like that automatically if you are using SD-WAN appliances in the branches.
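To make the any-to-any idea concrete, here is a minimal sketch in Python. It is a toy model, not how the platform implements routing: the hub and connection names are made up, and it assumes the default behaviour described above, where every connection on a hub can reach every other connection on any hub in the same Virtual WAN.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VirtualWan:
    """Toy model: hub name -> names of the connections (branches, spokes) on it."""

    hubs: dict[str, set[str]] = field(default_factory=dict)

    def connect(self, hub: str, endpoint: str) -> None:
        # Attach a branch or spoke virtual network to a hub.
        self.hubs.setdefault(hub, set()).add(endpoint)

    def hub_of(self, endpoint: str) -> str | None:
        for hub, endpoints in self.hubs.items():
            if endpoint in endpoints:
                return hub
        return None

    def path(self, source: str, destination: str) -> list[str] | None:
        # Same hub: one hop through the hub. Different hubs: hub-to-hub hop.
        src_hub, dst_hub = self.hub_of(source), self.hub_of(destination)
        if src_hub is None or dst_hub is None:
            return None
        if src_hub == dst_hub:
            return [source, src_hub, destination]
        return [source, src_hub, dst_hub, destination]


# Hypothetical names, used only for illustration.
wan = VirtualWan()
wan.connect("hub-westeurope", "branch-rural-ireland")
wan.connect("hub-westeurope", "vnet-westeurope-app")
wan.connect("hub-eastus", "vnet-eastus-app")
print(wan.path("branch-rural-ireland", "vnet-eastus-app"))
```

The retail outlet reaches the East US spoke through its local hub and then the hub-to-hub hop, which is the path described in the paragraph above.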

Note that a managed WAN might wire up that retail outlet differently, but still provide a fairly low-latency connection to the local Hub.

Branch Connections

If you have done any Azure networking then you are probably familiar with:

  • Site-to-site VPN: Connecting a location with a cost-effective but no-SLA VPN tunnel to Azure.
  • ExpressRoute: A circuit rented from an ISP for low-latency, high bandwidth, and an SLA-supported private connection to Azure
  • Point-to-Site VPN: Enabling end-users to create a private VPN tunnel to Azure from their devices while on the move or working from home

Each of the above is enabled in Azure using a Virtual Network Gateway, each running independently. Routing from branch to branch is not an intended purpose. Routing from user to the branch is not an intended purpose. The Virtual Network Gateway’s job is to connect a user to Azure.

The Azure Virtual WAN Hub supports gateways – as hidden resources that must be enabled and configured. All three of the above media types are supported as 3 different types of gateway, sized based on a billing concept called scale units – more scale units means more bandwidth and more cost, with a maximum hub throughput of 40 Gbps (including traffic to/from/between spokes).

Note that a Secured Virtual Hub, featuring the Azure Firewall, has a limit of 30 Gbps if all traffic is routed through that firewall.
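As a rough planning aid, the two limits quoted above can be turned into a quick headroom check. This is only a sketch: the 40 Gbps and 30 Gbps figures come from the text, the function name is hypothetical, and the planned flows are made-up numbers that you would replace with your own estimates.

```python
HUB_LIMIT_GBPS = 40.0          # hub throughput limit quoted in the text
SECURED_HUB_LIMIT_GBPS = 30.0  # limit when all traffic goes through the firewall


def headroom_gbps(planned_flows_gbps, secured_hub=False):
    """Return remaining capacity in Gbps (negative means over the limit)."""
    limit = SECURED_HUB_LIMIT_GBPS if secured_hub else HUB_LIMIT_GBPS
    return limit - sum(planned_flows_gbps)


# Hypothetical estimates: branches, P2S users, spoke-to-spoke traffic.
planned = [10, 5, 20]
print(headroom_gbps(planned))
print(headroom_gbps(planned, secured_hub=True))
```

A negative result for the secured case is a hint that routing everything through the firewall would not fit in a single hub.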

You can be flexible with the branch connections. Some locations might be small and have a VPN connection to the Hub. Other locations might require an SLA and use ExpressRoute. Some might require low latency or greater bandwidth and use higher SKUs of ExpressRoute. And of course, some users will be on the move or at home and use P2S VPN. A combination of all 3 connection types can be used at once, providing each location and user the connections and costs that suit them best.

ExpressRoute

You will be using ExpressRoute Standard for Azure Virtual WAN; this is a requirement. I don’t think there’s really too much more to say here – the tech just works once the circuit is up, and a combination of Global Reach and the any-to-any connections/routing of Azure WAN means that things will just work.

Site-to-Site VPN

The VPN gateway is deployed in an active/active cluster configuration with two public IP addresses. A branch using VPN for connectivity can have:

  • A single VPN connection over a single ISP connection.
  • Resilient VPN connections over two ISP connections, ideally with different physical providers or even media types.

An on-premises SD-WAN appliance is strongly recommended for Azure Virtual WAN, but you can use any VPN appliance that is supported for route-based VPN by Microsoft Azure; if you are doing the latter you can use BGP or the Azure WAN alternative to Local Network Gateway-provided prefixes for routing to on-premises.

Point-to-Site (P2S) VPN

The P2S gateway offers a superior service to what you might have observed with the traditional Virtual Network Gateway for VPN. Connectivity from the user device is to a hub with a routing appliance. Any-to-any connectivity treats the user device as a branch, albeit in a dedicated network address space. Once the user has connected the VPN tunnel, they can route to (by default):

  • Any spoke virtual network connected to the Hub
  • Any spoke virtual network connected to another Hub on the same Virtual WAN
  • Any branch office connected to any Hub on the Virtual WAN

In summary, the user is connected to the WAN as a result of being connected to the Hub and is subject to the routing and firewall configurations of that Hub. That’s a pretty nice WFH connectivity solution.

Note that you have support for certificate and RADIUS authentication in P2S VPN, as well as the OpenVPN and Microsoft client.

The Connectivity Experience

Imagine we’re back in normal times again with common business travel. A user in Amsterdam could sit down at their desk in the office and connect to services in West Europe via VPN. They could travel to a small office in Luxembourg and connect to the same services via VPN with no discernible difference. That user could travel to a conference in London and use P2S VPN from their hotel room to connect via the Amsterdam Hub. Now that user might get a jet to Philadelphia, and use their mobile hotspot to offer connectivity to the Azure Virtual WAN Hub in East US via P2S VPN – and the experience is no different!

One concept I would like to try out and get a support statement on is to abstract the IP addresses and locations of the P2S gateways using Azure Traffic Manager so the user only needs to VPN to a single FQDN and is directed (using the performance profile) to the closest (latency) Hub in the Virtual WAN with a P2S gateway.
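The selection logic behind that idea is simple enough to sketch. The snippet below is not Traffic Manager itself, which uses its own latency data, and the gateway names and latency figures are hypothetical; it only illustrates "send the user to the lowest-latency hub that has a P2S gateway".

```python
from __future__ import annotations


def closest_hub(latency_ms: dict[str, float]) -> str:
    """Pick the P2S gateway endpoint with the lowest measured latency."""
    if not latency_ms:
        raise ValueError("no P2S gateways to choose from")
    return min(latency_ms, key=latency_ms.get)


# Hypothetical endpoints and latencies as seen from a hotel in Philadelphia.
measured = {
    "p2s-westeurope.example.com": 95.0,
    "p2s-eastus.example.com": 18.0,
}
print(closest_hub(measured))
```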

Simplicity

So much is done for you with Azure Virtual WAN. If you like to click in the Azure Portal, it’s a pretty simple set up to get things going, although security engineering looks to have a steep learning curve with Custom Routing. By default, everything is connected to everything; that’s what a network should do. You shouldn’t have to figure out how to route from A to B. I believe that Azure WAN will offer a superior connectivity solution, even for a single location organisation. That’s why I’ve been spending time figuring this tech out over the last few weeks.

The Office – Paint & Electrics Complete

The installation and painting of the Cloud Mechanix global HQ have completed. On a damp day, the painters arrived and started work – sealing the knots in the wood and applying the first coat. We decided to go with a different set of colours to the house; there were two reasons for the decision:

  • To separate work from home.
  • To “hide” the office in the corner and in the trees surrounding that corner.

So dark green was used on the outside walls with white for the trims. The interior walls and ceilings are also painted white to give the place a bright feel. The floors have not been painted, but more on that in a moment. The electricians arrived on the first day of painting and installed the electrical components. They didn’t have the ethernet cabling so they had to return a couple of days later to do that work. The painting took 2 full days with 2 painters – that’s without sanding and painting/staining the floor like many would opt to do.

And that brings us to the floor. We decided to install AC4 (hard wearing) laminate flooring – if you don’t know what that is, it’s artificial wood flooring that has the pattern, colour and texture of finished wood without the maintenance. We found a style we like, moor acacia, that is dark to contrast with the interior white, and has a grainy finish like oak. It will pop nicely with the white furnishings. We went to two local suppliers. Both had identical pricing on the supply and similar pricing on the installation. However, one has a month of work queued up and the other has 2 months. We tried an independent installer where we would supply the materials but the price was too high with a non-guaranteed schedule. So we’ll be waiting 1 month to get the floor installed – no, that’s not work for me – trust me, I attempted it once before in a small room that took way too long and didn’t look good after.

I installed the Wi-Fi to the fixed ethernet connection in the office using a spare Linksys mesh unit that I had. It tested well, similar to the same tests in the house, closer to the broadband modem. No; I did not opt to install RJ45 sockets all around the office. Wi-Fi will be reliable (antenna in the office), flexible, and not take away from the appearance of the interior. Not to mention that many things expect Wi-Fi connections these days anyway!

I also shopped online for furniture. I know what I want – most of it is from Ikea – but the best delivery I could get from them was 5-6 weeks!!! So the whole floor thing won’t impact us anyway. The only real decision I have not made is what to do with the office chairs. I know many of you will recommend Secret Labs but their delivery times are widely mocked.

As you can see in the photo, there are a few other works to be done:

  • The swing set is being scrapped. A much nicer wooden set is coming next week and a local handyman will take the scrap away to recycle it.
  • I had some paving stones that I had saved from the site that the office is installed on. I will be power washing them.
  • The site of the old swing set left some damage to the lawn – the anchors and foot damage. I have filled the holes and I will be sowing new grass.

Some of the garden work will happen this week because I’ve taken some time off from work. Other than that, it will be a while before there are any updates on this project. There is one other thing to share:

The above slate sign arrived in the post today. It was a present from my wife to hang on/in the office. It’s written in Norwegian (I work for Innofactor Norway) and explicitly refers to the 3 x tree stumps that had to be dug up by hand to clear space for the office.

Azure Virtual WAN ARM – The Resources

In this post, I will explain the types of resources used in Azure Virtual WAN and the nature of their relationships.

Note, I have not included any content on the recently announced preview of third-party NVAs. I have not seen any materials on this yet to base such a post on and, being honest, I don’t have any use-cases for third-party NVAs.

As you can see – there are quite a few resources involved … and some that you won’t see listed at all because of the “appliance-like” nature of the deployment. I have not included any detail on spokes or “branch offices”, which would require further resources. The below diagram is enough to get a hub operational and connected to on-premises locations and spoke virtual networks.

The Virtual WAN – Microsoft.Network/virtualWans

You need at least one Virtual WAN to be deployed. This is what the hub will connect to, and you can connect many hubs to a common Virtual WAN to get automated any-to-any connectivity across the Microsoft physical WAN.

Surprisingly, the resource is deployed to an Azure region and not as a global resource, unlike other global resources such as Traffic Manager or Azure DNS.

The Virtual Hub – Microsoft.Network/virtualHubs

Also known as the hub, the Virtual Hub is deployed once, and once only, per Azure region where you need a hub. This hub replaces the old hub virtual network (plus gateway(s), plus firewall, plus route tables) deployment you might be used to. The hub is deployed as a hidden resource, managed through the Virtual WAN in the Azure Portal or via scripting/ARM.

The hub is associated with the Virtual WAN through a virtualWAN property that references the resource ID of the virtualWans resource.

In a previous post, I referred to a chicken & egg scenario with the virtualHubs resource. The hub has properties that point to the resource IDs of each deployed gateway:

  • vpnGateway: For site-to-site VPN.
  • expressRouteGateway: For ExpressRoute circuit connectivity.
  • p2sVpnGateway: For end-user/device tunnels.

If you choose to deploy a “Secured Virtual Hub” there will also be a property called azureFirewall that will point to the resource ID of an Azure Firewall with the AZFW_Hub SKU.

Note, the restriction of 1 hub per Azure region does introduce a bottleneck. Under the covers of the platform, there is actually a virtual network. The only clue to this network will be in the peering properties of your spoke virtual networks. A single virtual network can have, today, a maximum of 500 spokes. So that means you will have a maximum of 500 spokes per Azure region.

Routing Tables – Microsoft.Network/virtualHubs/hubRouteTables & Microsoft.Network/virtualHubs/routeTables

These resources are used in custom routing, a feature that was recently announced as GA but won’t be live until August 3rd, according to the Azure Portal. The resources control the flows of traffic in your hub and spoke architecture. They are child resources of the virtualHubs resource, so no references to hub resource IDs are required.

Azure Firewall – Microsoft.Network/azureFirewalls

This is an optional resource that is deployed when you want a “Secured Virtual Hub”. Today, this is the only way to put a firewall into the hub, although a new preview program should make it possible for third-parties to join the hub. Alternatively, you can use custom routing to force north-south and east-west traffic through an NVA that is running in a spoke, although that will double peering costs.

The Azure Firewall is deployed with the AZFW_Hub SKU. The firewall is not a hidden resource. To manage the firewall, you must use an Azure Firewall Policy (aka Azure Firewall Manager). The firewall has a property called firewallPolicy that points to the resource ID of a firewallPolicies resource.

Azure Firewall Policy – Microsoft.Network/firewallPolicies

This is a resource that allows you to manage an Azure Firewall, in this case, an AZFW_Hub SKU of Azure Firewall. Although not shown here, you can deploy a parent/child configuration of policies to manage firewall configurations and rules in a global/local way.

VPN Gateway – Microsoft.Network/vpnGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides you with site-to-site connectivity using VPN. The VPN Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

Note that the virtualHubs resource must also point at the resource ID of the VPN gateway resource ID using a property called vpnGateway.

ExpressRoute Gateway – Microsoft.Network/expressRouteGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides you with site-to-site connectivity using ExpressRoute. The ExpressRoute Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

Note that the virtualHubs resource must also point at the resource ID of the ExpressRoute gateway using a property called expressRouteGateway.

Point-to-Site Gateway – Microsoft.Network/p2sVpnGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides users/devices with connectivity using VPN tunnels. The Point-to-Site Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

The Point-to-Site Gateway inherits a VPN configuration from a VPN configuration resource based on Microsoft.Network/vpnServerConfigurations, referring to the configuration resource by its resource ID using a property called vpnServerConfiguration.

Note that the virtualHubs resource must also point at the resource ID of the Point-to-Site gateway resource ID using a property called p2sVpnGateway.
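The two-way references between the hub and its gateways are easy to get wrong, so a small consistency check can help when reviewing exported templates. This is a sketch under assumptions: the resources are plain dictionaries shaped like the ARM resources described above, the resource IDs are placeholders, and the p2SVpnGateway casing follows the template snippet shown later in this document.

```python
# Which hub property is expected to point back at each gateway type.
GATEWAY_PROPERTY = {
    "Microsoft.Network/vpnGateways": "vpnGateway",
    "Microsoft.Network/expressRouteGateways": "expressRouteGateway",
    "Microsoft.Network/p2sVpnGateways": "p2SVpnGateway",
}


def reference_problems(hub, gateways):
    """List every missing or mismatched reference between hub and gateways."""
    problems = []
    for gateway in gateways:
        hub_property = GATEWAY_PROPERTY[gateway["type"]]
        # Gateway -> hub reference. Resource IDs are compared case-insensitively.
        if gateway["properties"]["virtualHub"]["id"].lower() != hub["id"].lower():
            problems.append(f"{gateway['name']}: virtualHub does not point at the hub")
        # Hub -> gateway reference.
        hub_reference = hub["properties"].get(hub_property) or {}
        if hub_reference.get("id", "").lower() != gateway["id"].lower():
            problems.append(f"hub: {hub_property} does not point at {gateway['name']}")
    return problems


# Placeholder IDs for illustration only.
hub = {"id": "/hubs/hub1", "properties": {"vpnGateway": {"id": "/gateways/vpn1"}}}
vpn = {
    "id": "/gateways/vpn1",
    "name": "vpn1",
    "type": "Microsoft.Network/vpnGateways",
    "properties": {"virtualHub": {"id": "/hubs/hub1"}},
}
print(reference_problems(hub, [vpn]))
```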

VPN Server Configuration – Microsoft.Network/vpnServerConfigurations

This configuration for Point-to-Site VPN gateways can be seen in the Azure WAN and is intended as a shared configuration that is reusable with more than one Point-to-Site VPN Gateway. To be honest, I can see myself using it as a per-region configuration because of some values like DNS servers and RADIUS servers that will probably be placed per-region for performance and resilience reasons. This is a hidden resource.

The following resources were added on 22nd July 2020:

VPN Sites – Microsoft.Network/vpnSites

This resource has a similar purpose to a Local Network Gateway for site-to-site VPN connections; it describes the on-premises location, AKA “branch office”.  A VPN site can be associated with one or many hubs, so it is actually connected to the Virtual WAN resource ID using a property called virtualWan. This is a hidden resource.

An array property called vpnSiteLinks describes possible connections to on-premises firewall devices.

VPN Connections – Microsoft.Network/vpnGateways/vpnConnections

A VPN Connections resource associates a VPN Gateway with the on-premises location that is described by an associated VPN Site. The vpnConnections resource is a child resource of vpnGateways, so there is no actual resource; the vpnConnections resource takes its name from the parent VPN Gateway, and the resource ID is an extension of the parent VPN Gateway resource ID.
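The "resource ID is an extension of the parent" rule can be sketched in a couple of lines. The helper name is hypothetical and the subscription, resource group, and resource names are placeholders.

```python
def child_resource_id(parent_id, child_type, child_name):
    """Append a child type segment and name to the parent's resource ID."""
    return f"{parent_id.rstrip('/')}/{child_type}/{child_name}"


# Placeholder parent ID for a VPN Gateway.
gateway_id = (
    "/subscriptions/<subscription ID>/resourceGroups/<resource group>"
    "/providers/Microsoft.Network/vpnGateways/<gateway name>"
)
print(child_resource_id(gateway_id, "vpnConnections", "<connection name>"))
```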

By necessity, there is some complexity with this resource type. The remoteVpnSite property links the vpnConnections resource with the resource ID of a VPN Site resource. An array property, called vpnSiteLinkConnections, is used to connect the gateway to the on-premises location using 1 or 2 connections, each linking from vpnSiteLinkConnections to the resource/property ID of 1 or 2 vpnSiteLinks properties in the VPN Site. With one site link connection, you have a single VPN tunnel to the on-premises location. With 2 link connections, the VPN Gateway will take advantage of its active/active configuration to set up resilient tunnels to the on-premises location.

Virtual Network Connections – Microsoft.Network/virtualHubs/hubVirtualNetworkConnections

The purpose of a hub is to share resources with spoke virtual networks. In the case of the Virtual Hub, those resources are gateways, and maybe a firewall in the case of Secured Virtual Hub. As with a normal VNet-based hub & spoke, VNet peering is used. However, the way that VNet peering is used changes with the Virtual Hub; the deployment is done using the hubVirtualNetworkConnections child resource, whose parent is the Virtual Hub. Therefore, the name and resource ID are based on the name and resource ID of the Virtual Hub resource.

The deployment is rather simple; you create a Virtual Network Connection in the hub specifying the resource ID of the spoke virtual network, using a property called remoteVirtualNetwork. The underlying resource provider will initiate both sides of the peering connection on your behalf – there is no deployment required in the spoke virtual network resource. The Virtual Network Connection will reference the Hub Route Tables in the hub to configure route association and propagation.
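Here is a minimal sketch of what such a connection could look like when generated from Python. It only covers the remoteVirtualNetwork property named above; the route table association and propagation settings are left out, the API version is supplied by the caller, and the hub, connection, and spoke names are placeholders.

```python
import json


def hub_vnet_connection(hub_name, connection_name, spoke_vnet_id, api_version):
    """Build a hubVirtualNetworkConnections child resource as a dictionary."""
    return {
        "type": "Microsoft.Network/virtualHubs/hubVirtualNetworkConnections",
        "apiVersion": api_version,
        # A child resource declared at the top level of a template is named
        # "<parent name>/<child name>".
        "name": f"{hub_name}/{connection_name}",
        "properties": {
            "remoteVirtualNetwork": {"id": spoke_vnet_id},
        },
    }


connection = hub_vnet_connection(
    hub_name="<hub name>",
    connection_name="<connection name>",
    spoke_vnet_id="<<<<resource ID of the spoke virtual network>>>>",
    api_version="<API version>",
)
print(json.dumps(connection, indent=2))
```

Nothing is declared on the spoke side, which matches the point above that the resource provider sets up both ends of the peering.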

More Resources

There are more resources that I’ve yet to document, including:

Azure Virtual WAN ARM – The Chicken & Egg Gateway ID Discombobulation

This post will explain how to deal with the gateway ID properties in the Azure Microsoft.Network/virtualhubs resource when using ARM templates.

Background

The Azure WAN Hub is capable of having 3 gateway sub-resources:

  • Point-to-site VPN: Microsoft.Network/p2sVpnGateways
  • VPN (site-to-site): Microsoft.Network/vpnGateways
  • ExpressRoute: Microsoft.Network/expressRouteGateways, which does not support diagnostic settings in the 2020-04-01 API

As you would expect, when you create these resources, you have to supply them with the resource ID of the Microsoft.Network/virtualhubs resource:

"virtualHub": {
  "id": "<<<<resource ID of the virtual hub>>>>"
},

What is a surprise is what happens in the Microsoft.Network/virtualhubs resource. After a gateway is associated, a property (type object, presumably for future-proofing) for the associated gateway type is added to the hub:

"vpnGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/vpnGateways resource>>>>"
},
"expressRouteGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/expressRouteGateways resource>>>>"
},
"p2SVpnGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/p2sVpnGateways resource>>>>"
},

The surprising part is how these properties behave across repeated deployments.

The Problem

There are 3 possible states in the hub when it comes to each gateway:

  1. The hub exists without a gateway: The above hub properties are not required.
  2. The gateways are being added: The above hub properties cannot be added because the gateway resource ID points to a resource that does not exist yet – the hub must exist and be configured before the gateway(s).
  3. The gateways exist: Any re-run of the ARM template (which might be common to update the hub route tables or configuration via DevOps) must include the above gateway properties in the hub resource with the correct resource IDs for the gateways.

And states 2 and 3 are where the chicken and egg are in an ARM template. You must supply the gateway resource ID in the hub for all updates to the hub after a gateway is deployed, and you must not include the gateway resource ID in the hub when deploying the gateway. This would be easy to deal with if ARM would (finally) give us an “ifexists()” function but there is no sign of that. So we need a hack solution.

The Hack Solution

This one comes from the Well-Architected Framework/Cloud Adoption Framework, Enterprise-Scale Architecture. This way-too-complicated beastie shows how Microsoft’s people are dealing with the issue. The JSON for the Microsoft.Network/virtualhubs template contains these properties:

"properties": {
  "virtualWan": {
    "id": "[variables('vwanresourceid')]"
  },
  "addressPrefix": "[parameters('vHUB').addressPrefix]",
  "vpnGateway": "[if(not(empty(parameters('vHUB').vpnGateway)),parameters('vHUB').vpnGateway, json('null'))]"
}

The key for dealing with vpnGateway is the vHUB parameter, an object that contains a value called vpnGateway.

When they first run the deployment, the value of vHUB.vpnGateway is set to {} or null in the parameters file, stored in GitHub. That means that when the hub is first deployed (and there is no VPN gateway), the if statement in the above snippet will pass json('null') to the vpnGateway property. That is acceptable to the resource provider and the hub will deploy cleanly. Later on in the deployment, the VPN gateway will be created.
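The logic of that if(not(empty(...))) expression can be mirrored in a few lines of Python, which may make it easier to reason about. This is an illustration only, not part of the Enterprise-Scale templates: the function names are hypothetical and the values are placeholders.

```python
import json


def hub_vpn_gateway_property(vhub):
    """Mirror of the template expression: pass the gateway reference through,
    or None when the parameter is empty ({} or null)."""
    gateway = vhub.get("vpnGateway")
    return gateway if gateway else None


def hub_properties(vwan_id, vhub):
    return {
        "virtualWan": {"id": vwan_id},
        "addressPrefix": vhub["addressPrefix"],
        "vpnGateway": hub_vpn_gateway_property(vhub),
    }


# First run: no gateway exists yet, so the parameter holds {}.
first_run = hub_properties(
    "<<<<resource ID of the virtual WAN>>>>",
    {"addressPrefix": "10.0.0.0/24", "vpnGateway": {}},
)
print(json.dumps(first_run["vpnGateway"]))  # null
```

Once the parameter holds an object with an id, the same function passes it through unchanged, which is the behaviour needed for every later run.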

If you were to just re-run the hub template now, you would get an error about not being allowed to change the vpnGateway property in the hub resource. Behind the scenes it has been updated by the VPN gateway deployment. Every execution of the hub template must now include the resource ID of the VPN Gateway – that sucks, right? Now the hack really kicks in.

After the first deployment of the hub (and the VPN Gateway), you must open the resource group in the Azure Portal, enable viewing hidden items, open the VPN Gateway resource, go to properties, and document the resource ID.

Now, you need to open the parameters file for the hub. Edit the vHUB.vpnGateway property and set it to:

"vpnGateway": { 
 "id": "<<<< Resource ID of Microsoft.Network/vpnGateways resource>>>>"
},

Now you can cleanly re-run the hub template.
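If you would rather not edit the parameters file by hand, the edit can be scripted. The sketch below assumes the usual ARM parameters file layout, where each parameter sits under "parameters" with a "value" key; the function name is hypothetical and the gateway ID is a placeholder you would copy from the portal.

```python
import json


def set_vpn_gateway_id(parameters_text, gateway_id):
    """Return the parameters file text with vHUB.vpnGateway set to the given ID."""
    document = json.loads(parameters_text)
    document["parameters"]["vHUB"]["value"]["vpnGateway"] = {"id": gateway_id}
    return json.dumps(document, indent=2)


# Example parameters file content from the first run (no gateway yet).
before = json.dumps(
    {"parameters": {"vHUB": {"value": {"addressPrefix": "10.0.0.0/24", "vpnGateway": {}}}}}
)
after = set_vpn_gateway_id(
    before, "<<<< Resource ID of Microsoft.Network/vpnGateways resource>>>>"
)
print(after)
```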

How Should It Work?

The best solution would be if the gateway ID properties were just documentation for Azure, properties that we humans cannot edit. But I suspect that the ability to configure these settings might have something to do with the newly announced NVA-in-hub preview. Otherwise, ARM needs to finally give us an ifexists() function – vote here now if you agree.

Azure Virtual WAN ARM – Secured Virtual Hub Azure Firewall

I have spent quite a few hours figuring out how to deploy Azure’s new Secured Virtual Hub, an extension of Azure Virtual WAN, deployed using ARM templates (JSON). A lot of the bits are either not documented or incorrectly documented. One of the frustrating bits to deploy was the Azure Firewall resource – and the online examples did not help.

The issue was that the 2 sources I could find did not include public IP addresses on the firewall:

  • The quick start for Secured Virtual Hub on docs.microsoft.com
  • The new Enterprise-Scale “well-architected” Framework, found in Cloud Adoption Framework

Digging to solve that uncovered:

  • The examples used quite an old API version, 2019-08-01, to deploy the Microsoft.Network/azureFirewalls resource.
  • There was no example of how to add a public IP address to the firewall in Secured Virtual Hub because it was not possible with that API – SVH is quite different from a VNet deployment because you do not have direct access to the underlying hub virtual network.
  • Being an old API, we lose features such as SNAT for non-RFC1918 addresses (important in universities and public sector) and the newer custom & proxy DNS features.

In my digging, I did uncover that the ARM reference for the Azure Firewall was incorrect, but I did uncover a new, barely-documented property called hubIPAddresses; I knew this property was the key to solving the public IP address issue. So I thought about what was going on and how I was going to solve it.

I ended up doing what I would normally do if I did not have a quick start template to start with:

  1. Deploy the resource(s) by hand in the Azure Portal
  2. Observe the options – there was a slide control for the quantity of firewall public IP addresses
  3. Export the resulting template

And … there was the solution:

  1. There is a new, undocumented API version for the Azure Firewall resource: 2020-05-01
  2. There is a new object property called hubIPAddresses that contains an object sub-property called publicIPs. You can set a string value called count to control how many public IP addresses that Azure will assign (on your behalf) to the firewall – you do not need to create the public IP address resources.
        "hubIPAddresses": {
          "publicIPs": {
            "count": "[parameters('firewallPublicIpQuantity')]"
          }
        }
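To show where that fragment sits in the wider resource, here is a hedged sketch that builds the firewall resource as a Python dictionary. Only the resource type, the 2020-05-01 API version, the AZFW_Hub SKU, the firewallPolicy property and hubIPAddresses come from the text; the location of the sku object, the "Standard" tier and the virtualHub reference are my assumptions and should be checked against a template exported from your own deployment.

```python
import json


def secured_hub_firewall(name, location, hub_id, policy_id, public_ip_count):
    """Build an azureFirewalls resource for a Secured Virtual Hub (sketch)."""
    return {
        "type": "Microsoft.Network/azureFirewalls",
        "apiVersion": "2020-05-01",
        "name": name,
        "location": location,
        "properties": {
            # Assumption: sku lives under properties, with a Standard tier.
            "sku": {"name": "AZFW_Hub", "tier": "Standard"},
            # Assumption: the firewall references its hub by resource ID.
            "virtualHub": {"id": hub_id},
            "firewallPolicy": {"id": policy_id},
            # Azure assigns this many public IP addresses on your behalf.
            "hubIPAddresses": {"publicIPs": {"count": public_ip_count}},
        },
    }


firewall = secured_hub_firewall(
    name="<firewall name>",
    location="<Azure region>",
    hub_id="<<<<resource ID of the virtual hub>>>>",
    policy_id="<<<<resource ID of the firewall policy>>>>",
    public_ip_count=2,
)
print(json.dumps(firewall, indent=2))
```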

Sorted!

Azure Virtual WAN Introducing A New Kind Of Route Table

In this post, I will quickly introduce you to a new kind of Route Table in Microsoft Azure that has been recently introduced by Azure Virtual WAN – and hence is included in the newly generally available Secured Virtual Hub.

The Old “Subnet” Route Table

This Route Table, which I will call “Subnet Route Table” (derived from the ARM name) is a simple resource that we associate with a subnet. It contains User-Defined Routes that force traffic to flow in desirable directions, typically when we use some kind of firewall appliance (Azure Firewall or third-party) or a third-party routing appliance. The route design is simple enough:

  • Name: A user-friendly name
  • Prefix: The CIDR you want to get to
  • Next Hop Type: What kind of “router” is the next hop, e.g. Virtual Network, Internet, or Virtual Appliance
  • Next Hop IP Address: Used when Next Hop Type is Virtual Appliance (any firewall or third-party router)
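The four fields above map naturally onto a small data structure. The sketch below is a model for reasoning about User-Defined Routes, not an Azure SDK type: the class and function names are hypothetical, "VirtualAppliance" and "Internet" are the ARM-style spellings of the next hop types, and the addresses are made up. Selection uses longest-prefix match, which is how the most specific route wins.

```python
from __future__ import annotations

import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    prefix: str                      # the CIDR you want to get to
    next_hop_type: str               # e.g. "VirtualAppliance" or "Internet"
    next_hop_ip: str | None = None   # only used with "VirtualAppliance"

    def __post_init__(self):
        ipaddress.ip_network(self.prefix)  # raises ValueError on a bad CIDR
        if self.next_hop_type == "VirtualAppliance" and not self.next_hop_ip:
            raise ValueError("VirtualAppliance routes need a next hop IP address")


def select_route(routes, destination):
    """Return the matching route with the longest prefix, or None."""
    address = ipaddress.ip_address(destination)
    matches = [r for r in routes if address in ipaddress.ip_network(r.prefix)]
    return max(
        matches,
        key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
        default=None,
    )


# Hypothetical routes: everything via a firewall, one range straight out.
routes = [
    Route("everything-via-firewall", "0.0.0.0/0", "VirtualAppliance", "10.0.1.4"),
    Route("spoke-via-firewall", "10.1.0.0/16", "VirtualAppliance", "10.0.1.4"),
    Route("direct-out", "203.0.113.0/24", "Internet"),
]
print(select_route(routes, "10.1.2.3").name)
```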

Azure Virtual WAN Hub

Microsoft introduced Azure Virtual WAN quite a while ago (by Cloud standards), but still few have heard of it, possibly because it was originally marketed as an SD-WAN solution that was compatible with just a few on-prem SD-WAN vendors (now a much bigger list). Today it supports IKEv1 and IKEv2 site-to-site VPN, point-to-site VPN, and ExpressRoute Standard (and higher). You might already be familiar with setting up a hub in a hub-and-spoke: you have to create the virtual network, the Route Table for inbound traffic, the firewall, etc. Azure Virtual WAN converts the hub into an appliance-like experience surfacing just two resources: the Virtual WAN (typically 1 global resource per organisation) and the hub (one per Azure region). Peering, routing, and connectivity are all simplified.

A more recent change has been the Secured Virtual Hub, where Azure Firewall is a part of the Virtual WAN Hub; this was announced at Ignite and has just gone GA. Choosing the Secured Virtual Hub option adds security to the Virtual WAN Hub. Don’t worry, though, if you prefer a third-party firewall; the new routing model in Azure Virtual WAN Hub allows you to deploy your firewall into a dedicated spoke virtual network and route your isolated traffic through there.

The New Route Tables

There are two new kinds of route table added by the Virtual WAN Hub, or Virtual Hub, both of which are created in the Virtual Hub as sub-resources.

  • Virtual WAN Hub Route Table
  • Virtual WAN Route Table

Virtual WAN Hub Route Table

The Virtual WAN Hub Route Table affects traffic from the Virtual Hub to other locations. A possible scenario is when you want to route traffic to a CIDR block of virtual network(s) through a third-party firewall (network virtual appliance/NVA).

The routing rule setup here is similar to the Subnet Route Table, specifying where you want to get to (CIDR, resource ID, or service), the next hop, and a next hop IP address.

Virtual WAN Route Table

The Virtual WAN Route Table is created as a sub-resource of the Virtual Hub but it has a different purpose. It is associated with connections and affects routing from the associated branch offices or virtual networks. Whoa, Finn! There is a lot of terminology in that sentence!

A connection is just that; it is a connection between the hub and another network. Each spoke connected directly to the hub has a connection to the hub – a Virtual WAN Route Table can be associated with each connection. A Virtual WAN Route Table can be associated with 1 virtual network connection, a subset of them, or all of them.

The term “branch offices” refers to sites connected by ExpressRoute, site-to-site VPN, or point-to-site VPN. Those sites also have connections that a Virtual WAN Route Table can be associated with.

This is a much more interesting form of route table. I haven’t had time to fully get under the covers here, but comparing ARM to the UI reveals two methodologies. The Azure Portal reveals one way of visualising routing that I must admit that I find difficult to scale in my mind. The ARM resource looks much more familiar to me, but until I get into a lab and fully test (which I hope I will find some hours to do soon), I cannot completely document.

Here are the basics of what I have gleaned from the documentation, which covers the Azure Portal method:

The linked documentation is heavy reading. I’m one of those people that needs to play with this stuff before writing too much in detail – I never trust the docs and, to be honest, this content is complicated, as you can see above.

Rethinking Firewall Management With Azure Firewall Manager

Microsoft has just announced the general availability of a feature that I’ve been waiting for since I first learned about it last Autumn, called Azure Firewall Manager. Azure Firewall Manager allows you to centrally manage one or more Azure Firewall instances through a central, policy-driven, user interface. And it’s those policies, Azure Firewall Policies, that made me re-think Azure Firewall management a few months ago when I was writing my Cloud Mechanix course (running next ONLINE on July 30th) “Securing Azure Services & Data Through Azure Networking”.

Azure Firewall Policy

This is a new resource type that is generally available today. Azure Firewall Policy outsources the configuration and management of the firewall to a policy resource; that means that the usual settings in the Azure Firewall for things like rules and Threat Intelligence move from the firewall resource to a policy when a policy is associated with the firewall.

Policies can be created in a hierarchy. You can create a parent/global policy that will contain configurations and rules that will apply to all/a number of firewall instances. Then you create a child policy that inherits from the parent; note that rules changes in the parent instantly appear in the child. The child is associated with a firewall and applies configurations/rules from the parent policy and the child policy instantly to the firewall.

Problem

I’ve deployed and configured virtual data centers (VDCs, which are governed & secured hub and spoke architectures) across multiple regions for multiple customers. Creating rule configurations to allow flows from a spoke/service in one region to another spoke/service in another region is a royal pain in the tushie. Here’s the network flow (as I documented with routing here):

  1. Source device
  2. Outbound NSG rules in source spoke
  3. Firewall in source hub
  4. Firewall in destination hub
  5. Inbound NSG rules at destination spoke
  6. Destination device

There are potentially 4 sets of rules to configure for a simple service running on a single protocol/port. Today I configured Microsoft Identity Management for this scenario and there were dozens of protocol/port combinations across three spokes. The work took hours to complete – which I did in code and it provided a working result for the identity consulting team.

I minimise the work by controlling outbound flows in the local hub firewall, not in the NSG. So the NSGs do not control outbound flows at all. I could allow all via the firewall, even to other private networks, but that goes against the idea of compartmentalisation or micro-segmentation to combat modern network threats – so I need to configure both firewalls for a flow.

Solution

Re-think the firewall for a moment. Imagine you had one virtual firewall that spanned all of your Azure regional deployments. You can control all global flows with one configuration in that global virtual firewall. The global virtual firewall has instances in each Azure region. Any local flows can be configured just in that instance. That’s what Firewall Policy allows.

  • Parent Policy: Place all your global configurations in here. Some configurations will be company-wide, such as Threat Intelligence. Some rules, like allowing access to Microsoft URIs or Azure services (service tags) will be global too. And this is where you put the rules to allow flows between one regional deployment and another. This global management takes all your local Azure Firewall resources and treats them as a single security service.
  • Child Policies: A child policy will be created for each Azure Firewall instance. This policy will inherit the above from the parent applying the global configuration. Local rules, to allow north-south access to/from local services (Internet or on-prem) or east-west (spoke-to-spoke in the same regional deployment) will be configured here. RBAC can be enabled to allow local network admins to do their own thing, but unable to undo what the parent has done.
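Here is a rough Python sketch of how I picture that parent/child inheritance. It is a toy model, not the Azure API: the class, the rule strings, and the ordering (parent rules listed before child rules) are all my assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FirewallPolicy:
    """Toy model of a policy that can inherit from a parent policy."""
    name: str
    rules: List[str] = field(default_factory=list)
    parent: Optional["FirewallPolicy"] = None

    def effective_rules(self) -> List[str]:
        # Inherited rules are looked up at call time, so a change to the
        # parent shows up in every child without touching the child.
        inherited = self.parent.effective_rules() if self.parent else []
        return inherited + self.rules


# Hypothetical rules: global flows live in the parent, local flows in each child.
parent_policy = FirewallPolicy("global", ["allow Blue -> Green tcp/443"])
west_europe = FirewallPolicy("westeurope", ["allow spoke1 -> spoke2 tcp/1433"], parent=parent_policy)
north_europe = FirewallPolicy("northeurope", ["allow spoke3 -> on-prem tcp/22"], parent=parent_policy)

print(west_europe.effective_rules())
print(north_europe.effective_rules())
```

The point of the model is that the cross-region rule is written once, in the parent, and both regional firewalls pick it up.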

I haven’t had a chance to test Azure Firewall Policy out yet since the GA announcement, but I’m hoping that the third tier in rules (Rules Groups) made it from preview to GA. I do have groupings of rules collections based on buckets of priorities. This organisation would be awesome in my vision of Azure Firewall management.

Connecting Azure Hub-And-Spoke Architectures Together

In this post, I will explain how you can connect multiple Azure hub-and-spoke (virtual data centre) deployments together using Azure networking, even across different Azure regions.

There is a lot to know here so here is some recommended reading that I previously published:

If you are using Azure Virtual WAN Hub then some stuff will be different and that scenario is not covered fully here – Azure Virtual WAN Hub has a preview (today) feature for Any-to-Any routing.

The Scenario

In this case, there are two hub-and-spoke deployments:

  • Blue: Multiple virtual networks covered by the CIDR of 10.1.0.0/16
  • Green: Another set of multiple virtual networks covered by the CIDR of 10.2.0.0/16

I’m being strategic with the addressing of each hub-and-spoke deployment, ensuring that a single CIDR will include the hub and all spokes of a single deployment – this will come in handy when we look at User-Defined Routes.
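A quick way to sanity-check that kind of addressing plan is Python’s standard ipaddress module. In this sketch only the two /16 blocks, the /24 hubs and Spoke 4 (10.2.2.0/24) come from this post; the other spoke address spaces are made up for illustration.

```python
from ipaddress import ip_network
from typing import Iterable


def covers(supernet: str, vnets: Iterable[str]) -> bool:
    """True if every virtual network address space sits inside the supernet."""
    parent = ip_network(supernet)
    return all(ip_network(vnet).subnet_of(parent) for vnet in vnets)


# Hub first, then spokes. Spoke address spaces are hypothetical apart from 10.2.2.0/24.
blue_vnets = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24"]
green_vnets = ["10.2.0.0/24", "10.2.1.0/24", "10.2.2.0/24"]

print(covers("10.1.0.0/16", blue_vnets))   # True
print(covers("10.2.0.0/16", green_vnets))  # True
print(covers("10.1.0.0/16", green_vnets))  # False
```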

These hub-and-spoke deployments could be in the same region or in different Azure regions. It is desired that:

  • Any spoke wishes to talk to another spoke it will route through the local firewall in the local hub.
  • All traffic coming into a spoke from an outside source, such as the other hub-and-spoke, must route through the local firewall in the local hub.

That would mean that Spoke 1 must route through Hub 1 and then Hub 2 to talk to Spoke 4. The firewall can be a third-party appliance or the Azure Firewall.

Core Routing

Each subnet in each spoke needs a route to the outside world (0.0.0.0/0) via the local firewall. For example:

  • The Blue firewall backend/private IP address is 10.1.0.132
  • A Route Table for each subnet is created in the Blue deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.1.0.132
  • The Green firewall backend/private IP address is 10.2.0.132
  • A Route Table for each subnet is created in the Green deployment and has a route to 0.0.0.0/0 via a virtual appliance with an IP address of 10.2.0.132
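Since I deploy everything as code, here is a small Python sketch that builds the Route Table for a spoke subnet as an ARM-shaped dictionary. The firewall addresses come from the list above; the property names follow the Microsoft.Network/routeTables schema as I understand it, and the naming convention and subnet name are hypothetical, so check the template reference before reusing any of it.

```python
import json

# Firewall backend/private IP addresses from the scenario.
FIREWALL_IPS = {"Blue": "10.1.0.132", "Green": "10.2.0.132"}


def spoke_route_table(deployment: str, subnet_name: str) -> dict:
    """Build a Route Table that sends 0.0.0.0/0 to the local firewall."""
    return {
        "type": "Microsoft.Network/routeTables",
        "name": f"{deployment}-{subnet_name}-rt",  # hypothetical naming convention
        "properties": {
            "routes": [
                {
                    "name": "Everywhere",
                    "properties": {
                        "addressPrefix": "0.0.0.0/0",
                        "nextHopType": "VirtualAppliance",
                        "nextHopIpAddress": FIREWALL_IPS[deployment],
                    },
                }
            ]
        },
    }


print(json.dumps(spoke_route_table("Blue", "spoke1-web"), indent=2))
```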

Note: Some network-connected PaaS services, e.g. API Management or SQL Managed Instance, require additional routes to the “control plane” that will bypass the local firewall.

Site-to-Site VPN

In this scenario, the organisation is connecting on-premises networks to 1 or more of the hub-and-spoke deployments with a site-to-site VPN connection. That connection goes to the Blue hub and to the Green hub.

To connect Blue and Green you will need to configure VNet Peering, which can work inside a region or across regions (using Microsoft’s low latency WAN, the second-largest private WAN on the planet). Each end of peering needs the following settings (the names of the settings change so I’m not checking their exact naming):

  • Enabled: Yes
  • Allow Transit: Yes
  • Use Remote Gateway: No
  • Allow Gateway Sharing: No

Let’s go back and do some routing theory!

That peering connection will add a hidden Default (“system”) route to each subnet in the hubs:

  • Blue hub subnets: A route to 10.2.0.0/24
  • Green hub subnets: A route to 10.1.0.0/24

Now imagine you are a packet in Spoke 1 trying to get to Spoke 4. You’re sent to the firewall in Blue Hub 1. The firewall lets the traffic out (if a rule allows it) and now the packet sits in the egress/frontend/firewall subnet and is trying to find a route to 10.2.2.0/24. The peering-created Default route covers 10.2.0.0/24 but not the subnet for Spoke 4. So that means the default route to 0.0.0.0/0 (Internet) will be used and the packet is lost.

To fix this you will need to add a Route Table to the egress/frontend/firewall subnet in each hub:

  • Blue firewall subnet Route Table: 10.2.0.0/16 via virtual appliance 10.2.0.132
  • Green firewall subnet Route Table: 10.1.0.0/16 via virtual appliance 10.1.0.132

Thanks to my clever addressing of each hub-and-spoke, a single route will cover all packets leaving Blue and trying to get to any spoke in Green and vice-versa.
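If you want to convince yourself, the route selection can be sketched in Python as a longest-prefix match: the most specific matching prefix wins. This is a simplified model of the firewall subnet in the Blue hub; the next hop labels are descriptive strings of my own, and the Spoke 4 machine address is hypothetical.

```python
from ipaddress import ip_address, ip_network


def pick_route(routes, destination):
    """Return the (prefix, next hop) pair with the longest matching prefix."""
    dest = ip_address(destination)
    matches = [route for route in routes if dest in ip_network(route[0])]
    return max(matches, key=lambda route: ip_network(route[0]).prefixlen)


# Routes seen by the Blue firewall subnet before the fix.
before = [
    ("10.1.0.0/24", "VirtualNetwork"),  # the Blue hub itself
    ("10.2.0.0/24", "VNetPeering"),     # added by peering with the Green hub
    ("0.0.0.0/0", "Internet"),          # the default route
]

# The fix: one route covering all of Green via the Green firewall.
after = before + [("10.2.0.0/16", "VirtualAppliance 10.2.0.132")]

spoke4_vm = "10.2.2.4"  # hypothetical machine in Spoke 4 (10.2.2.0/24)

print(pick_route(before, spoke4_vm))    # ('0.0.0.0/0', 'Internet')
print(pick_route(after, spoke4_vm))     # ('10.2.0.0/16', 'VirtualAppliance 10.2.0.132')
print(pick_route(after, "10.2.0.132"))  # ('10.2.0.0/24', 'VNetPeering')
```

The last line shows why the new route does not loop: the Green firewall itself is still reached over the more specific peering route.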

ExpressRoute

Now the customer has decided to use ExpressRoute to connect to Azure – Sweet! But guess what – you don’t need 1 expensive circuit to each hub-and-spoke.

You can share a single circuit across multiple ExpressRoute gateways:

  • ExpressRoute Standard: Up to 10 simultaneous connections to Virtual Network Gateways in 1+ regions in the same geopolitical region.
  • ExpressRoute Premium: Up to 100 simultaneous connections to Virtual Network Gateways in 1+ regions in any geopolitical region.
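Those two rules are easy to encode if you want to check a design on paper. This Python sketch uses the limits exactly as I quoted them above, which may have changed since; the function name and the geopolitical region labels are made up for illustration.

```python
from typing import Iterable

# Connection limits as quoted in this post; verify against current Azure limits.
SKU_LIMITS = {"Standard": 10, "Premium": 100}


def circuit_can_serve(sku: str, circuit_geo: str, gateway_geos: Iterable[str]) -> bool:
    """Check whether one circuit can connect to all of the listed gateways."""
    geos = list(gateway_geos)
    if len(geos) > SKU_LIMITS[sku]:
        return False
    if sku == "Standard":
        # Standard is limited to the circuit's own geopolitical region.
        return all(geo == circuit_geo for geo in geos)
    return True  # Premium reaches any geopolitical region


# Blue in West Europe and Green in North Europe share a geopolitical region.
print(circuit_can_serve("Standard", "Europe", ["Europe", "Europe"]))  # True
# Move Green to East US 2 and Standard is no longer enough.
print(circuit_can_serve("Standard", "Europe", ["Europe", "US"]))      # False
print(circuit_can_serve("Premium", "Europe", ["Europe", "US"]))       # True
```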

FYI, ExpressRoute connections to the Azure Virtual WAN Hub must be of the Premium SKU.

ExpressRoute is powered by BGP. All the on-premises routes that are advertised propagate through the ISP to the Microsoft edge router (“meet-me”) in the edge data centre. For example, if I want an ExpressRoute circuit to Azure West Europe (Middenmeer, Netherlands – not Amsterdam) I will probably (not always) get a circuit to the POP or edge data centre in Amsterdam. That gets me a physical low-latency connection onto the Microsoft WAN – and my BGP routes get to the meet-me router in Amsterdam. Now I can route to locations on that WAN. If I connect the Virtual Network Gateway in Blue, in Azure West Europe, to that circuit, then my BGP routes will propagate from the meet-me router to the GatewaySubnet in the Blue hub, and then on to my firewall subnet.

BGP propagation is disabled in the spoke Route Tables to ensure all outbound flows go through the local firewall.

But that is not the extent of things! The hub-and-spoke peering connections allow Gateway Sharing from the hub and Use Remote Gateway from the spoke. With that configuration, BGP routes to the spoke get propagated to the GatewaySubnet in the hub, then to the meet-me router, through the ISP and then to the on-premises network. This is what our solution is based on.

Let’s imagine that the Green deployment is in North Europe (Dublin, Ireland). I could get a second ExpressRoute connection but:

  • That will add cost
  • That will not give me the clever solution that I want – but I could work around that with ExpressRoute Global Reach

I’m going to keep this simple – by the way, if I wanted Green to be in a different geopolitical region such as East US 2 then I could use ExpressRoute Premium to make this work.

In the Green hub, the Virtual Network Gateway will connect to the existing ExpressRoute circuit – no more money to the ISP! That means Green will connect to the same meet-me router as Blue. The on-premises routes will get into Green the exact same way as with Blue. And the routes to the Green spokes will also propagate down to on-premises via the meet-me router. That meet-me router knows all about the subnets in Blue and Green. And guess what BGP routers do? They propagate – so, the routes to all of the Blue subnets propagate to Green and vice-versa with the next hop (after the Virtual Network Gateway) being the meet-me router. There are no Route Tables or peering required in the hubs – it just works!

Now the path from Blue Spoke 1 to Green Spoke 4 is Blue Hub Firewall, Blue Virtual Network Gateway, <the Microsoft WAN>, Microsoft (meet-me) Router, <the Microsoft WAN>, Green Virtual Network Gateway, Green Hub Firewall, Green Spoke 4.
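The propagation itself can be pictured with a few lines of Python. This is a deliberately naive model of the meet-me router, where every peer learns whatever the other peers advertise; real BGP has far more to it, and the on-premises prefix and spoke prefixes (other than 10.2.2.0/24) are hypothetical.

```python
from typing import Dict, Set


def learned_routes(advertised: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Each peer learns every prefix advertised by the other peers."""
    learned = {}
    for peer in advertised:
        others = (prefixes for name, prefixes in advertised.items() if name != peer)
        learned[peer] = set().union(*others)
    return learned


# What each peer advertises to the meet-me router.
advertised = {
    "on-premises": {"192.168.0.0/16"},                # hypothetical
    "Blue gateway": {"10.1.0.0/24", "10.1.1.0/24"},   # hub and a spoke
    "Green gateway": {"10.2.0.0/24", "10.2.2.0/24"},  # hub and Spoke 4
}

for peer, prefixes in learned_routes(advertised).items():
    print(peer, sorted(prefixes))
```

Blue ends up with the Green prefixes (and the on-premises ones) without any Route Table or peering in the hubs, which is the whole trick.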

There are ways to make this scenario more interesting. Let’s say I have an office in London and I want to use Microsoft Azure. Some stuff will reside in UK South for compliance or performance reasons. But UK South is not a “hero region” as Microsoft calls them. There might be more advanced features that I want to use that are only in West Europe. I could use two ExpressRoute circuits, one to UK South and one to West Europe. Or I could set up a single circuit to London to get me onto the Microsoft WAN and connect this circuit to both of my deployments in UK South and West Europe. I have a quicker route going Office > ISP > London edge data center > Azure West Europe than from Office > ISP > Amsterdam edge data center > Azure West Europe because I have reduced the latency between me and West Europe by reducing the length of the ISP circuit and using the more-direct Microsoft WAN. Just like with Azure Front Door, you want to get onto the Microsoft WAN as quickly as possible and let it get you to your destination as quickly as possible.

Deploying Azure ARM Templates From Azure DevOps – With A Complete Example

In this post, I will show you how to get those ARM templates sitting in an Azure DevOps repo deploying into Azure using a pipeline. With every merge, the pipeline will automatically trigger (you can disable this) to update the deployment. In other words, a complete CI/CD deployment where you manage your infrastructure/services as code.

Annoyance

I’m not a DevOps guru. I use DevOps every day. Every deployment I do for a customer runs from JSON that I’ve helped write into the customers’ Azure tenants. But we have people who are DevOps gurus and we have one seriously fancy deployment system that literally just uses a DevOps pipeline as a trigger mechanism and nothing more. But I use that, not develop it. I wanted to create & run a pipeline for my own needs (Cloud Mechanix Azure training). Admittedly, I’ve tried this before, lost patience, and abandoned it. This time, I persisted and succeeded.

What didn’t help? The dreadful Microsoft documentation. One doc, from DevOps, was rubbish. Another had deprecated YAML code (pipelines are written in YAML). A third had an example that was full of errors. OK, let’s look at blogs. But as with many blogs on this topic, those few that were originals only showed how to push code into an existing App Service and the rest were copies and pastes of App Services posts or bad Microsoft examples.

When it comes to tech like this, I have the feeling that many who have the knowledge don’t like to share it.

Concept

What I’m dealing with here is infrastructure-as-code (IaC). The code (Azure JSON in ARM templates) will describe the resources and configurations of those resources that I want to deploy. In my example, it’s an Azure Firewall and its configuration, including the rules. I have created a repository (repo) in Azure DevOps and I edit the JSON using Visual Studio Code (VS Code), Microsoft’s free code editor. When I make a change in VS Code, it will be done in a branch of the master copy of the code. I will sync that branch to the Cloud. To merge the changes, I will create a pull request. This pull request starts a change control process, where the owners of the repo can review the code and decide to accept or reject the changes. If the changes are accepted they are merged into the master copy of the code. And now the magic happens.

A pipeline is a description of a process that will take the master code from the repo and do stuff with it. In my case, deploy the code to a resource group in an Azure subscription. If the resources are already there, then the pipeline will do an update.

I will end up with an Azure Firewall that is managed as code. The rules and configuration are described in a parameter file so that’s all that I should normally need to touch. To make a rules change, I edit the parameter file and do a pull request. A security officer will review the change and approve/reject it. If the change is approved, the new firewall configuration will be deployed. And yes, this approach could probably be used with Azure Firewall Policy resources – I haven’t tested that yet. Now I can give people Read access only to my subscription and force all configuration changes through the pull request review process of Azure DevOps.

Your deployment can be any Azure resources that you can deploy using a template.

Azure Subscription

In Azure I have two resource groups:

  • [Resource Group] p-devops: Where I can do “DevOps stuff”
    • [Storage Account] pdevopsstorsjdhf983: I will use this to stage the code that I want to deploy using the pipeline
  • [Resource Group] p-we1fw: Where my hub virtual network is and the Azure Firewall will be
    • [Virtual Network]: p-we1fw-vnet: The virtual network that contains a subnet called AzureFirewallSubnet

Remember that storage account!

DevOps Repo

I created and configured a DevOps repo called AzureFirewall in a DevOps project. There are two files in there:

  • [Template] azurefirewall.json: The file that will deploy the Azure Firewall
  • [Parameter] azurefirewall-parameters.json: The configuration of the firewall, including the rules!

New DevOps Service Connection

DevOps will need a way to authenticate with your Azure tenant and get authorization to use your tenant, subscription, or resource group. You can get real fancy here. I’m going simple and using a feature of DevOps called a Service Connection, found in DevOps > [Project] > Project Settings > Service Connections (under Pipelines):

  1. Click New Service Connection
  2. Select Azure Resource Manager and hit Next
  3. Select Service Principal (Automatic) which is recommended by DevOps.
  4. Here I selected the subscription option and the Azure subscription that my resource groups are in.
  5. I granted access permission to all pipelines.
  6. I named the service connection after my subscription: p-we1net.

As I said, you can get real fancy here because there are lots of options.

New DevOps Pipeline

Now for the fun!

Back in the project, I went to Pipelines and created a new Pipeline:

  1. I selected Azure Repos Git because I’m storing my code in an Azure DevOps (Git) repo. The contents of this repo will be deployed by the pipeline.
  2. I selected my AzureFirewall repo.
  3. Then I selected “Starter Pipeline”.
  4. An editor appeared – now you’re editing a file called azure-pipelines.yml that resides in the root of your repo.

There is an option (instead of Starter Pipeline) where you choose an existing YAML file, maybe one from a folder called .pipelines in your repo.

Edit the Pipeline

Here is the code:

name: AzureFirewall.$(Date:yyyy.MM.dd)

trigger:
  batch: true

pool:
  name: Hosted Windows 2019 with VS2019

steps:
- task: AzureFileCopy@3
  displayName: 'Stage files'
  inputs:
    SourcePath: ''
    azureSubscription: 'p-we1net'
    Destination: 'AzureBlob'
    storage: 'pdevopsstorsjdhf983'
    ContainerName: 'AzureFirewall'
    outputStorageUri: 'artifactsLocation'
    outputStorageContainerSasToken: 'artifactsLocationSasToken'
    sasTokenTimeOutInMinutes: '240'
- task: AzureResourceGroupDeployment@2
  displayName: 'Deploy template'
  inputs:
    # The name of the service connection, not the Azure subscription
    azureSubscription: 'p-we1net'
    action: 'Create Or Update Resource Group'
    resourceGroupName: 'p-we1fw'
    location: 'westeurope'
    templateLocation: 'URL of the file'
    csmFileLink: '$(artifactsLocation)azurefirewall.json$(artifactsLocationSasToken)'
    csmParametersFileLink: '$(artifactsLocation)azurefirewall-parameters.json$(artifactsLocationSasToken)'
    deploymentMode: 'Incremental'
    deploymentName: 'AzureFirewall-Pipeline'

That is a working pipeline. It is made up of several pieces:

Trigger

This controls how the pipeline is started. You can set it to none to stop automatic executions – in the early days when you’re trying to get this right, automatic runs can be annoying.

Pool

Your pipeline is going to run on an agent from a pool. I’m using a stock Microsoft-hosted agent based on WS2019. You can supply your own container from Azure Container Registry, but that’s getting fancy!

Task: AzureFileCopy

Now we move into the Steps. The first task is to copy the contents of the repo into a storage account. We need to do this because the following deployment task cannot directly access the raw files in Azure DevOps. A task is created with the human-friendly name of Stage files. There are a few settings to configure here:

  • azureSubscription: This is not the name of your subscription! Ain’t that tricky?! This is the name of the service connection that authenticates the pipeline against the subscription. So that’s my service connection called p-we1net, which I happened to name after my subscription.
  • storage: This is the storage account in my target Azure subscription in the p-devops resource group. My service connection has access to the subscription so it has access to the storage account – be careful with restricting access of the service connection to just a resource group and placing the staging storage account elsewhere.
  • ContainerName: This is the name of the container that will be created in your storage account. The contents of the repo will be copied into this container.
  • outputStorageUri: The URI/URL of the storage account/container will be stored in a variable which is called artifactsLocation in this example.
  • outputStorageContainerSasToken: A SAS token will be created to allow temporary secure access to the contents of the container. The token will be stored in a variable called artifactsLocationSasToken in this example.

Task: AzureResourceGroupDeployment

This task will take the contents of the repo from the storage account, and deploy them to a resource group in the target subscription. There are a few things to change:

  • azureSubscription: Once again, specify the name of the service connection, not the Azure subscription.
  • resourceGroupName: Enter the name of the target resource group.
  • location: Specify the Azure region that you are targeting.
  • csmFileLink: This is the URI of the template file that you want to deploy. More in a moment.
  • csmParametersFileLink: This is the URI of the parameters file that you want to deploy. More in a moment.
  • deploymentName: I have hard-set the deployment name so I don’t have to clean up versioned deployments from the resource group later. Every resource group has a hard set limit on deployment objects, and with a resource such as a firewall, that could be hit quite quickly.

csmFileLink

There are three parts to the string: $(artifactsLocation)azurefirewall.json$(artifactsLocationSasToken). Together, the three parts give the task secure access to the template file in the staging storage account.

  • $(artifactsLocation): This is the storage account/container URI/URL variable from the AzureFileCopy task.
  • azurefirewall.json: This is the name of the template file that I want to deploy.
  • $(artifactsLocationSasToken): This is the SAS token variable from the AzureFileCopy task.

csmParametersFileLink

There are three parts to the string: $(artifactsLocation)azurefirewall-parameters.json$(artifactsLocationSasToken). Together, the three parts give the task secure access to the parameter file in the staging storage account.

  • $(artifactsLocation): This is the storage account/container URI/URL variable from the AzureFileCopy task.
  • azurefirewall-parameters.json: This is the name of the parameter file that I want to use to customise the template deployment.
  • $(artifactsLocationSasToken): This is the SAS token variable from the AzureFileCopy task.
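Both links are built the same way, by plain string concatenation with no separators added. Here is a Python sketch of that; the location and token values are placeholders that I made up, on the assumption that the location variable ends with a slash and the SAS token variable starts with a question mark, which is what the concatenation in the pipeline implies.

```python
def artifact_link(artifacts_location: str, file_name: str, sas_token: str) -> str:
    """Join the three parts exactly as the pipeline variables are joined."""
    return f"{artifacts_location}{file_name}{sas_token}"


# Placeholder values standing in for the AzureFileCopy output variables.
artifacts_location = "https://pdevopsstorsjdhf983.blob.core.windows.net/azurefirewall/"
artifacts_location_sas_token = "?sv=PLACEHOLDER&sig=PLACEHOLDER"

template_link = artifact_link(artifacts_location, "azurefirewall.json", artifacts_location_sas_token)
parameters_link = artifact_link(artifacts_location, "azurefirewall-parameters.json", artifacts_location_sas_token)

print(template_link)
print(parameters_link)
```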

Pipeline Execution

There are three ways to run the pipeline now:

  1. Do an update (or a merge) to the master branch of the repo thanks to my trigger.
  2. Manually run the pipeline from Pipelines.
  3. Save a change to the pipeline in the DevOps editor if the master is not locked – which will trigger option 1, to be honest.

You can open the pipeline, or historic runs of it, to view/track the execution.

You’ll also get an email to let you know the status of an ended pipeline run.

Happy pipelining!