Azure Virtual WAN ARM – The Chicken & Egg Gateway ID Discombobulation

This post will explain how to deal with the gateway ID properties in the Azure Microsoft.Network/virtualhubs resource when using ARM templates.

Background

The Azure WAN Hub is capable of having 3 gateway sub-resources:

  • Point-to-site VPN: Microsoft.Network/p2sVpnGateways
  • VPN (site-to-site): Microsoft.Network/vpnGateways
  • ExpressRoute: Microsoft.Network/expressRouteGateways, which does not support diagnostic settings in the 2020-04-01 API

As you would expect, when you create these resources, you have to supply them with the resource ID of the Microsoft.Network/virtualhubs resource:

"virtualHub": {
  "id": "<<<<resource ID of the virtual hub>>>>"
},
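
To put that fragment in context, here is a minimal sketch of a site-to-site VPN gateway resource. It assumes the hub is deployed in the same template and resource group; the parameter names hubName and vpnGatewayName are hypothetical, and the API version and location are assumptions rather than anything the resource provider requires:

```json
{
  "comments": "Sketch only. hubName and vpnGatewayName are hypothetical parameters; the hub is assumed to be in the same template and resource group.",
  "type": "Microsoft.Network/vpnGateways",
  "apiVersion": "2020-04-01",
  "name": "[parameters('vpnGatewayName')]",
  "location": "[resourceGroup().location]",
  "dependsOn": [
    "[resourceId('Microsoft.Network/virtualHubs', parameters('hubName'))]"
  ],
  "properties": {
    "virtualHub": {
      "id": "[resourceId('Microsoft.Network/virtualHubs', parameters('hubName'))]"
    }
  }
}
```

The dependsOn entry only makes sense when the hub is defined in the same template; if the hub lives elsewhere, drop it and pass the hub resource ID in as a parameter instead.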

What is a surprise is what happens in the Microsoft.Network/virtualhubs resource. After a gateway is associated, a property (type object, presumably for future-proofing) for the associated gateway type is added to the hub:

"vpnGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/vpnGateways resource>>>>"
},
"expressRouteGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/expressRouteGateways resource>>>>"
},
"p2SVpnGateway": {
  "id": "<<<< Resource ID of Microsoft.Network/p2sVpnGateways resource>>>>"
},

These properties are the source of the problem.

The Problem

There are 3 possible states in the hub when it comes to each gateway:

  1. The hub exists without a gateway: The above hub properties are not required.
  2. The gateways are being added: The above hub properties cannot be added because the gateway resource ID points to a resource that does not exist yet – the hub must exist and be configured before the gateway(s).
  3. The gateways exist: Any re-run of the ARM template (which might be common to update the hub route tables or configuration via DevOps) must include the above gateway properties in the hub resource with the correct resource IDs for the gateways.

States 2 and 3 are where the chicken and egg problem arises in an ARM template. You must supply the gateway resource ID in the hub for all updates to the hub after a gateway is deployed, and you must not include the gateway resource ID in the hub while the gateway has yet to be deployed. This would be easy to deal with if ARM would (finally) give us an “ifexists()” function, but there is no sign of that. So we need a hack solution.

The Hack Solution

This one comes from the Well-Architected Framework/Cloud Adoption Framework, Enterprise-Scale Architecture. This way-too-complicated beastie shows how Microsoft’s people are dealing with the issue. The JSON for the Microsoft.Network/virtualhubs template contains these properties:

"properties": {
  "virtualWan": {
    "id": "[variables('vwanresourceid')]"
  },
  "addressPrefix": "[parameters('vHUB').addressPrefix]",
  "vpnGateway": "[if(not(empty(parameters('vHUB').vpnGateway)),parameters('vHUB').vpnGateway, json('null'))]"
}

The key for dealing with vpnGateway is the vHUB parameter, an object that contains a value called vpnGateway.

When they first run the deployment, the value of vHUB.vpnGateway is set to {} or null in the parameters file, stored in GitHub. That means that when the hub is first deployed (and there is no VPN gateway), the if statement in the above snippet will pass json('null') to the vpnGateway property. That is acceptable to the resource provider and the hub will deploy cleanly. Later on in the deployment, the VPN gateway will be created.
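
For illustration, a first-run parameters file might look something like the sketch below. Only the addressPrefix and vpnGateway values come from the snippet above; the address range is an example, and the real vHUB object in the Enterprise-Scale templates probably carries more values than this:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vHUB": {
      "metadata": {
        "description": "Sketch only. The address prefix is an example value; vpnGateway stays empty until the VPN gateway exists."
      },
      "value": {
        "addressPrefix": "10.0.0.0/23",
        "vpnGateway": {}
      }
    }
  }
}
```

With vpnGateway set to {}, empty() returns true, so the if() expression in the hub template falls through to json('null').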

If you just re-run the hub template now, you will get an error about not being allowed to change the vpnGateway property in the hub resource. Behind the scenes, it has been updated by the VPN gateway deployment. Every execution of the hub template must now include the resource ID of the VPN gateway – that sucks, right? Now the hack really kicks in.

After the first deployment of the hub (and the VPN Gateway), you must open the resource group in the Azure Portal, enable viewing hidden items, open the VPN Gateway resource, go to properties, and document the resource ID.

Now, you need to open the parameters file for the hub. Edit the vHUB.vpnGateway property and set it to:

"vpnGateway": { 
 "id": "<<<< Resource ID of Microsoft.Network/vpnGateways resource>>>>"
},

Now you can cleanly re-run the hub template.
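
The same trick presumably carries over to the other two gateway types. The sketch below is an untested extension of the pattern, not something taken from the Enterprise-Scale templates. It assumes the vHUB parameter object always carries expressRouteGateway and p2SVpnGateway values (set to {} until the matching gateway exists), and the name and location values are hypothetical:

```json
{
  "comments": "Sketch only. vHUB.name, vHUB.location, vHUB.expressRouteGateway and vHUB.p2SVpnGateway are hypothetical parameter values.",
  "type": "Microsoft.Network/virtualHubs",
  "apiVersion": "2020-04-01",
  "name": "[parameters('vHUB').name]",
  "location": "[parameters('vHUB').location]",
  "properties": {
    "virtualWan": {
      "id": "[variables('vwanresourceid')]"
    },
    "addressPrefix": "[parameters('vHUB').addressPrefix]",
    "vpnGateway": "[if(not(empty(parameters('vHUB').vpnGateway)), parameters('vHUB').vpnGateway, json('null'))]",
    "expressRouteGateway": "[if(not(empty(parameters('vHUB').expressRouteGateway)), parameters('vHUB').expressRouteGateway, json('null'))]",
    "p2SVpnGateway": "[if(not(empty(parameters('vHUB').p2SVpnGateway)), parameters('vHUB').p2SVpnGateway, json('null'))]"
  }
}
```

Each gateway value has to be present in the parameters file, even if it is only {}, because referencing a missing property on the vHUB object would fail when the template is evaluated.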

How Should It Work?

The best solution would be if the gateway ID properties were just documentation for Azure, properties that we humans cannot edit. But I suspect that the ability to configure these settings might have something to do with the newly announced NVA-in-hub preview. Otherwise, ARM needs to finally give us an ifexists() function – vote here now if you agree.

Azure Virtual WAN ARM – Secured Virtual Hub Azure Firewall

I have spent quite a few hours figuring out how to deploy Azure’s new Secured Virtual Hub, an extension of Azure Virtual WAN, deployed using ARM templates (JSON). A lot of the bits are either not documented or incorrectly documented. One of the frustrating bits to deploy was the Azure Firewall resource – and the online examples did not help.

The issue was that the 2 sources I could find did not include public IP addresses on the firewall:

  • The quick start for Secured Virtual Hub on docs.microsoft.com
  • The new Enterprise-Scale “well-architected” Framework, found in Cloud Adoption Framework

Digging to solve that uncovered:

  • The examples used quite an old API version, 2019-08-01, to deploy the Microsoft.Network/azureFirewalls resource.
  • There was no example of how to add a public IP address to the firewall in Secured Virtual Hub because it was not possible with that API – SVH is quite different from a VNet deployment because you do not have direct access to the underlying hub virtual network.
  • Being an old API, we lose features such as SNAT for non-RFC1918 addresses (important in universities and public sector) and the newer custom & proxy DNS features.

In my digging, I found that the ARM reference for the Azure Firewall was incorrect, but I also uncovered a new, barely-documented property called hubIPAddresses; I knew this property was the key to solving the public IP address issue. So I thought about what was going on and how I was going to solve it.

I ended up doing what I would normally do if I did not have a quick start template to start with:

  1. Deploy the resource(s) by hand in the Azure Portal
  2. Observe the options – there was a slide control for the quantity of firewall public IP addresses
  3. Export the resulting template

And … there was the solution:

  1. There is a new, undocumented API version for the Azure Firewall resource: 2020-05-01
  2. There is a new object property called hubIPAddresses that contains an object sub-property called publicIPs. You can set a value called count to control how many public IP addresses Azure will assign (on your behalf) to the firewall – you do not need to create the public IP address resources.
        "hubIPAddresses": {
          "publicIPs": {
            "count": "[parameters('firewallPublicIpQuantity')]"
          }
        }
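
To show where that property sits, here is a minimal sketch of the whole firewall resource. The hubIPAddresses block and the 2020-05-01 API version come from the exported template described above; the firewallName and hubName parameters are hypothetical, the AZFW_Hub SKU is my understanding of what a hub-attached firewall uses, and a real deployment will probably need more (a firewall policy, for example):

```json
{
  "comments": "Sketch only. firewallName and hubName are hypothetical parameters; the sku values are an assumption for a hub-attached firewall.",
  "type": "Microsoft.Network/azureFirewalls",
  "apiVersion": "2020-05-01",
  "name": "[parameters('firewallName')]",
  "location": "[resourceGroup().location]",
  "properties": {
    "sku": {
      "name": "AZFW_Hub",
      "tier": "Standard"
    },
    "virtualHub": {
      "id": "[resourceId('Microsoft.Network/virtualHubs', parameters('hubName'))]"
    },
    "hubIPAddresses": {
      "publicIPs": {
        "count": "[parameters('firewallPublicIpQuantity')]"
      }
    }
  }
}
```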

Sorted!