Understanding How Azure Application Gateway Works

In this post, I will explain how things such as frontend configurations, listeners, HTTP settings, probes, backend pools, and rules work together to enable service publication in the Azure Web Application Gateway (WAG)/Web Application Firewall (WAF).

Introduction

The WAF/WAG is a scary beast at first. When you open one up there are just so many settings to be tweaked. If you are publishing just a simple test HTTP server, it’s easy: you populate the default backend pool and things just start to work. But if you want HTTPS, or to service many pools/sites, then things get complicated. And frustratingly slow 🙂 – things have improved in v1, and v2 is significantly faster to configure, although it has architectural limitations (a forced public IP address and a lack of support for route tables) that prevent me from using v2 in my large network deployments. Hopefully, the above map and following text will simplify things by explaining what all the pieces do and how they work together.

The below is not feature complete, and things will change in the future. But for 99% of you, this should (hopefully) be helpful.

Backend Pool

The backend pool describes a set of machines/services that will work together. The members of a backend pool must all be of the same type, which is one of the following:

  • IP address/hostname: a common choice in large Azure deployments – you can span peering connections to other VNets
  • Virtual machine: Select a machine from the same VNet as the WAG/WAF
  • VMSS: Virtual machine scale sets in the same VNet as the WAG/WAF
  • App Services: In the same subscription as the WAG/WAF

From here on out, I’ll be using the term “web server” to describe the above.

Note that these are the machines that host your website/service. They will all run the same website/service. And you can configure an optional custom probe to test the availability of the service on these machines.

(Optional) Health Probe

You can create a HTTP/HTTPS probe to do deeper probe tests of a service running on a backend pool. The probe is configured for HTTP or HTTPS and tests a hostname on the web server. You specify a path on the website, a frequency, timeout and allowed number of retries before designating a web site on a web server as being unhealthy and no longer a candidate for load balancing.
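To make the retry idea concrete, here is a minimal Python sketch of how a failure threshold could turn individual probe results into a health state. It is a model only, not how the gateway is implemented: the `ProbeState` class, the `unhealthy_threshold` name, and the rule that a single successful probe restores a server are all my assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ProbeState:
    """Tracks consecutive probe failures for one web server (hypothetical model)."""

    unhealthy_threshold: int = 3  # assumed: failures tolerated before marking unhealthy
    consecutive_failures: int = 0
    healthy: bool = True

    def record(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_succeeded:
            # Assumption: a single successful probe resets the counter.
            self.consecutive_failures = 0
            self.healthy = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy


def candidates(states: dict[str, ProbeState]) -> list[str]:
    """Return the servers that are still candidates for load balancing."""
    return sorted(name for name, state in states.items() if state.healthy)
```

With a threshold of 3, a web server stays in the rotation after two failed probes and drops out on the third.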

HTTP Setting

The HTTP setting configures how the WAG/WAF will talk to the members of the backend pool. It does not configure how clients talk to the site (Listener). So anything you see below here is for configuring WAG/WAF to web server communications (see HTTPS).

  • Control cookie-based affinity for load balancing
  • Configure connection draining when a machine is removed from a backend pool
  • Specify if this is for a HTTP or a HTTPS connection to the webserver. This is for end-to-end encryption.
    • For HTTPS, you will upload a certificate that will match the web servers’ certificate.
  • The port that the web server is listening on.
  • Override the path
  • Override the hostname
  • Use a custom probe

Remember that the above HTTPS setting is not required for a website to be published over SSL. It is only required to ensure that encryption continues from the WAG/WAF to the web servers.

Frontend IP Configuration

A WAG/WAF can have public or private frontend IP addresses – the variation depends on whether you are using V1 (you have a choice on the mix) or V2 (you must use public and private). The public frontend is a single public IP address used for publishing services publicly. The private frontend is a single virtual network address used for internal service publication, requiring virtual network connectivity (virtual network, VPN, ExpressRoute, etc).

The DNS records for your sites will point at the frontend IP address of the WAG/WAF. You can use third-party or Azure DNS – Azure DNS has the benefit of being hosted in every Azure region and in edge sites around the world so it is faster to resolve names than some DNS hoster with 3 servers in a single continent.

A single frontend can be shared by many sites. http://www.aidanfinn.com, http://www.cloudmechanix.com and http://www.joeeleway.com can all point to the same IP address. The hostname configuration that you have in the Listener will determine what happens to the incoming traffic afterwards.

Listener

A Listener is configured to listen for traffic destined to a particular hostname and port number and forward it, eventually, to the correct backend pool. There are two kinds of listener:

  • Basic: For very simple configurations where a site has exclusive ownership over a port number on one of the frontends. Typically this is for point solutions where a WAG/WAF is dedicated to a service.
  • Multi-Site: A listener shares a frontend configuration with other listeners, and is looking for traffic destined to a specific hostname/port/protocol.

Note that the Listener is where you place the certificate to secure client > WAG/WAF communications. This is known as SSL offloading. If you enable HTTPS you will place the “site certificate” on the WAG/WAF via the Listener. You can optionally re-encrypt traffic to the webserver from the WAG/WAF using the previously discussed HTTP Setting. WAGv2/WAFv2 have a no-support preview to use certs that are securely stored in Key Vault.

The configuration of a basic listener is:

  • Frontend
  • Frontend port
  • HTTP or HTTPS protocol
    • The certificate for securing client > WAG/WAF traffic
  • Optional custom error pages

The multi-site listener adds an extra configuration: hostname. This is because now the listener is sharing the frontend and is only catching traffic for its website. So if I want 3 websites on my WAG/WAF sharing a frontend, I will have 3 x HTTPS listeners and maybe 3 x HTTP listeners.
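Here is a rough Python sketch of how listeners sharing a frontend could pick up traffic by port, protocol, and hostname. The `Listener` class, the listener names, and the choice to try host-specific listeners before a basic one are my assumptions for illustration; this is not the gateway's actual matching code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Listener:
    name: str                       # hypothetical naming standard
    port: int
    protocol: str                   # "http" or "https"
    hostname: Optional[str] = None  # None models a basic listener


def match_listener(listeners, port, protocol, host):
    """Pick the listener for a request; multi-site listeners are tried first (assumption)."""
    on_port = [l for l in listeners if l.port == port and l.protocol == protocol]
    for listener in on_port:
        if listener.hostname is not None and listener.hostname == host.lower():
            return listener
    for listener in on_port:
        if listener.hostname is None:
            return listener
    return None


# Three sites sharing one frontend: 3 x HTTPS listeners and, here, 1 x HTTP listener.
LISTENERS = [
    Listener("aidanfinn-https", 443, "https", "www.aidanfinn.com"),
    Listener("cloudmechanix-https", 443, "https", "www.cloudmechanix.com"),
    Listener("joeeleway-https", 443, "https", "www.joeeleway.com"),
    Listener("aidanfinn-http", 80, "http", "www.aidanfinn.com"),
]
```

A request for a hostname that has no listener on that port simply matches nothing in this model, which is why every site needs its own multi-site listener per protocol.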

Rules

A rule glues together the configuration. A basic rule is pretty easy:

  1. Traffic comes into a Listener
  2. The HTTP Setting determines how to forward that traffic to the backend pool
  3. The Backend Pool lists the web servers that host the site

A path-based rule allows you to extend your site across many backend pools. You might have a set of content for /media on pool1. Therefore all http://www.aidanfinn.com/media content is pulled from that pool1. All video content might be on http://www.aidanfinn.com/video, so you’ll redirect /video to pool2. And so on. And you can have individual HTTP settings for each redirection.
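The path-based idea can be sketched in a few lines of Python. This is a simplified model under my own assumptions: the `pick_pool` function, the default pool name `pool0`, and the "longest matching prefix wins" behaviour are illustrative, so check the product documentation for the exact matching order.

```python
from __future__ import annotations


def pick_pool(path: str, path_map: dict[str, str], default_pool: str) -> str:
    """Return the backend pool for a URL path using the longest matching prefix."""
    best_prefix = ""
    for prefix in path_map:
        # Match "/media" itself and anything beneath "/media/".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if len(prefix) > len(best_prefix):
                best_prefix = prefix
    return path_map[best_prefix] if best_prefix else default_pool


# /media content lives on pool1 and /video content on pool2, as in the text.
PATH_MAP = {"/media": "pool1", "/video": "pool2"}
```

Anything that matches no path, such as the site's home page, falls through to the default pool.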

My Tips

  • There’s nothing like actually setting this up at scale to try this out. You will need a few DNS names to be able to work with.
  • Remember to enable the protection mode of WAF. I have audited deployments and found situations where people thought they had Layer-7 security but only had the default “alert-only” configuration of WAFv1.
  • In large environments, don’t forget to ensure that the NSGs protecting any webservers allow traffic in from the WAG/WAF’s subnet into the web servers on the port(s) specified in the HTTP Setting(s). Also ensure that any guest OS firewall is similarly configured.
  • Possibly the biggest issue you will have is with devs not assigning hostnames to websites in their webservers. If you’re using shared WAGs/WAFs you must use multi-site listeners and the websites should be configured with the hostname.
  • And the biggest tip I can give is to work out a naming standard for each of the above components so you know what piece is associated with what site. I can’t share what we’re using at work, but we have some big configurations and they are very easy to troubleshoot because of how we have named things.

Backing Up Azure Firewall

In this post, I will outline how you can back up your Azure Firewall, enabling you to rebuild it in case it is accidentally/maliciously deleted or re-configured by an authorized person.

With the Azure Firewall adding new features, we should expect more customers to start using it. And if you are using it like I do with my customers, it’s the centre of everything and it can quickly contain a lot of collections/rules which took a long time to write.

Wait – what new features? Obviously, Threat Detection (using the MS security graph) is killer, but support for up to 100 public IP addresses was announced and is imminent, availability zones are there now for this mission critical service, application rule FQDN support was added for SQL databases, and HDInsight tags are in preview.

So back on topic: how do I back up Azure Firewall? It’s actually pretty simple. You will need to retrieve your firewall’s resource ID:

$AzureFirewallId = (Get-AzFirewall -Name "MyFirewall" -ResourceGroupName "MyVnetRg").id

Then you will export a JSON copy of the firewall:

$BackupFileName = ".\MyFirewallBackup.json"
Export-AzResourceGroup -ResourceGroupName "MyVnetRg" -Resource $AzureFirewallId -SkipAllParameterization -Path $BackupFileName

And that’s the guts of it! To do a restore you simply redeploy the JSON file to the resource group:

New-AzResourceGroupDeployment -name "FirewallRestoreJob" -ResourceGroupName "MyVnetRg" -TemplateFile ".\MyFirewallBackup.json"

I’ve tested a delete and restore and it works. The magic here is using -SkipAllParameterization in the resource export to make the JSON file recreate exactly what was lost at the time of the backup/export.

If you wanted to get clever you could wrap up the backup cmdlets in an Azure Automation script. Add some lines to alter the backup file name (date/time), and copy the backup to blob storage in a GPv2 storage account (with Lifecycle Management for automatic blob tiering and a protection policy to prevent deletion). And then you would schedule the automation to run every day.

Do Not Enable Azure Storage Account Firewall – IaaS

If you read through the security recommendations in Azure Security Center, you do get given out to a lot. A lot of it makes no sense if you understand Azure and the recommendations. One that appeared to make sense was to enable the relatively new firewall in Azure Storage:

  • Only allow trusted subnets – nice idea to limit the attack surface on the storage account in conjunction with service endpoints.
  • Allow “trusted Microsoft services” to access the storage account (on by default).

Note: A storage account can only be connected to if you know one of the really long random access keys.

But if you do enable this firewall in an Azure deployment, things will break:

  • Boot Diagnostics: Does not know how to write to a secured storage account, even with firewall rules and service endpoints enabled.
  • Serial Console Access: Requires Boot Diagnostics to be working so that’s dead too.
  • NSG Flow Logs/Traffic Analytics: Another feature that doesn’t understand a secured storage account, even with “trusted Microsoft services” marked as enabled (default).

And there might be more!

So you have to ask yourself – do you want maximum security or a usable & manageable system? Storage account firewalls are pretty new – we didn’t need them a few months ago. So we can drop that feature, and maybe use the new Advanced Threat Protection for storage accounts feature instead?

It’s a pity that some joined-up thinking and integration testing weren’t done here.

Reasons To Use A Third Party Firewall In Azure

In this post, I will go through some of the reasons that one might use to choose a third-party firewall network virtual appliance (NVA) in Azure instead of the Azure Firewall.

You can read my take on choosing the Azure Firewall here.

Management

Let’s say you use Firewall X for your on-premises network(s). You have two things:

  • A skillset
  • A management tool

Maybe you want to re-use those? Let’s talk about that reasoning.

You have developed skills over the years to manage and troubleshoot Firewall X – well done! And now you want to bring those skills to Azure. At first, that seems logical. But what if I told you that there was an alternative that had the same functionality as (if not more than) Firewall X, scaled better than Firewall X, and was so easy that I could teach you to fully use it in 15 minutes? Hmm. Those years of skills don’t really make much sense now, do they?

Centralized management – I’ll give you some credit here. Azure Firewall does not have this right now. If I have 4 Azure Firewalls spread around the globe, I do not have 1 management experience. I have identical configuration experiences, but the global configurations have to be replicated – you could script that or use JSON templates. That’s not the same as using a GUI and saying “push this rule to the following 4 firewalls”. But let me ask you this: is this one feature genuinely a business reason to choose a third-party that has an unstable design and limited performance, high availability (if it even has it) or scale-out (most don’t even have this)?

Trust

“You want me to use a MICROSOFT firewall?”. Get over yourself. You’re in Azure and you’re going to be relying on Microsoft security all over the place. Grab your Sony Walkman and return back to whatever decade you came from.

Client VPN

Now we’re talking about something I can genuinely agree with – to a point. Azure sucks at end-user VPN. Azure’s approach is that you should be changing the user experience to using HTTPS (TLS) connectivity to web apps or Citrix/RDS gateways. But time and again, I do encounter customers who want/need VPN. Windows Server mysteriously does not support any of its user connectivity in Azure. And the Azure VPN Gateway has a limited and unsatisfying user VPN experience. So if you want to use a modern “SSL” VPN client with a third-party firewall, I can understand that. BUT, I would limit that appliance to that role. I just cannot stand the mess to get HA working with some of the third party NVAs (if they bother documenting) and the near-absence of scale-out for performance. I would still use Azure Firewall for the firewall 😊

Emotion

And that’s what you have left. And that’s not a valid business reason.

Brand

I’ve done a good bit of reading. So far the only brand of third-party NVA that I would consider myself for an edge/central firewall deployment is Palo Alto – but I’d rather use Azure Firewall over it anyway! All of the third-party solutions are compromised in some way:

  • Don’t do active-active clustering (scale-out)
  • Don’t even offer HA!
  • Have hack solutions (“we’ll edit your route tables for you”) for failover that you know will do more damage than an outage
  • Their documentation pure stinks

How to Troubleshoot Azure Routing?

This post will explain how routing works in Microsoft Azure, and how to troubleshoot your routing issues with Route Tables, BGP, and User-Defined Routes in your virtual network (VNet) subnets and virtual (firewall) appliances/Azure Firewall.

Software-Defined Networking

Right now, you need to forget VLANs, and how routers, bridges, routing switches, and all that crap works in the physical network. Some theory is good, but the practice … that dies here.

Azure networking is software-defined (VXLAN). When a VM sends a packet out to the network, the Azure Fabric takes over as soon as the packet hits the virtual NIC. That same concept extends to any virtual network-capable Azure service. From your point of view, a memory copy happens from source NIC to destination NIC. Yes; under the covers there is an Azure backbone with a “more physical” implementation but that is irrelevant because you have no influence over it.

So always keep this in mind: network transport in Azure is basically a memory copy. We can, however, influence the routing of that memory copy by adding hops to it.

Understand the Basics

When you create a VNet, it will have 1 or more subnets. By default, each subnet will have system routes. The first ones are simple, and I’ll make it even more simple:

  • Route directly via the default gateway to the destination if it’s in the same supernet, e.g. 10.0.0.0/8
  • Route directly to Internet if it’s in 0.0.0.0/0

By the way, the only way to see system routes is to open a NIC in the subnet, and click Effective Routes under Support & Troubleshooting. I have asked that this is revealed in a subnet – not all VNet-connected services have NICs!

And also, by the way, you cannot ping the subnet default gateway because it is not an appliance; it is a software-defined function that is there to keep the guest OS sane … and probably for us too 😊

When you peer a VNet with another VNet, you do a few things, including:

  • Instructing VXLAN to extend the plumbing between the peered VNets
  • Extending the “VirtualNetwork” NSG rule security tag to include the peered neighbour
  • Creating a new system route for peering

The result is that VMs in VNet1 will send packets directly to VMs in VNet2 as if they were in the same VNet.

When you create a VNet gateway (let’s leave BGP for later) and create a local network connection, you create another (set of) system routes for the virtual network gateway. The local address space(s) will be added as destinations that are tunnelled via the gateway. The result is that packets to/from the on-prem network will route directly through the gateway … even across a peered connection if you have set up the hub/spoke peering connections correctly.

Let’s add BGP to the mix. If I enable ExpressRoute or a BGP-VPN, then my on-prem network will advertise routes to my gateway. These routes will be added to my existing subnets in the gateway’s VNet. The result is that the VNet is told to route to those advertised destinations via the gateway (VPN or ExpressRoute).

If I have peered the gateway’s VNet with other VNets, the default behaviour is that the BGP routes will propagate out. That means that the peered VNets learn about the on-premises destinations that have been advertised to the gateway, and thus know to route to those destinations via the gateway.

And let’s stop there for a moment.

Route Priority

We now have 2 kinds of route in play – there will be a third. Let’s say there is a system route for 172.16.0.0/16 that routes to virtual network. In other words, just “find the destination in this VNet”. Now, let’s say BGP advertises a route from on-premises through the gateway that is also for 172.16.0.0/16.

We have two routes for the 172.16.0.0/16 destination:

  • System
  • BGP

Azure looks at routes that clash like above and deactivates one of them. Azure always ranks BGP above System. So, in our case, the System route for 172.16.0.0/16 will be deactivated and no longer used. The BGP route for 172.16.0.0/16 via the VNet gateway will remain active and will be used.

Specificity

Try saying that word 5 times in a row after 5 drinks!

The most specific route will be chosen. In other words, the route with the best match for your destination is selected by the Azure fabric. Let’s say that I have two active routes:

  A. 172.16.0.0/16 via X
  B. 172.16.1.0/24 via Y

Now, let’s say that I want to send a packet to 172.16.1.4. Which route will be chosen? Route A is a 16 bit match (172.16.*.*). Route B is a 24 bit match (172.16.1.*). Route B is a closer match so it is chosen.

Now add a scenario where you want to send a packet to 172.16.2.4. At this point, the only match is Route A. Route B is not a match at all.
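The two scenarios above can be checked with a short Python sketch that uses the standard `ipaddress` module. The `most_specific` function is a hypothetical helper of mine that models longest-prefix matching; it is not an Azure API.

```python
import ipaddress


def most_specific(routes, destination):
    """Return the (prefix, next_hop) whose prefix is the longest match for destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    network, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hop


ROUTES = [
    ("172.16.0.0/16", "X"),  # Route A
    ("172.16.1.0/24", "Y"),  # Route B
]
```

Calling `most_specific(ROUTES, "172.16.1.4")` picks Route B via Y, while `most_specific(ROUTES, "172.16.2.4")` can only match Route A via X.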

This helps explain an interesting thing that can happen in Azure routing. If you create a generic route for the 0.0.0.0/0 destination it will only impact routing to destinations outside of the virtual network – assuming you are using the private address spaces in your VNet. The subnets have system routes for the 3 private address spaces which will be more specific than 0.0.0.0/0:

  A. 192.168.0.0/16
  B. 172.16.0.0/12
  C. 10.0.0.0/8
  D. 0.0.0.0/0

If your VNet address space is 10.1.0.0/16 and you are trying to send a packet from subnet 1 (10.1.1.0/24) to subnet 2 (10.1.2.0/24), then the generic Route D will always be less specific than the system route, Route C.

Route Tables

A route table resource allows us to manage the routing of a subnet. Good practice is that if you need to manage routing then:

  • Create a route table for the subnet
  • Name the route table after the VNet/subnet
  • Only use a route table with 1 subnet

The first thing to know about route tables is that you can control BGP propagation with them. This is especially useful when:

  • You have peered virtual networks using a hub gateway
  • You want to control how packets get to that gateway and the destination.

The default is that BGP propagation is allowed over a peering connection to the spoke. In the route table (Settings > Configuration) you can disable this propagation so the BGP routes are never copied from the hub network (with the VNet gateway) to the peered spoke VNet’s subnets.

The second thing about route tables is that they allow us to create user-defined routes (UDRs).

User-Defined Routes

You can control the flow of packets using user-defined routes. Note that UDRs outrank BGP routes and System Routes:

  1. UDR
  2. BGP routes
  3. System routes

If I have a system or BGP route to get to 192.168.1.0/24 via some unwanted path, I can add a UDR to 192.168.1.0/24 via the desired path. If the two routes are identical destination matches, then my UDR will be active and the BGP/system route will be deactivated.
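Putting specificity and priority together, here is a minimal Python sketch of the selection logic as I have described it: the longest prefix wins, and a tie on the same prefix is broken by UDR, then BGP, then System. The `Route` class, the next-hop labels, and the sample routes are hypothetical; this is a model of the behaviour, not Azure's implementation.

```python
import ipaddress
from dataclasses import dataclass

# Lower number wins when two routes have the same prefix.
SOURCE_RANK = {"UDR": 0, "BGP": 1, "System": 2}


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str    # "UDR", "BGP" or "System"
    next_hop: str  # free-text label, for illustration only


def select_route(routes, destination):
    """Longest prefix match first; ties on prefix are broken by route source."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, SOURCE_RANK[r.source]),
    )


ROUTES = [
    Route("10.0.0.0/8", "System", "VirtualNetwork"),
    Route("0.0.0.0/0", "System", "Internet"),
    Route("0.0.0.0/0", "UDR", "firewall"),
    Route("192.168.1.0/24", "BGP", "gateway"),
    Route("192.168.1.0/24", "UDR", "firewall"),
]
```

In this model, traffic to 192.168.1.10 follows the UDR rather than the BGP route, Internet-bound traffic follows the 0.0.0.0/0 UDR, and traffic to 10.1.2.4 still uses the more specific system route even though a 0.0.0.0/0 UDR exists.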

Troubleshooting Tools

The traditional tool you might have used is TRACERT. I’m sorry, it has some use, but it’s really not much more than PING. In the software defined world, the default gateway isn’t a device with a hop, the peering connection doesn’t have a hop, and TRACERT is not as useful as it would have been on-premises.

The first thing you need is the above knowledge. That really helps with everything else.

Next, make sure that your NSGs aren’t the problem instead of your routing!

Next is the NIC, if you are dealing with virtual machines. Go to Effective Routes and look at what is listed, what is active and what is not.

Network Watcher has a couple of tools you should also look at:

  • Next Hop: This is a pretty simple tool that tells you the next “appliance” that will process packets on the journey to your destination, based on the actual routing discovered.
  • Connection Troubleshoot: You can send a packet from a source (VM NIC or Application Gateway) to a certain destination. The results will map the path taken and the result.

The tools won’t tell you why a routing plan failed, but with the above information, you can troubleshoot a (desired) network path.

Locking Down Network Access to the Azure Application Gateway/Firewall

In this post, I will explain how you can use a Network Security Group (NSG) to completely lock down network access to the subnet that contains an Azure Web Application Gateway (WAG)/Web Application Firewall (WAF).

The steps are as follows:

  1. Deploy a WAG/WAF to a dedicated subnet.
  2. Create a Network Security Group (NSG) for the subnet.
  3. Associate the NSG with the subnet.
  4. Create an inbound rule to allow TCP 65503-65534 from the Internet service tag to the CIDR address of the WAG/WAF subnet.
  5. Create rules to allow application traffic, such as TCP 443 or TCP 80, from your sources to the CIDR address of the WAG/WAF subnet.
  6. Create a low priority (4000) rule to allow any protocol/port from the AzureLoadBalancer service tag to the CIDR address of the WAG/WAF subnet.
  7. Create a rule, with the lowest priority (4096) to Deny All from Any source.

The Scenario

It is easy to stand up a WAG/WAF in Azure and get it up and running. But in the real world, you should lock down network access. In the world of Azure, all network security begins with an NSG. When you deploy WAG/WAF in the real world, you should create an NSG for the WAG/WAF subnet and restrict the traffic to that subnet to what is just required for:

  • Health monitoring of the WAG/WAF
  • Application access from the authorised sources
  • Load balancing of the WAG/WAF instances

Everything else inbound will be blocked.

The NSG

Good NSG practice is as follows:

  1. Tiers of services are placed into their own subnet. Good news – the WAG/WAF requires a dedicated subnet.
  2. You should create an NSG just for the subnet – name the NSG after the VNet-Subnet, and maybe add a prefix or suffix of NSG to the name.

Health Monitoring

Azure will need to communicate with the WAG/WAF to determine the health of the backends – I know that this sounds weird, but it is what it is.

Note: You can view the health of your backend pool by opening the WAG/WAF and browsing to Monitoring > Backend Health. Each backend pool member will be listed here. If you have configured the NSG correctly then the pool member status should be “Healthy”, assuming that they are actually healthy. Otherwise, you will get a warning saying:

Unable to retrieve health status data. Check presence of NSG/UDR blocking access to ports 65503-65534 from Internet to Application Gateway.

OK – so you need to open those ports from “Internet”. Two questions arise:

  • Is this secure? Yes – Microsoft states here that these ports “are protected (locked down) by Azure certificates. Without proper certificates, external entities, including the customers of those gateways, will not be able to initiate any changes on those endpoints”.
  • What if my WAG/WAF is internal and does not have a public IP address? You will still do this – remember that “Internet” is everything outside the virtual network and peered virtual networks. Azure will communicate with the WAG/WAF via the Azure fabric and you need to allow this communication that comes from an external source.

In my example, my WAF subnet CIDR is 10.0.2.4/24:

Application Traffic

Next, I need to allow application traffic. Remember that the NSG operates at the TCP/UDP level and has no idea of URLs – that’s the job of the WAG/WAF. I will use the NSG to define what TCP ports I am allowing into the WAG/WAF (such as TCP 443) and from what sources.

In my example, the WAF is for internal usage. Clients will connect to applications over a VPN/ExpressRoute connection. Here is a sample rule:

If this was an Internet-facing WAG or WAF, then the source service tag would be Internet. If other services in Azure need to connect to this WAG or WAF, then I would allow traffic from either Virtual Network or specific source CIDRs/addresses.

The Azure Load Balancer

To be honest, this one caught me out until I reasoned what the cause was. My next rule will deny all other traffic to the WAG/WAF subnet. Without this load balancer rule, the client could not connect to the WAG/WAF. That puzzled me, and searches led me nowhere useful. And then I realized:

  • A WAG/WAF is 1+ instances (2+ in v2), each consuming IP addresses in the subnet.
  • They are presented to clients as a single IP.
  • That single IP must be a load balancer
  • That load balancer needs to probe the load balancer’s own backend pool – which are the instance(s) of the WAG/WAF in this case

You might ask: isn’t there a default rule to allow a load balancer probe? Yes, it has priority 65001. But we will be putting in a rule at 4096 to prevent all connections, overriding the 65000 rule that allows everything from VirtualNetwork – which includes all subnets in the virtual network and all peered virtual networks.

The rule is simple enough:

Deny Everything Else

Now we will override the default NSG rules that allow all communications to the subnet from other subnets in the same VNet or peered VNets. This rule should have the lowest possible user-defined priority, which is 4096:

Why am I using the lowest possible priority? This is classic good firewall rule practice. General rules should be low priority, and specific rules should be high priority. The more general, the lower. The more specific, the higher. The most general rule we have in firewalls is “block everything we don’t allow”; in other words, we are creating a white list of exceptions with the previously mentioned rules.
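To see how the priorities interact, here is a small Python model of the rule set. It is a sketch under stated assumptions: service tags are treated as plain labels attached to the source, the subnet 10.0.2.0/24, the on-premises range 192.168.0.0/16, the rule names, and priorities 100 and 200 are made up for illustration, and only the 4000 and 4096 priorities come from the steps in this post.

```python
import ipaddress
from dataclasses import dataclass

WAF_SUBNET = ipaddress.ip_network("10.0.2.0/24")  # assumed example subnet
ANY_PORT = range(0, 65536)


@dataclass(frozen=True)
class Rule:
    priority: int
    name: str
    source: str   # service tag, CIDR, or "*" (tags are plain labels in this model)
    ports: range  # destination ports
    allow: bool


RULES = [
    Rule(100, "AllowHealthMonitoring", "Internet", range(65503, 65535), True),
    Rule(200, "AllowHttpsFromOnPrem", "192.168.0.0/16", range(443, 444), True),
    Rule(4000, "AllowAzureLoadBalancer", "AzureLoadBalancer", ANY_PORT, True),
    Rule(4096, "DenyAllInbound", "*", ANY_PORT, False),
]


def source_matches(rule_source, source_tag, source_ip):
    """True if the rule's source is '*', the packet's tag, or a CIDR containing the IP."""
    if rule_source == "*" or rule_source == source_tag:
        return True
    try:
        return ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule_source)
    except ValueError:
        return False  # rule_source was a tag, not a CIDR


def evaluate(source_tag, source_ip, dest_ip, dest_port):
    """Return the name of the first matching rule, lowest priority number first."""
    if ipaddress.ip_address(dest_ip) not in WAF_SUBNET:
        return None
    for rule in sorted(RULES, key=lambda r: r.priority):
        if source_matches(rule.source, source_tag, source_ip) and dest_port in rule.ports:
            return rule.name
    return None
```

The specific allow rules are evaluated first because they have lower priority numbers, and anything that falls through them, including traffic from other subnets in the VNet, lands on the 4096 deny rule before the default NSG rules are ever reached.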

The Results

You should end up with:

  • The health monitoring rule will allow Azure to check your WAG/WAF over a certificate-secured channel.
  • Your application rules will permit specified clients to connect to the WAG/WAF, via a hidden load balancer.
  • The load balancer can probe the WAG/WAF and forward client connections.
  • The low priority deny rule will block all other communications.

Job done!

Why Choose the Azure Firewall over a Virtual Firewall Appliance?

In this post, I will explain why you should choose Azure Firewall over third-party firewall network virtual appliances (NVAs) from the likes of Cisco, Palo Alto, Check Point, and so on.

Microsoft’s Opinion

Microsoft has a partner-friendly line on Azure Firewall versus third-parties. Microsoft says that third-party solutions offer more than Azure Firewall. If you want you can use them side-by-side.

Now that’s out of the way, let me be blunt … like I’d be anything else! 😊

The NVA Promise

At their base, a firewall blocks or allows TCP/UDP/etc and does NAT. Some firewalls offer a “security bundle” of extra features such as:

  • Malware scanning based on network patterns
  • Download scanning, including zero-days (detonation chamber)
  • Browser URL logging & filtering

But those cool things either make no sense in Azure or are just not available from the NVA vendors in their cloud appliances. So what you are left with is central logging and filtering.

Documentation

With the exception of Palo Alto (their whitepaper for Azure is very good – not perfect) and maybe Check Point, the vendors have pretty awful documentation. I’ve been reading a certain data centre mainstay’s documents this week and they are incomplete and rubbish.

Understanding of Azure

It’s quite clear that some of the vendors are clueless about The Cloud and/or Azure. Every single vendor has written docs about deploying everything into a single VNet – if you can afford NVAs then you are not putting all your VMs into a single VNet (see hub & spoke VNet peering). Some have never heard of availability zones – if you can afford NVAs then you want as high an SLA as you can get. Most do not offer scale-out (active/active clusters) – so a single VM becomes your bottleneck on VM performance (3000 Mbps in a D3_v2). Some don’t even support highly available firewall clusters – so a single VM becomes the single point of failure in your entire cloud network! And their lack of documentation or understanding of VNet peering or route tables in a large cloud deployment is laughable.

The Comparison

So, what I’m getting at is that the third-party NVAs suck. Azure Firewall isn’t perfect either, but it’s a true cloud platform service and it is improving fast – just last night Microsoft announced Threat Intelligence-Based Filtering and Service Tags Filtering (this appeared recently). I know more things are on the way too 😊

Here is my breakdown of how Azure Firewall stacks up against firewall NVAs:

           | Azure Firewall             | NVA
Deployment | Platform                   | Linux VM + Software
Licensing  | Consumption: instance + GB | Linux VM + Software
Scaling    | Automatic                  | Add VMs + Software
Ownership  | Set & monitor              | Manage VM / OS / Software
Layer-7    | Logging & filtering        | Potentially* deep inspection
Networking | 1 subnet & PIP             | 1+ subnets & 1 PIP
Complexity | Simple                     | Difficult

I know: you laugh when you hear “Microsoft” and “Firewall” in the same sentence. You think of ISA Server. Azure Firewall is different. This is baked into the fabric of Azure, the strategic future of Microsoft. It is already rapidly improving, and it does more than the third parties.

Heck, what does the third-party offer compared to NSGs? NSGs filter TCP/UDP, they can log to a storage account, you can centrally log using Event Hubs, and you can do advanced reporting/analysis using NSG Flow Logs with Azure Monitor Logs (Log Analytics). Azure Firewall takes that another step with a hub deployment, an understanding of HTTP/S, and is now using machine learning for dynamic threat prevention!

My Opinion

Some people will always prefer a non-Microsoft firewall. But my counter would be, what are you getting that is superior – really? With Azure Firewall, I create a firewall, set my rules, configure my logging, and I’m done. Azure Firewall scales and it is highly available. Logging can be done to storage accounts, event hubs (SIEM), and Azure Monitor Logs. And here’s the best bit … it is SIMPLE to deploy and there is almost no cost of ownership. Compare that to some of the HACK solutions from the NVA vendors and you’d laugh.

The Azure Firewall was designed for The Cloud. It was designed for the way that Azure works. And it was designed for how we should use The Cloud … at scale. And that scale isn’t just about Mbps, but in terms of backend services and networks. From what I have seen so far, the same cannot be said for firewall NVAs. For me, the decision is easy: Azure Firewall. Every time.

Azure-to-Azure Site Recovery Fails – Connection Cannot Be Established

In this post, I’ll explain how to fix the following errors when you attempt to replicate an Azure virtual machine from one Azure Region to another:

Error 151072: Connection cannot be established to Azure Site Recovery service endpoints.

And:

Error 539: The requested action couldn’t be performed by the ‘A2A’ Replication Provider.

The Cause

A2ASR (the abbreviation of the ASR service for Azure VMs) uses an extension (guest OS agent) called the Mobility Service to replicate disk contents from a source virtual machine to a target (secondary) region (or DR site). The Mobility Service uses the networking of the virtual machine to talk to the ASR endpoints in the secondary region. That traffic therefore goes over the NIC and virtual network of the VM, and then to the target region via the Azure backbone.

If you have restricted outbound traffic for your virtual machines, then you might have blocked this traffic with one of the following:

  • Third-party firewall appliances
  • Network Security Groups (NSGs), as I documented here

The Fix

Whoops! Don’t worry, you’ve already created exceptions to allow your virtual machine to boot up. You can create more exceptions to allow the virtual machines to talk to the ASR endpoints (see the screenshot below). Let’s imagine that I am replicating from North Europe to West Europe.

 

image

I’ll need at least one set of rules, enabling outbound traffic from my VNet/NICs in the source region, North Europe, to the two IP addresses of the target region, West Europe.

I will also have to enable inbound traffic from my target region, West Europe, to my source region, North Europe. Why? Isn’t all my traffic going from North Europe to West Europe? That’s true – now. But if you fail over to West Europe, you will need to reverse replication afterwards, so you might as well get things right now.

A Script

It all looks messy at first. It probably isn’t too bad. But if you’d like to deploy a canned script to update NSGs, you can. Microsoft has shared a script that you can run. You will need a few pieces of information:

  • NSG name
  • NSG resource group name
  • Subscription ID
  • Source region
  • Target region

Run the script (it will prompt you to log in) with the source-to-target details. Then run it again against the NSG(s) in the DR site with the details reversed, treating the target as the source and vice versa.
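
To make the "reverse the details" step concrete, here is a minimal Python sketch of the two runs. This is not Microsoft’s script and it does not call Azure; it only models the five pieces of information listed above. The class name, the helper, and every value (NSG names, resource groups, the all-zero subscription ID) are hypothetical placeholders of mine.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NsgUpdateRun:
    """The five inputs the script asks for (field names are my own)."""
    nsg_name: str
    nsg_resource_group: str
    subscription_id: str
    source_region: str
    target_region: str


def reverse_run(run, dr_nsg_name, dr_nsg_resource_group):
    """Build the second run: point at the DR site's NSG and swap the regions."""
    return replace(
        run,
        nsg_name=dr_nsg_name,
        nsg_resource_group=dr_nsg_resource_group,
        source_region=run.target_region,
        target_region=run.source_region,
    )


# Hypothetical values for the North Europe -> West Europe example.
primary = NsgUpdateRun(
    nsg_name="nsg-prod-ne",
    nsg_resource_group="rg-network-ne",
    subscription_id="00000000-0000-0000-0000-000000000000",
    source_region="North Europe",
    target_region="West Europe",
)

# Second run: the DR NSG, with source and target swapped.
dr = reverse_run(primary, "nsg-dr-we", "rg-network-we")

for run in (primary, dr):
    print(f"{run.nsg_name}: {run.source_region} -> {run.target_region}")
```

Writing both runs down like this before you start is a cheap way to avoid updating the wrong NSG with the wrong region pair.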

Where Are the Service Tags?

Storage accounts and Azure SQL have service tags, but ASR does not. I believe that ASR should have service tags to avoid all of this IP messiness. If you agree, vote here, or forever stay quiet on the subject.

Was This Kind of Information Useful?

If you found this information useful, then imagine what 2 days of training might mean to you. I’m delivering a 2-day course in Amsterdam on April 19-20, teaching newbies and experienced Azure admins about Azure Infrastructure. There’ll be lots of in-depth information, covering the foundations, best practices, troubleshooting, and advanced configurations. You can learn more here.

Azure VMs–Block Outbound Traffic to the Internet (Updated)

In theory, it was possible to deny all outbound traffic to the Internet from an Azure VM. In theory, I can also place a loaded gun to my head, but my doctor disapproves of that.

Here’s what would happen:

  • You created an outbound rule to Deny all traffic to a service tag (location) called Internet.
  • The VM worked fine … for a while.
  • The VM was rebooted, maybe for a guest OS patch cycle.
  • The VM would not reboot.
  • Your boss screamed at you, if you were lucky.

The problem is that Azure included all Azure services under the service tag of “Internet”. And Azure VMs need to talk to Azure to boot up – to be specific, they need to talk to Azure Storage if the IaaSDiagnostics (Azure Performance Diagnostics) extension is configured. If a VM can’t talk to that storage account, the VM will fail to boot. There was a scripted workaround, but it was far from pretty.

Recently, Microsoft made Network Security Group service tags generally available. Service tags take those old locations and expand them to more than just Virtual Network, Load Balancer (probe), and Internet. Now you can specify Azure storage (storage account) and Azure SQL services, globally and locally (a specific region).

image

So, for example, I can let a VM connect to (Azure) Storage globally or only in West Europe, or connect to Azure SQL in North Europe. Now we can block outbound access to the Internet, but still allow access to Azure storage in the same region for diagnostics & metrics.
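
Why does this work? NSG rules are processed in priority order (lowest number first) and processing stops at the first match. Here is a minimal Python sketch that simulates that behaviour; it is a toy model, not an Azure API. The rule names and priorities are hypothetical, and I am assuming that storage traffic matches both the Storage.WestEurope tag and the broader Internet tag. The default NSG rules are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int          # lower number = evaluated first
    destination_tag: str   # service tag used as the destination
    access: str            # "Allow" or "Deny"


def evaluate(rules, matching_tags):
    """Return the access of the first rule (by ascending priority) whose
    destination tag covers the traffic, or None if no rule matches."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.destination_tag in matching_tags:
            return rule.access
    return None  # default NSG rules are not modelled in this sketch


# Hypothetical outbound rules: allow regional storage, then deny the Internet.
rules = [
    Rule("DenyInternet", 200, "Internet", "Deny"),
    Rule("AllowStorageWestEurope", 100, "Storage.WestEurope", "Allow"),
]

# Assumption: regional storage traffic matches both tags; a public site only Internet.
storage_traffic = {"Storage.WestEurope", "Internet"}
web_traffic = {"Internet"}

print("Storage in West Europe:", evaluate(rules, storage_traffic))
print("Public website:", evaluate(rules, web_traffic))

# The old situation: only the deny rule exists, so storage is caught by it.
print("Storage, deny rule only:", evaluate(rules[:1], storage_traffic))
```

The point is the ordering: the allow rule for storage needs a lower priority number than the deny rule for Internet, otherwise you are back to a VM that cannot reach its storage account.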

image

I’ve tested, and yes, my VM rebooted 🙂
