Appliance | Aidan Finn, IT Pro

Management

Let’s say you use Firewall X for your on-premises network(s). You have two things:

A skillset

A management tool

Maybe you want to re-use those? Let’s talk about that reasoning.

You have developed skills over the years to manage and troubleshoot Firewall X – well done! And now you want to bring those skills to Azure. At first, that seems logical. But what if I told you that there was an alternative that had the same functionality as (if not more than) Firewall X, scaled better than Firewall X, and was so easy that I could teach you to fully use it in 15 minutes? Hmm. Those years of skills don’t really make much sense now, do they?

Centralized management – I’ll give you some credit here. Azure Firewall does not have this right now. If I have 4 Azure Firewalls spread around the globe, I do not have 1 management experience. I have identical configuration experiences, but the global configurations have to be replicated – you could script that or use JSON templates. That’s not the same as using a GUI and saying “push this rule to the following 4 firewalls”. But let me ask you this: is this one feature genuinely a business reason to choose a third-party that has an unstable design and limited performance, high availability (if it even has it) or scale-out (most don’t even have this)?

Client VPN

Now we’re talking about something I can genuinely agree with – to a point. Azure sucks at end-user VPN. Azure’s approach is that you should be changing the user experience to using HTTPS (TLS) connectivity to web apps or Citrix/RDS gateways. But time and again, I do encounter customers who want/need VPN. Windows Server mysteriously does not support any of its user connectivity in Azure. And the Azure VPN Gateway has a limited and unsatisfying user VPN experience. So if you want to use a modern “SSL” VPN client with a third-party firewall, I can understand that. BUT, I would limit that appliance to that role. I just cannot stand the mess to get HA working with some of the third party NVAs (if they bother documenting) and the near-absence of scale-out for performance. I would still use Azure Firewall for the firewall 😊

Brand

I’ve done a good bit of reading. So far the only brand of third-party NVA that I would consider myself for an edge/central firewall deployment is Palo Alto – but I’d rather use Azure Firewall over it anyway! All of the third-party solutions are compromised in some way:

Don’t do active-active clustering (scale-out)

Don’t even offer HA!

Have hack solutions (“we’ll edit your route tables for you”) for failover that you know will do more damage than an outage

Their documentation pure stinks

This post will explain how routing works in Microsoft Azure, and how to troubleshoot your routing issues with Route Tables, BGP, and User-Defined Routes in your virtual network (VNet) subnets and virtual (firewall) appliances/Azure Firewall.

Software-Defined Networking

Right now, you need to forget VLANs, and how routers, bridges, routing switches, and all that crap works in the physical network. Some theory is good, but the practice … that dies here.

Azure networking is software-defined (VXLAN). When a VM sends a packet out to the network, the Azure Fabric takes over as soon as the packet hits the virtual NIC. That same concept extends to any virtual network-capable Azure service. From your point of view, a memory copy happens from source NIC to destination NIC. Yes; under the covers there is an Azure backbone with a “more physical” implementation but that is irrelevant because you have no influence over it.

So always keep this in mind: network transport in Azure is basically a memory copy. We can, however, influence the routing of that memory copy by adding hops to it.

Understand the Basics

When you create a VNet, it will have 1 or more subnets. By default, each subnet will have system routes. The first ones are simple, and I’ll make it even more simple:

Route directly via the default gateway to the destination if it’s in the same supernet, e.g. 10.0.0.0/8
Route directly to Internet if it’s in 0.0.0.0/0

By the way, the only way to see system routes is to open a NIC in the subnet, and click Effective Routes under Support & Troubleshooting. I have asked that this is revealed in a subnet – not all VNet-connected services have NICs!

And also, by the way, you cannot ping the subnet default gateway because it is not an appliance; it is a software-defined function that is there to keep the guest OS sane … and probably for us too 😊

When you peer a VNet with another VNet, you do a few things, including:

Instructing VXLAN to extend the plumbing between the peered VNets
Extending the “VirtualNetwork” NSG rule security tag to include the peered neighbour
Creating a new system route for peering.

The result is that VMs in VNet1 will send packets directly to VMs in VNet2 as if they were in the same VNet.

When you create a VNet gateway (let’s leave BGP for later) and create a local network connection, you create another (set of) system routes for the virtual network gateway. The local address space(s) will be added as destinations that are tunnelled via the gateway. The result is that packets to/from the on-prem network will route directly through the gateway … even across a peered connection if you have set up the hub/spoke peering connections correctly.

Let’s add BGP to the mix. If I enable ExpressRoute or a BGP-VPN, then my on-prem network will advertise routes to my gateway. These routes will be added to my existing subnets in the gateway’s VNet. The result is that the VNet is told to route to those advertised destinations via the gateway (VPN or ExpressRoute).

If I have peered the gateway’s VNet with other VNets, the default behaviour is that the BGP routes will propagate out. That means that the peered VNets learn about the on-premises destinations that have been advertised to the gateway, and thus know to route to those destinations via the gateway.

And let’s stop there for a moment.

Route Priority

We now have 2 kinds of route in play – there will be a third. Let’s say there is a system route for 172.16.0.0/16 that routes to virtual network. In other words, just “find the destination in this VNet”. Now, let’s say BGP advertises a route from on-premises through the gateway that is also for 172.16.0.0/16.

We have two routes for the 172.16.0.0/16 destination:

System
BGP

Azure looks at routes that clash like above and deactivates one of them. Azure always ranks BGP above System. So, in our case, the System route for 172.16.0.0/16 will be deactivated and no longer used. The BGP route for 172.16.0.0/16 via the VNet gateway will remain active and will be used.

Specificity

Try saying that word 5 times in a row after 5 drinks!

The most specific route will be chosen. In other words, the route with the best match for your destination is selected by the Azure fabric. Let’s say that I have two active routes:

Route A: 172.16.0.0/16 via X
Route B: 172.16.1.0/24 via Y

Now, let’s say that I want to send a packet to 172.16.1.4. Which route will be chosen? Route A is a 16 bit match (172.16.*.*). Route B is a 24 bit match (172.16.1.*). Route B is a closer match so it is chosen.

Now add a scenario where you want to send a packet to 172.16.2.4. At this point, the only match is Route A. Route B is not a match at all.
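Here is a minimal Python sketch of that "most specific route wins" logic, using the standard `ipaddress` module. It only illustrates longest-prefix matching; it is not how the Azure fabric is implemented. The `ROUTES` list and the `most_specific_route` function are names I made up for the example.

```python
import ipaddress

# Hypothetical route list: (label, destination prefix, next hop).
ROUTES = [
    ("A", "172.16.0.0/16", "X"),
    ("B", "172.16.1.0/24", "Y"),
]


def most_specific_route(destination, routes):
    """Return the matching route with the longest prefix, or None if nothing matches."""
    address = ipaddress.ip_address(destination)
    # Keep only the routes whose prefix contains the destination address.
    matches = [
        route for route in routes
        if address in ipaddress.ip_network(route[1])
    ]
    if not matches:
        return None
    # The longest prefix length is the closest (most specific) match.
    return max(matches, key=lambda route: ipaddress.ip_network(route[1]).prefixlen)


print(most_specific_route("172.16.1.4", ROUTES))  # both routes match, B is closer
print(most_specific_route("172.16.2.4", ROUTES))  # only A matches
```

A destination that matches no route at all, such as 10.0.0.1 here, returns `None` in this sketch.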

This helps explain an interesting thing that can happen in Azure routing. If you create a generic rule for the 0.0.0.0/0 destination it will only impact routing to destinations outside of the virtual network – assuming you are using the private address spaces in your VNet. The subnets have system routes for the 3 private address spaces which will be more specific than 0.0.0.0/0:

Route A: 192.168.0.0/16
Route B: 172.16.0.0/12
Route C: 10.0.0.0/8
Route D: 0.0.0.0/0

If your VNet address space is 10.1.0.0/16 and you are trying to send a packet from subnet 1 (10.1.1.0/24) to subnet 2 (10.1.2.0/24), then the generic Route D will always be less specific than the system route, Route C.
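The same idea can be checked in a few lines of Python. This sketch follows the simplified model used in this post (one system route per private address space plus a generic 0.0.0.0/0 route); the `SUBNET_ROUTES` table and the `pick_route` function are hypothetical names for illustration, not anything Azure exposes.

```python
import ipaddress

# Simplified routes from the list above, keyed by route label.
SUBNET_ROUTES = {
    "A": "192.168.0.0/16",
    "B": "172.16.0.0/12",
    "C": "10.0.0.0/8",
    "D": "0.0.0.0/0",
}


def pick_route(destination, routes):
    """Return the label of the most specific route that contains the destination."""
    address = ipaddress.ip_address(destination)
    # Parse each prefix once and keep only the routes that match.
    candidates = {}
    for label, prefix in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            candidates[label] = network
    # 0.0.0.0/0 matches every IPv4 address, so there is always a candidate.
    return max(candidates, key=lambda label: candidates[label].prefixlen)


print(pick_route("10.1.2.4", SUBNET_ROUTES))  # traffic to subnet 2 stays on Route C
print(pick_route("8.8.8.8", SUBNET_ROUTES))   # only the generic Route D matches
```

A packet to 10.1.2.4 matches both Route C and Route D, but the /8 beats the /0, so the generic route never gets a look-in for traffic inside the VNet.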

Route Tables

A route table resource allows us to manage the routing of a subnet. Good practice is that if you need to manage routing then:

Create a route table for the subnet
Name the route table after the VNet/subnet
Only use a route table with 1 subnet

The first thing to know about route tables is that you can control BGP propagation with them. This is especially useful when:

You have peered virtual networks using a hub gateway
You want to control how packets get to that gateway and the destination.

The default is that BGP propagation is allowed over a peering connection to the spoke. In the route table (Settings > Configuration) you can disable this propagation so the BGP routes are never copied from the hub network (with the VNet gateway) to the peered spoke VNet’s subnets.

The second thing about route tables is that they allow us to create user-defined routes (UDRs).

User-Defined Routes

You can control the flow of packets using user-defined routes. Note that UDRs outrank BGP routes and System Routes:

UDR
BGP routes
System routes

If I have a system or BGP route to get to 192.168.1.0/24 via some unwanted path, I can add a UDR to 192.168.1.0/24 via the desired path. If the two routes are identical destination matches, then my UDR will be active and the BGP/system route will be deactivated.
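To tie specificity and route priority together, here is a rough Python sketch of how I think about the decision: pick the most specific matching prefix first, and only when two routes have an identical prefix does the source (UDR, then BGP, then System) break the tie. The route dictionaries, the next hop names, and the `active_route` function are all assumptions for the example.

```python
import ipaddress

# Lower rank wins when two routes have an identical destination prefix.
SOURCE_RANK = {"UDR": 0, "BGP": 1, "System": 2}


def active_route(destination, routes):
    """Pick the longest matching prefix; break exact ties by route source."""
    address = ipaddress.ip_address(destination)
    matches = [
        route for route in routes
        if address in ipaddress.ip_network(route["prefix"])
    ]
    if not matches:
        return None
    return min(
        matches,
        key=lambda route: (
            -ipaddress.ip_network(route["prefix"]).prefixlen,  # longest prefix first
            SOURCE_RANK[route["source"]],                      # then UDR > BGP > System
        ),
    )


# Hypothetical routes seen by one subnet.
routes = [
    {"prefix": "192.168.1.0/24", "source": "BGP", "next_hop": "unwanted-path"},
    {"prefix": "192.168.1.0/24", "source": "UDR", "next_hop": "desired-path"},
    {"prefix": "192.168.0.0/16", "source": "System", "next_hop": "vnet"},
]

print(active_route("192.168.1.10", routes))  # the UDR beats the identical BGP route
print(active_route("192.168.2.10", routes))  # only the /16 system route matches
```

The thing to notice is that the UDR only wins against the BGP route because the prefixes are identical. A less specific UDR would not override a more specific BGP or system route.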

Troubleshooting Tools

The traditional tool you might have used is TRACERT. I’m sorry, it has some use, but it’s really not much more than PING. In the software defined world, the default gateway isn’t a device with a hop, the peering connection doesn’t have a hop, and TRACERT is not as useful as it would have been on-premises.

The first thing you need is the above knowledge. That really helps with everything else.

Next, make sure the problem really is your routing, and not your NSGs!

Next is the NIC, if you are dealing with virtual machines. Go to Effective Routes and look at what is listed, what is active and what is not.

Network Watcher has a couple of tools you should also look at:

Next Hop: This is a pretty simple tool that tells you the next “appliance” that will process packets on the journey to your destination, based on the actual routing discovered.
Connection Troubleshoot: You can send a packet from a source (VM NIC or Application Gateway) to a certain destination. The results will map the path taken and the result.

The tools won’t tell you why a routing plan failed, but with the above information, you can troubleshoot a (desired) network path.


Reasons To Use A Third Party Firewall In Azure