Default Outbound Access For VMs In Azure Will Be Retired

Microsoft has announced that default outbound access, the implicit public IP address that Azure VMs use to reach the Internet, will be retired on 30 September 2025.

Background

Let’s define “Internet” for the purposes of this post. The Internet includes:

  • The actual Internet.
  • Azure services, such as Azure SQL or Azure’s KMS for Windows VMs, that are shared with a public endpoint (IP address).

We have had ways to access those services, including:

  • Public IP address associated with a NIC of the virtual machine
  • Load Balancer with a public IP address with the virtual machine being a backend
  • A NAT Gateway
  • An appliance, such as a firewall NVA or Azure Firewall, being defined as the next hop to Internet prefixes, such as 0.0.0.0/0

If a virtual machine is deployed without having any of the above, it still needs to reach the Internet to do things like:

  • Activate a Windows license against KMS
  • Download packages for Ubuntu
  • Use Azure services such as Key Vault, Azure SQL, or storage accounts (diagnostics settings)

For that reason, all Azure virtual machines are able to reach the Internet using an implied public IP address. This is an address that is randomly assigned to SNAT the connection out from the virtual machine to the Internet. That address:

  • Is random and can change
  • Offers no control or security

Modern Threats

There are two things that we should have been designing networks to stop for years:

  • Malware command and control
  • Data exfiltration

The modern hack is a clever and gradual process. Ransomware is not some dumb bot that gets onto your network and goes wild. Some of the recent variants are manually controlled. The malware gets onto the network and attempts to call home to a “machine” on the Internet. From there, the controllers can explore the network and plan their attack. This is the command and control. This attempt to “call home” should be blocked by network/security designs that block outbound access to the Internet by default, opening only connections that are required for workloads to function.

The controller will discover more vulnerabilities and download more software, taking further advantage of vulnerable network/security designs. Backups are targeted for attack first, data is stolen, and systems are crippled and encrypted.

The data theft, or exfiltration, is to an IP address that a modern network/security design would block.

So you can see that a network design that relies on an implied public IP address is not good practice. This is a primary consideration for Microsoft in making its decision to end the future use of implied public IP addresses.

What Is Happening?

On 30 September 2025, newly created virtual machines will no longer be able to use an implied public IP address. Existing virtual machines will be unaffected – but I want to drill into that because it’s not as simple as one might think.

A virtual machine is a resource in Azure. It’s not some disks. It’s not your concept of “I have something called X” that is a virtual machine. It’s a resource that exists. At some point, that resource might be removed. At that point, the virtual machine no longer exists, even if you recreate it with the exact same disks and name.

So keep in mind:

  • Virtual networks with existing VMs: The existing VMs are unaffected, but new VMs in the VNet will be affected and won’t have outbound connectivity.
  • Scale-out: Let’s say you have a big workload with dozens of VMs with no public IP usage. You add more VMs and they don’t work – it’s because they don’t have an implied IP address, unlike their older siblings.
  • Restore from backup: You restore a VM to create a new VM. The new VM will not have an implied public IP address.

Is This a Money Grab?

No, this is not a money grab. This is an attempt by Microsoft to correct a “wrong” in the original design – one that was intended to be helpful to cloud newcomers. Some of the mitigations are quite low-cost, even for small businesses. To be honest, what money could be made here is pennies compared to the much bigger money that is made elsewhere by Azure.

The goal here is to:

  • Be secure by default by controlling egress traffic to limit command & control and data exfiltration.
  • Provide more control over egress flows by selecting the appliance/IP address that is used.
  • Enable more visibility over public IP addresses, for example, what public address should I share with a partner for their firewall rules?
  • Drive better networking and security architectures by default.

What Is Your Mitigation?

There are several paths that you can choose.

  1. Assign a public IP address to a virtual machine: This is the lowest cost option but offers no egress security. It can get quite messy if multiple virtual machines require public IP addresses. Rate this as “better than nothing”.
  2. Use a NAT Gateway: This allows a single IP address (or a range from an Azure Public IP Address Prefix) to be shared across an entire subnet – see the sketch after this list. Note that NAT Gateway gets messy if you span availability zones, requiring disruptive VNet and workload redesign. Again, this is not a security option.
  3. Use a next hop: You can use an appliance (virtual machine or Marketplace network virtual appliance) or the Azure Firewall as a next hop to the Internet (0.0.0.0/0) or specific Internet IP prefixes. This is a security option – a firewall can block unwanted egress traffic. If you are budget-conscious, then consider Azure Firewall Basic. No matter what firewall/appliance you choose, there will be some subnet/VNet redesign and changes required to routing, which could affect VNet-integrated PaaS services such as API Management Premium.
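
As an illustration of option 2, here is a minimal Bicep sketch – the resource names and address ranges are placeholders of my own – that deploys a NAT Gateway with a static public IP address and attaches it to a subnet:

param location string = resourceGroup().location

// Static public IP address that the NAT Gateway will SNAT through
resource natPip 'Microsoft.Network/publicIPAddresses@2023-05-01' = {
  name: 'pip-natgw'
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    publicIPAllocationMethod: 'Static'
  }
}

resource natGw 'Microsoft.Network/natGateways@2023-05-01' = {
  name: 'natgw-workload'
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    publicIpAddresses: [
      {
        id: natPip.id
      }
    ]
  }
}

// The subnet opts in by referencing the NAT Gateway
resource vnet 'Microsoft.Network/virtualNetworks@2023-05-01' = {
  name: 'vnet-workload'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.10.0.0/16'
      ]
    }
    subnets: [
      {
        name: 'subnet-vms'
        properties: {
          addressPrefix: '10.10.1.0/24'
          natGateway: {
            id: natGw.id
          }
        }
      }
    ]
  }
}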

September 2025 is a long time away. But you have options to consider and potentially some network redesign work to do. Don’t sit around – start working.

In Summary

The implied route to the Internet for Azure VMs will stop being available to new VMs on September 30th, 2025. This is not a money grab – you can choose low-cost options to mitigate the effects if you wish. The hope is that you opt to choose better security, either from Microsoft or a partner. The deadline is a long time away. Do not assume that you are not affected – one day you will expand services or restore a VM from backup and be affected. So get started on your research & planning.

Azure Infrastructure Announcements – August 2023

This post brings you a summary of the infrastructure announcements from Azure that were made during August 2023. There are lots of announcements from Storage and a few interesting notes for VMs, networking, and ASR.

Storage

Azure Managed Lustre: not your grandparents’ parallel file system

With a few clicks of a web interface or an Azure Resource Manager template, AMLFS lets you provision an all-flash Lustre file system in minutes. What’s different is that this Lustre file system is all yours. If someone else in Azure is running a job that creates a million files, you won’t ever know it because your Lustre servers and SSDs are exclusively yours.

Massively scaled, high-performance file systems for HPC workloads.

General availability | Azure NetApp Files: SMB Continuous Availability (CA) shares

To enhance resiliency during storage service maintenance operations, SMB volumes used by Citrix App Layering, FSLogix user profile containers, and Microsoft SQL Server on Microsoft Windows Server can be enabled with Continuous Availability.

SMB Transparent Failover means that clients should not notice maintenance operations.

Public preview: Azure Storage Mover support for SMB and Azure Files

Storage Mover is a fully managed migration service that enables you to migrate on-premises files and folders to Azure Storage while minimizing downtime for your workload. Azure Storage Mover can now migrate your SMB shares to Azure file shares.

To be honest, I’ve not encountered a “replace the file server with Azure Files” scenario yet. Third-party vendors often won’t support it for LOB apps. User data typically ends up in SharePoint/OneDrive. And wouldn’t most Citrix/RDS admins want to start with new profiles?

Generally available: Azure Blob Storage Cold Tier

Azure Blob Storage Cold Tier is now generally available. It is a new online access tier that is the most cost-effective Azure Blob offering for storing infrequently accessed data with long-term retention requirements, while providing instant access. The pricing of the cold tier storage option lies between the cool and archive tiers, and it follows a 90-day early deletion policy. You can seamlessly utilize the cold tier in the same way as the hot and cool tiers.

Cool – Cold. Tell me that isn’t confusing. The scenario is that you want to store data for a long time, but you need it immediately available. Archive requires a 15-hour restore (“rehydration”) that can be accelerated with a charge. Cold is one step up, but not as cost-effective.

Public Preview: Azure NetApp Files Cloud Backup for Virtual Machines

With Cloud Backup for Virtual Machines, you can now create VM consistent snapshot backups of VMs on Azure NetApp Files datastores. The associated virtual appliance installs in the Azure VMware Solution cluster and provides policy-based automated and consistent backup of VMs, integrated with Azure NetApp Files snapshot technology for fast backups and restores of VMs, groups of VMs (organized in resource groups), or complete datastores, lowering RTO and RPO and improving total cost of ownership.

General Availability: Incremental snapshots for Premium SSD v2 Disk and Ultra Disk Storage

You can now instantly restore Premium SSD v2 and Ultra Disks from snapshots and attach them to a running VM without waiting for any background copy of data. This new capability allows you to read and write data on disks immediately after creation from snapshots, enabling you to recover your data from accidental deletes or a disaster quickly.

I can see third-party backup making use of this.

Azure Elastic SAN updates: Private Endpoints & Shared Volumes

As we approach general availability of Azure Elastic SAN, we continue improving the service and adding features based on your feedback. Today, we are releasing private endpoint support and volume sharing support via SCSI (Small Computer System Interface) Persistent Reservation.

This sounds like the sort of feature maturity one will expect as the service approaches general availability. I wonder what the actual target market is for this service.

Azure Site Recovery

Private Preview – DR for Shared Disks – Azure Site Recovery

We are excited to announce the Private Preview of DR for Azure Shared Disks for workloads running Windows Server Failover Clusters (WSFC) on Azure VMs. Now you can protect, monitor, and recover your WSFC-clusters as a single unit across its DR Lifecycle, while also generating cluster-consistent recovery points – which are consistent across all the disks (including the Shared Disk) of the cluster.

This feature is long overdue for customers using shared virtual hard disks to create failover clusters.

Networking

Public preview: Support for new custom error pages in Application Gateway

In addition to the response codes 403 and 502, the Azure Application Gateway now lets you configure company-branded error pages for more response codes – 400, 405, 408, 500, 503, and 504. You can configure these error pages at a global level to apply to all the listeners on your gateway or individually for each listener. 

These pages can be hosted on any publicly accessible URI.

Azure Firewall: New Monitoring and Logging Updates

Notes:

  • (Preview) With the Azure Firewall Resource Health check, you can now view the health status of your Azure Firewall and address service problems that may affect your Azure Firewall resource. Resource Health allows IT teams to receive proactive notifications regarding potential health degradations and recommended mitigation actions for each health event type.
  • (Preview) The Azure Firewall Workbook presents a dynamic platform for analyzing Azure Firewall data. Within the Azure portal, you can utilize it to generate visually engaging reports.
  • (GA) The Latency Probe metric is designed to measure the overall latency of Azure Firewall and provide insight into the health of the service. IT administrators can use the metric for monitoring and alerting if there is observable latency and diagnosing if the Azure Firewall is the cause of latency in a network.

Resource health should make for a useful alert, especially when enabling DevSecOps – be aware of the dreaded “out of sync” error. I just tried the workbook in a production system – I noticed a couple of things that I might not have otherwise noticed because they didn’t trigger a human response (yet). The latency probe is interesting – I think it originated from customer network performance scenarios where it was suspected that the firewall was the root cause.

Virtual Machines

Public preview: Azure Mv3 Medium Memory (MM) Virtual Machines

Today we are announcing the public preview of the next generation Mv3 Medium Memory (MM) virtual machine series. Powered by the 4th Generation Intel® Xeon® Scalable Processor and DDR5 DRAM technology, the Mv3 medium memory (MM) virtual machines can scale for SAP workloads from 250GB to 4TB. With Azure Boost, Mv3 MM provides a ~25% improvement in network throughput and up to 1.5X improvement in remote storage throughput over the previous M-series families. 

These machines start at 12 vCPUs and 240 GB RAM, scaling up to 176 vCPUs and 2,794 GB RAM. That should just about be enough to run Teams.

Azure Firewall Basic – For Small/Medium Business & “Branch”

Microsoft has just announced a lower cost SKU of Azure Firewall, Basic, that is aimed at small/medium business but could also play a role in “branch office” deployments in Microsoft Azure.

Standard & Premium

Azure Firewall launched with a Standard SKU several years ago. The Standard SKU offered a lot of features, but some things deemed necessary for security were missing: IDPS and TLS Inspection were top of the list. Microsoft added a Premium SKU that added those features as well as fuller web category inspection and URL filtering (not just FQDN).

However, some customers didn’t adopt Azure Firewall because of the price. A lot of those customers were small-medium businesses (SMBs). Another scenario is a “branch office” in an Azure region – a smaller footprint that is closer to clients and isn’t a main deployment.

Launching The Basic SKU

Microsoft has been working on a lower cost SKU for quite a while. The biggest challenge, I think, that they faced was trying to figure out how to balance features, performance, and availability with price. They know that the target market has a finite budget, but there are necessary feature requirements. Every customer is different, so I guess when faced with this conundrum, one needs to satisfy the needs of 80% of customers.

The clues for a new SKU have been publicly visible for quite a while – the ARM reference for Azure Firewall documented that a Basic SKU existed somewhere in Azure (in private preview). Tonight, Microsoft launched the Basic SKU in public preview. A longer blog post adds some details.

Introducing the Azure Firewall

The primary target market for the Basic SKU hasn’t deployed a firewall appliance of any kind in Azure – if they are in Azure then they are most likely only using NSGs for security, which operate only at the transport protocol (TCP, UDP, ICMP) layer in a decentralised way.

The Azure Firewall is a firewall appliance, allowing centralised control. It should be deployed with NSGs and resource firewalls for layered protection, and where there is a zero-trust configuration (deny all by default) in all directions, even inside of a workload.

The Azure Firewall is native to Microsoft Azure – you don’t need a third-party license or support contract. It is fully deployable and configurable as code (ARM, Bicep, Terraform, Pulumi, etc), making it ideal for DevSecOps. Azure Firewall is much easier to learn than NVAs because the firewall is easily available through an Azure subscription and the training (Microsoft Learn) is publicly available – not hidden behind classic training paywalls. Thanks to the community and a platform model, I expect that more people are learning Azure Firewall than any other kind of firewall today – skills are in short supply, so using native tech that is easy to learn, and that many are learning, just makes sense.

Comparing Azure Firewall Basic With Standard and Premium

Microsoft helpfully put together a table to compare the 3 SKUs:

Comparing Azure Firewall Basic with Standard and Premium

Another difference with the Basic SKU is that you must deploy the AzureFirewallManagementSubnet in addition to the AzureFirewallSubnet – this additional subnet is often associated with forced tunneling. The result is that the firewall will have a second public IP address that is used only for management tasks.
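
To illustrate, here is a hedged Bicep sketch of a Basic SKU firewall with that second, management-only IP configuration – the names are placeholders, and the subnets, public IP addresses, and policy are assumed to already exist:

param location string = resourceGroup().location
param firewallSubnetId string   // AzureFirewallSubnet resource ID
param mgmtSubnetId string       // AzureFirewallManagementSubnet resource ID
param dataPipId string          // public IP for data-plane traffic
param mgmtPipId string          // public IP used only for management tasks
param policyId string           // Basic-tier Azure Firewall Policy resource ID

resource firewall 'Microsoft.Network/azureFirewalls@2023-05-01' = {
  name: 'fw-basic'
  location: location
  properties: {
    sku: {
      name: 'AZFW_VNet'
      tier: 'Basic'
    }
    firewallPolicy: {
      id: policyId
    }
    ipConfigurations: [
      {
        name: 'fwIpConfig'
        properties: {
          subnet: {
            id: firewallSubnetId
          }
          publicIPAddress: {
            id: dataPipId
          }
        }
      }
    ]
    // The Basic SKU requires this second IP configuration in
    // AzureFirewallManagementSubnet for management traffic
    managementIpConfiguration: {
      name: 'fwMgmtIpConfig'
      properties: {
        subnet: {
          id: mgmtSubnetId
        }
        publicIPAddress: {
          id: mgmtPipId
        }
      }
    }
  }
}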

Pricing

The Basic SKU follows the same price model as the higher SKUs: a base compute cost and a data processing cost. The pricing shared here is for the preview, so it is subject to change.

The Basic SKU base compute (deployment) cost is €300.03 per month in West Europe. That’s less than 1/3 of the cost of the Standard SKU at €947.54 per month. The data processing cost for the Basic SKU is higher at €0.068 per GB. However, the amount of data passing through such a firewall deployment will be much lower so it probably will not be a huge add-on.

Preview Deployment Error

At this time, the Basic SKU is in preview. You must enable the preview in your subscription. If you do not do this, your deployment will fail with this error:

"code": "FirewallPolicyMissingRequiredFeatureAllowBasic",

"message": "Subscription 'someGuid' is missing required feature 'Microsoft.Network/AzureFirewallBasic' for Basic policies."

Some Interesting Notes

I’ve not had a chance to do much work with the Basic SKU – work is pretty crazy lately. But here are a few things to note:

  • A hub & spoke deployment is still recommended, even for SMBs.
  • Availability zones are supported for higher availability.
  • You are forced to use Azure Firewall Manager/Azure Firewall Policy – this is a good thing because newer features are only in the new management plane.

Final Thoughts

The new SKU of Azure Firewall should add new customers to this service. I also expect that larger enterprises will be interested – not every deployment needs the full-blown Standard/Premium deployment, but some form of firewall is still required.

Azure Firewall DevSecOps in Azure DevOps

In this post, I will share the details for granting the least-privilege permissions to GitHub action/DevOps pipeline service principals for a DevSecOps continuous deployment of Azure Firewall.

Quick Refresh

I wrote about the design of the solution and shared the code in my post, Enabling DevSecOps with Azure Firewall. There I explained how you could break out the code for the rules of a workload and manage that code in the repo for the workload. Realistically, you would also need to break out the gateway subnet route table user-defined route (legacy VNet-based hub) and the VNet peering connection. All the code for this is shared on GitHub – I did update the repo with some structure and with working DevOps pipelines.

This Update

There were two things I wanted to add to the design:

  • Detailed permissions for the service principal used by the workload DevOps pipeline, limiting the scope of change that is possible in the hub.
  • DevOps pipelines so I could test the above.

The Code

You’ll find 3 folders in the Bicep code now:

  • hub: This deploys a (legacy) VNet-based hub with Azure Firewall.
  • customRoles: 4 Azure custom roles are defined. This should be deployed after the hub.
  • spoke1: This contains the code to deploy a skeleton VNet-based (spoke) workload with updates that are required in the hub to connect the VNet and route ingress on-prem traffic through the firewall.

DevOps Pipelines

The hub and spoke1 folders each contain a folder called .pipelines. There you will find a .yml file to create a DevOps pipeline.

The DevOps pipeline uses Azure CLI tasks to:

  • Select the correct Azure subscription & create the resource group
  • Deploy each .bicep file.

My design uses 1 sub for the hub and 1 sub for the workload. You are not glued to this, but you would need to make modifications to how you configure the service principal permissions (below).

To use the code:

  1. Create one repo in DevOps for the hub and one repo for spoke1, and copy in the required code.
  2. Create service principals in Azure AD.
  3. Grant the service principal for hub owner rights to the hub subscription.
  4. Grant the service principal for the spoke owner rights to the spoke subscription.
  5. Create ARM service connections in DevOps settings that use the service principals. Note that the names for these service connections are referred to by azureServiceConnection in the pipeline files.
  6. Update the variables in the pipeline files with subscription IDs.
  7. Create the pipelines using the .yml files in the repos.

Don’t do anything just yet!

Service Principal Permissions

The hub service principal is simple – grant it owner rights to the hub subscription (or resource group).

The workload is where the magic happens with this DevSecOps design. The workload updates the hub using code in the workload repo that affects the workload:

  • Ingress route from on-prem to the workload in the hub GatewaySubnet.
  • The firewall rules for the workload in the hub Azure Firewall (policy) using a rules collection group.
  • The VNet peering connection between the hub VNet and the workload VNet.

That could be deployed by the workload DevOps pipeline that is authenticated using the workload’s service principal. So that means the workload service principal must have rights over the hub.

The quick solution would be to grant contributor rights over the hub and say “we’ll manage what is done through code reviews”. However, a better practice is to limit what can be done as much as possible. That’s what I have done with the customRoles folder in my GitHub share.

Those custom roles should be modified to change the possible scope to the subscription ID (or even the resource group ID) of the hub deployment. There are 4 custom roles:

  • customRole-ArmValidateActionOperator.json: Adds the CUSTOM – ARM Deployment Operator role, allowing the ARM deployment to be monitored and updated.
  • customRole-PeeringAdmin.json: Adds the CUSTOM – Virtual Network Peering Administrator role, allowing a VNet peering connection to be created from the hub VNet.
  • customRole-RoutesAdmin.json: Adds the CUSTOM – Azure Route Table Routes Administrator role, allowing a route to be added to the GatewaySubnet route table.
  • customRole-RuleCollectionGroupsAdmin.json: Adds the CUSTOM – Azure Firewall Policy Rule Collection Group Administrator role, allowing a rules collection group to be added to an Azure Firewall Policy.
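
To give a flavour of the shape of these roles, here is the routes administrator from the list above, sketched in Bicep rather than the JSON in the repo – the exact actions in my share may differ slightly:

targetScope = 'subscription'

// Scope the role to the hub subscription (or narrow assignableScopes further)
resource routesAdmin 'Microsoft.Authorization/roleDefinitions@2022-04-01' = {
  name: guid(subscription().id, 'routes-admin')  // role definition names must be GUIDs
  properties: {
    roleName: 'CUSTOM - Azure Route Table Routes Administrator'
    description: 'Can manage routes in a route table without managing the table itself.'
    type: 'CustomRole'
    permissions: [
      {
        actions: [
          'Microsoft.Network/routeTables/read'
          'Microsoft.Network/routeTables/routes/read'
          'Microsoft.Network/routeTables/routes/write'
          'Microsoft.Network/routeTables/routes/delete'
        ]
        notActions: []
      }
    ]
    assignableScopes: [
      subscription().id
    ]
  }
}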

Deploy The Hub

The hub is deployed first – this is required to grant the permissions that are required by the workload’s service principal.

Grant Rights To Workload Service Principals

The service principals for all workloads will be added to an Azure AD group (Workloads Pipeline Service Principals in the above diagram). That group is nested into 4 other AAD security groups:

  • Resource Group ARM Operations: This is granted the CUSTOM – ARM Deployment Operator role on the hub resource group.
  • Hub Firewall Policy: This is granted the CUSTOM – Azure Firewall Policy Rule Collection Group Administrator role on the Azure Firewall Policy that is associated with the hub Azure Firewall.
  • Hub Routes: This is granted the CUSTOM – Azure Route Table Routes Administrator role on the GatewaySubnet route table.
  • Hub Peering: This is granted the CUSTOM – Virtual Network Peering Administrator role on the hub virtual network.

Deploy The Workload

The workload now has the required permissions to deploy the workload and make the modifications in the hub that connect the workload to the outside world.

Enabling DevSecOps with Azure Firewall

In this post, I will share how you can implement DevSecOps with Azure Firewall, with links to a bunch of working Bicep files to deploy the infrastructure-as-code (IaC) templates.

This example uses a “legacy” hub and spoke – one where the hub is VNet-based and not based on Azure Virtual WAN Hub. I’ll try to find some time to work on the code for that one.

The Concept

Hold on, because there’s a bunch of things to understand!

DevSecOps

The DevSecOps methodology is more than just IaC. It’s a combination of people, processes, and technology to enable a fail-fast agile delivery of workloads/applications to the business. I discussed here how DevSecOps can be used to remove the friction of IT to deliver on the promises of the Cloud.

The Azure features that this design is based on are discussed in concept here. The idea is that we want to enable Devs/Ops/Security to manage firewall rules in the workload’s Git repository (repo). This breaks the traditional model where the rules are located in a central location. The important thing is not the location of the rules, but the processes that manage the rules (change control through Git repo pull request reviews) and who reviews them (the architects, firewall admins, security admins, etc).

So what we are doing is taking the firewall rules for the workload and placing them in with the workload’s code. NSG rules are probably already there. Now, we’re putting the Azure Firewall rules for the workload in the workload repo too. This is all made possible thanks to changes that were made to Azure Firewall Policy (Azure Firewall Manager) Rules Collection Groups – I use one Rules Collection Group for each workload and all the rules that enable that workload are placed in that Rules Collection Group. No changes will make it to the trunk branch (deployment action/pipelines look for changes here to trigger a deployment) without approval by all the necessary parties – this means that the firewall admins are still in control, but they don’t necessarily need to write the rules themselves … and the devs/operators might even write the rules, subject to review!

This is the killer reason to choose Azure Firewall over NVAs – the ability to not only deploy the firewall resource, but to manage the entire configuration and rule sets as code, and to break that all out in a controlled way to make the enterprise more agile.

Other Bits

If you’ve read my posts on Azure routing (How to Troubleshoot Azure Routing? and BGP with Microsoft Azure Virtual Networks & Firewalls) then you’ll understand that there’s more going on than just firewall rules. Packets won’t magically flow through your firewall just because it’s in the middle of your diagram!

The spoke or workload will also need to deploy:

  • A peering connection to the hub, enabling connectivity with the hub and the firewall. All traffic leaving the spoke will route through the firewall thanks to a user-defined route in the spoke subnet route table. Peering is a two-way connection. The workload will include some Bicep to deploy the spoke-hub and the hub-spoke connections.
  • A route for the GatewaySubnet route table in the hub. This is required to route traffic to the spoke address prefix(es) through the Azure Firewall so on-premises>spoke traffic is correctly inspected and filtered by the firewall.
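
As a sketch of that second item – the route table name, spoke prefix, and firewall IP address below are placeholders – the spoke’s Bicep adds something like this to the hub:

// The existing route table associated with the hub's GatewaySubnet
resource gwRouteTable 'Microsoft.Network/routeTables@2023-05-01' existing = {
  name: 'rt-gatewaysubnet'
}

// Send on-premises>spoke traffic through the firewall
resource spokeRoute 'Microsoft.Network/routeTables/routes@2023-05-01' = {
  parent: gwRouteTable
  name: 'route-spoke1'
  properties: {
    addressPrefix: '10.20.0.0/16'    // the spoke address prefix
    nextHopType: 'VirtualAppliance'
    nextHopIpAddress: '10.0.1.4'     // the firewall's private IP address
  }
}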

The IaC

In this section, I’ll explain the code layout and placement.

My Code

You can find my public repo, containing all the Bicep code here. Please feel free to download and use.

The Git Repo Design

You will have two Git repos:

  1. The first repo is for the hub. This repo will contain the code for the hub, including:
    • The hub VNet.
    • The Hub VNet Gateway.
    • The GatewaySubnet Route Table.
    • The Azure Firewall.
    • The Azure Firewall Policy that manages the Azure Firewall.
  2. The second repo is for the spoke. This skeleton example workload contains the spoke VNet, the two halves of the VNet peering connection, and the route that is added to the hub’s GatewaySubnet route table.

Action/Pipeline Permissions

I have written a more detailed update on this section, which can be found here

Each Git repo needs to authenticate with Azure to deploy/modify resources. Each repo should have a service principal in Azure AD. That service principal will be used to authenticate the deployment, executed by a GitHub action or a DevOps pipeline. You should restrict what rights the service principal will require. I haven’t worked out the exact minimum permissions, but the high-level requirements are covered in the detailed update linked above.

Trunk Branch Protection & Pull Requests

Some of you might be worried now – what’s to stop a developer/operator working on Workload A from accidentally creating rules that affect Workload X?

This is exactly why you implement standard practices on the Git repos:

  • Protect the Trunk branch: This means that no one can just update the version of the code that is deployed to your firewall or hub. If you want to create an update, you have to create a branch of the trunk, make your edits in that branch, and submit the changes to be merged into the trunk as a pull request.
  • Enable pull request reviews: Select a panel of people that will review changes that are submitted as pull requests to the trunk. In our scenario, this should include the firewall admin(s), security admin(s), network admin(s), and maybe the platform & workload architects.

Now, I can only submit a suggested set of rules (and route/peering) changes that must be approved by the necessary people. I can still create my code without delay, but a change control and rollback process has taken control. Obviously, this means that there should be SLAs on the review/approval process and guidance on pull request, approval, and rejection actions.

And There You Have It

Now you have the design and the Bicep code to enable DevSecOps with Azure Firewall.

Testing Azure Firewall IDPS

In this post, I will show you how to test IDPS in Azure Firewall Premium, including test exploits and how to search the logs for alerts.

Azure Firewall Setup

You are going to need a few things:

  • Ideally a hub and spoke deployment of some kind, with virtual machines in two different spokes. My lab is Azure Virtual WAN, using a VNet as the “compromised on-premises” site and a second VNet as the target.
  • Azure Firewall Premium SKU with logging enabled to a Log Analytics Workspace.
  • Azure Firewall Policy Premium SKU, with IDPS enabled for Alert & Deny.

Make sure that you have firewall rules and NSG rules open to allow your “attacks” – the point of IDPS is to spot and stop attacks that travel over legitimate protocols/ports.

Compromised On-Premises Machine

One can use Kali Linux from the Azure Marketplace but I prefer to work in Windows. So I deployed a Windows Server VM and downloaded/deployed Metasploit Opensource, which is installed into C:\metasploit-framework.

The console that you’ll use to run the commands is C:\metasploit-framework\bin\msfconsole.bat.

If you want to try something simpler, then all you will need is the normal Windows Command Prompt.

The Exploit Test

If you are using Metasploit, in the console, run the following to search for “coldfusion” tests:

search coldfusion

Select a test:

use auxiliary/scanner/http/coldfusion_locale_traversal

Set the RHOST (remote host to target) option:

set RHOST <IP address to target>

Verify that all required options are set:

show options

Execute the test:

run

Otherwise, for a simpler test, you can run the following curl command in the Windows Command Prompt to make a web request to your target IP using the well-known BlackSun user agent:

curl -A "BlackSun" <IP address to target>

Check Your Logs

It can take a little time for data to appear in your logs. Give it a few minutes and then run this query in Log Analytics:

AzureDiagnostics
| where ResourceType == "AZUREFIREWALLS"
| where OperationName == "AzureFirewallIDSLog"
| parse msg_s with Protocol " request from " SourceIP ":" SourcePort " to " TargetIP ":" TargetPort ". Action: " Action ". Signature: " Signature ". IDS: " Reason
| project TimeGenerated, Protocol, SourceIP, SourcePort, TargetIP, TargetPort, Action, Signature, Reason
| sort by TimeGenerated

That should highlight anything that IDPS alerted on & denied – and can also be useful for creating incidents in Azure Sentinel.

Service Definitions in Azure Firewall

In this post, I will discuss how you can use Rules Collection Groups in Azure Firewall to aggregate your Rules Collections and Rules to be aligned with a service definition or workload definition.

Workload and Service Definitions

In an organised environment, every workload (or service) has a definition. That definition describes all of the components that make up and facilitate the workload. That includes your firewall rules.

So it would make sense if you had a way of grouping firewall rules together if those rules are used to make a workload possible. For example, if I was running an Azure Virtual Desktop pool, I might treat that pool as a workload, document it as a workload, and want all of the rules that make that pool possible to be grouped together and managed as a unit.

Challenges

The challenges I have faced with Azure Firewall and aligning rules along the model of a workload definition have been:

Realisation & Conceptualisation

I’ve always used some kind of workload-based approach, but my approach wasn’t perfect. I decided to try to align rules with NSGs, placing inbound rules with the workload that was the destination. But then some workloads, such as Windows Admin Center or ADDS, require more reach, and things get messy. And what if you have a dozen or more workloads that are atomic units but are also extremely integrated? Where do the rules go?

I’ve come to realise that rules should go with the service that they empower, regardless of destination. What made me think like that? Documentation of workloads for a forced-ad-hoc migration project that I’ve been working on for the last year. We didn’t get the chance to assess and plan in detail, so everything was an on-the-fly discovery. Our method of “place the rule with the target workload” has created very complicated ARM templates for Azure Firewall; defining a workload from a firewall perspective is very hard with that approach.

If we said that “all network rules that empower a workload go with the workload’s network Rules Collection” then things would get a bit better – a bit.

Rules Collection Groups Limitations

It is a year since Firewall Policy became generally available. Just like last year, I had some hours to experiment on Azure Firewall. I tried out the new tier, Rules Collection Groups. A reminder:

  1. Rules > Rules Collections (typed, based on DNAT, Network, or Application)
  2. Rules Collections > Rules Collection Groups

But at the time, the new tech was immature. Mixing Rules Collection types between Rules Collection Groups was a disaster. The advice I got from the product group was “don’t do it, it’s not ready, stick with the default groups for now”.

So I did just that. That means that the DNAT Rules Collection, the Network Rules Collection, and the Application Rules Collection for any workload were split into 3 deployments, the default Rules Collection Groups for:

  • DNAT
  • Network
  • Application

Each of those deployments could have dozens or hundreds of rules collections for each workload. And when you combine that with my previous approach to rules placement:

  1. It’s a mess
  2. Deploying a rule change required re-deploying all Rules Collections of a type for all workloads using that type of Rules Collection.

A Better Approach

I have had some time to play and things are better.

Rules Alignment With Workload

I’ve discussed this already – place rules that empower a workload with the workload, not the destination workload.

If Workload X requires TCP 445 access to Workload Y, then I will create a Network Rule to Workload Y on TCP 445 in a Network Rules Collection for Workload X. The result is that all rules that make Workload X function will be in rules collections for Workload X. That makes documentation easier and makes the next step work.

Rules Collection Groups For Workload

This is the big change in Azure from this time last year. I can now create lots of Rules Collection Groups, each with a priority (for processing order). I will create 1 Rules Collection Group per workload. Workload X will get 1 Rules Collection Group.

All rules that make Workload X go into Rules Collection Groups for Workload X. I might have, depending on rules requirements, up to 6 Rules Collection Groups:

  • DNAT-Allow-WorkloadX
  • Network-Allow-WorkloadX
  • Application-Allow-WorkloadX
  • DNAT-Deny-WorkloadX
  • Network-Deny-WorkloadX
  • Application-Deny-WorkloadX

The Rules Collection Group is its own Deployment from an ARM perspective. If I’m managing the firewall as code (I do) then I can have 1 template (or parameters file) that defines the Rules Collection Group (and contained Rules Collections and Rules) for the entire workload and just the workload. Each workload will have its own template or parameter file. A change to a workload definition will affect 1 file only, and require 1 deployment only.
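
Here is a minimal Bicep sketch of one such per-workload Rules Collection Group – the workload names, priorities, and address prefixes are placeholders of my own:

// The existing Azure Firewall Policy in the hub
resource fwPolicy 'Microsoft.Network/firewallPolicies@2023-05-01' existing = {
  name: 'fwp-hub'
}

resource workloadXNetworkAllow 'Microsoft.Network/firewallPolicies/ruleCollectionGroups@2023-05-01' = {
  parent: fwPolicy
  name: 'Network-Allow-WorkloadX'
  properties: {
    priority: 1100
    ruleCollections: [
      {
        ruleCollectionType: 'FirewallPolicyFilterRuleCollection'
        name: 'NetworkAllowWorkloadX'
        priority: 100
        action: {
          type: 'Allow'
        }
        rules: [
          {
            ruleType: 'NetworkRule'
            name: 'WorkloadX-To-WorkloadY-SMB'
            ipProtocols: [
              'TCP'
            ]
            sourceAddresses: [
              '10.20.1.0/24'    // Workload X subnet
            ]
            destinationAddresses: [
              '10.30.1.0/24'    // Workload Y subnet
            ]
            destinationPorts: [
              '445'
            ]
          }
        ]
      }
    ]
  }
}

Each workload gets its own file like this, so a change to Workload X touches only Workload X’s deployment.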

If you want to see an ARM template for deploying one of the workloads in my screenshot, then head on over to my GitHub.

Wrap Up

This approach should leave the firewall much better organised, easier to manage in smaller chunks if using infrastructure-as-code, easier to document, and more suitable for organisations that like to create/maintain service definitions.

Defending Against Supply Chain Attacks

In this post, I will discuss the concepts of supply chain attacks and some thoughts around defending against them.

What Is A Supply Chain Attack?

The recent ransomware attack through Kaseya made the news but the concept of a supply chain attack isn’t new at all. Without doing any research I can think of two other examples:

  • SolarWinds: In December 2020, attackers used compromised code in SolarWinds monitoring solutions to compromise customers of SolarWinds.
  • RSA: In 2011, the Chinese PLA (or hackers sponsored by them) compromised RSA and used that access to attack customers of RSA.

What is a supply chain attack? It’s pretty hard to break into a network, especially one that has hardened itself. Users can be educated – ok, some will never be educated! Networks can be hardened and micro-segmented. Identity protections such as MFA and threat detection can be put in place.

But there remains a weakness – or several of them. There’s always a way into a network – the third party! Even the most secure network deployments require some kind of monitoring system – something where a piece of software is deployed onto “every VM”.  Or there’s some software vendor that’s deep into your network that has openings all over the place. Those are your threats. If an attacker compromises the software from one of those vendors then they will get into your network during your next update and they will use the existing firewall holes & permissions that are required by the software to probe, spread, and attack.

Protection

You still need to have your first lines of defense, ideally using tools that are designed for protection against advanced persistent threats – not your regular AV package, dummy:

  1. Identity
  2. Email
  3. Firewall
  4. Backup with isolated offline storage protected by MFA

That’s a start, but a supply chain attack bypasses all that by using existing channels to enter your network as if it is from a trusted source – because the attack is embedded in the code from a trusted source.

Micro-Segmentation

The first step should be micro-segmentation (AKA multi-segmentation). No two nodes on your network should be able to communicate unless:

  1. They have to
  2. They are restricted to the required directions, protocols, and ports.
  3. That traffic passes through a firewall – and ideally several firewalls.

In Microsoft Azure, that means using:

  • A central firewall, in the form of a network firewall and/or web application firewall (Azure or NVA). This firewall controls connections between the outside world and your workloads, between your workloads, and importantly from your workloads to the outside world (prevents malware from talking to its human controller).
  • Network Security Groups at the subnet level that protect the subnet and even isolate nodes inside the subnet (use a custom Deny All rule – sketched after this list – because the default Deny All rule is useless when you understand the logic of how it works).
  • Resource firewalls – that’s the guest OS firewall and Azure resource firewalls.
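
For the NSG point above, a minimal Bicep sketch of a custom deny-all inbound rule – the NSG name and priority are placeholders; the priority just needs to be weaker (higher number) than your explicit allow rules:

resource nsg 'Microsoft.Network/networkSecurityGroups@2023-05-01' existing = {
  name: 'nsg-subnet-workload'
}

// Custom catch-all deny, processed after the explicit allow rules
resource denyAllInbound 'Microsoft.Network/networkSecurityGroups/securityRules@2023-05-01' = {
  parent: nsg
  name: 'DenyAllInbound'
  properties: {
    priority: 4000
    direction: 'Inbound'
    access: 'Deny'
    protocol: '*'
    sourceAddressPrefix: '*'
    sourcePortRange: '*'
    destinationAddressPrefix: '*'
    destinationPortRange: '*'
  }
}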

If you have a Windows ADDS domain, use Group Policy to force the use of Windows Firewall – lazy admins and those very same vendors that will be the channel of attack will be the first to attempt to disable the firewall on machines that they are working on.

For Azure resources, consider the use of Azure Policy to force/audit the use of the firewalls in your resources and a default route to 0.0.0.0/0 via your central firewall.

An infrastructure-as-code approach to the central firewall (Azure Firewall) and NSGs brings documentation, change control, and rollback to network security.

Security Monitoring

This is where most organisations fail, and even where IT security officers really don’t get it.

Syslog is not security monitoring. Your AV is not security monitoring. You need something bigger, that is automated, and can filter through the noise – I regularly use the term “be your Neo to read the Matrix”. That’s because even in a small network, there is a lot of noise. Something needs to filter through that noise and identify the threats.

For example, there’s a lot of TCP 445 connection attempts coming from one IP address. Or there are lots of failed attempts to sign in as a user from one IP address. Or there are lots of failed connections logged by NSG rules. Or even better – all of the above. These are the sorts of things that malware that is attempting to spread will do. This is the sort of work that Azure Sentinel is perfect for – Sentinel connects to many data sources, pulls the data into a central place where complex queries can look for threats in a way that a human never could. Threats can create incidents, incidents can trigger automated flows to eliminate the noise, and the remaining incidents can create alerts that humans will act upon.

But some malware is clever and not so noisy. The malware that hit the HSE (the Irish national health service) uses a lot of manual control to quietly spread over a very long time. Restricting outbound access to the Internet to just the required connections for business needs will cripple this control mechanism. But there’s still an automated element to this malware.

Other things to implement in Azure will include:

  • IDPS: An intrusion detection & prevention system in the firewall, for example Azure Firewall Premium. When known malware/attack flows pass through the firewall, the firewall can log an alert or alert/deny the flows.
  • Security Center: Enabling Security Center “Azure Defender” (the tier previously known as the Azure Security Center Standard) provides you with oodles of new features, including some endpoint protections that are very confusingly packaged and licensed by Microsoft.

Managed Services Providers

MSPs are a part of the supply chain for their customers. MSP staff typically have credentials that allow them into many customer networks/services. That makes the identities of those staff very valuable.

A managed service provider should be a leader in identity security process, tooling, and governance. In the Microsoft world, that means using Azure AD Premium with MFA enabled for all staff. In the Azure world, Lighthouse should be used to gain access to customers’ cloud implementations. And that access should be zero-trust, powered by Privileged Identity Management (PIM).

Oh Cr@p!

These attackers are not script kiddies. They are professional organisations with big budgets, very skilled programmers and operators, and a lot of time and will. They know that with some persistent effort targeting a vendor, they can enter a lot of networks with ease. Hitting a systems management company, or more scarily, a security vendor, reaps BIG rewards because we invest in these products to secure our entire networks. The other big worry is those vendors that are deeply embedded with certain verticals such as finance or government. Imagine a vendor that is in every branch of a national government – one successful attack could bring down that entire government after a wave of upgrades! Or hitting a well-known payment vendor could open up every bank in the EU.

Enable FQDN-Based Network Rules In Azure Firewall

In this post, I will discuss how the DNS features, DNS Servers and DNS Proxy, can be used to enable FQDN-based rules in the Azure Firewall.

Can’t Azure Firewall Already Do FQDN-based Rules?

Yes – and no. One of the rule types is Application Rules, which control outbound access to services based on an FQDN (a DNS name) for HTTP/S and SQL (including SQL Server, Azure SQL, etc) services. But this feature is not much use if:

  • You have some service in one of your VNets that needs to make an outbound connection on TCP 25 to something like smtp.office365.com.
  • You need to allow an inbound connection to an FQDN, such as a platform resource that is network-connected using Private Endpoint.

FQDN-Based Network Rules

Network rules allow you to control flows in/out from source to destinations on a particular protocol and source/destination port. Originally this was, and out of the box this is, done using IPv4 addresses/CIDRs. But what if I need to have some network-connected service reach out to smtp.office365.com to send an email? What IP address is that? Well, it’s lots of addresses:

nslookup smtp.office365.com

Non-authoritative answer:
Name: DUB-efz.ms-acdc.office.com
Addresses: 2603:1026:c02:301e::2
2603:1026:c02:2860::2
2603:1026:6:29::2
2603:1026:c02:4010::2
52.97.183.130
40.101.72.162
40.101.72.242
40.101.72.130
Aliases: smtp.office365.com
outlook.office365.com
outlook.ha.office365.com
outlook.ms-acdc.office.com

And that list of addresses probably changes all of the time – so do you want to manage that in your firewall rules and in the code/configuration of your application? It would be better to use the abstraction provided by the FQDN.

Network Rules allow you to do this now, but you must first enable DNS in the firewall.

Azure Firewall DNS

With this feature enabled, the Azure Firewall can support FQDNs in the Network Rules, opening up the possibility of using any of the supported protocol/port combinations, expanding your name-based rules beyond just HTTP/S and SQL.
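
For example, a hedged Bicep sketch of a Network Rule that allows TCP 25 to smtp.office365.com – the policy name, priorities, and source prefix are placeholders of my own:

resource fwPolicy 'Microsoft.Network/firewallPolicies@2023-05-01' existing = {
  name: 'fwp-hub'
}

resource smtpRules 'Microsoft.Network/firewallPolicies/ruleCollectionGroups@2023-05-01' = {
  parent: fwPolicy
  name: 'Network-Allow-Smtp'
  properties: {
    priority: 1200
    ruleCollections: [
      {
        ruleCollectionType: 'FirewallPolicyFilterRuleCollection'
        name: 'AllowOutboundSmtp'
        priority: 100
        action: {
          type: 'Allow'
        }
        rules: [
          {
            ruleType: 'NetworkRule'
            name: 'Allow-Smtp-Office365'
            ipProtocols: [
              'TCP'
            ]
            sourceAddresses: [
              '10.10.1.0/24'
            ]
            destinationFqdns: [
              'smtp.office365.com'    // resolved via the firewall's DNS settings
            ]
            destinationPorts: [
              '25'
            ]
          }
        ]
      }
    ]
  }
}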

By default, the Azure Firewall will use Azure DNS. That’s “OK” for traffic that will only ever be outbound and simple. But life is not normally that simple unless you host a relatively simple point solution behind your firewall. In reality:

  • You might want to let on-premises/remote locations connect to Private Endpoint-enabled PaaS services via site-to-site networking.
  • You might hit an interesting issue, which I will explain in a moment.

Before I move on, for the Private Endpoint scenario:

  1. Configure DNS servers (VMs) in your VNet configuration
  2. Configure conditional forwarders for each Private Endpoint DNS Zone to forward to Azure Private DNS via 168.63.129.16, the virtual public IP address that is used to facilitate a communication channel to Azure platform resources.
  3. Set the Azure Firewall DNS Server settings to point at these DNS servers
  4. Route traffic from your site-to-site gateway(s) to the firewall.

Split DNS Results

If two different machines attempt to resolve smtp.office365.com they will end up with different IP addresses – as you can see in the below diagram.

The result is that the client attempts to connect to smtp.office365.com on IP address A, and the firewall is permitting access on IP address B, so the connection attempt is denied.

DNS Proxy

To overcome this split DNS result (the Firewall and client getting two different resolved IP addresses for the FQDN) we can use DNS Proxy.

The implementation is actually pretty simple:

  1. Your firewall is set up to use your preferred DNS server(s).
  2. You enable DNS Proxy (a simple on/off setting).
  3. You configure your VNet/resources to use the Azure Firewall as their DNS server.

What happens now? Every time your resource attempts to resolve a DNS FQDN, it will send the request to the Azure Firewall. The Azure Firewall proxies/relays the request to your DNS server(s) and waits for the result which is cached. Now when your resource attempts to reach smtp.office365.com, it will use the IP address that the firewall has already cached. Simples!
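
On the firewall side, the whole configuration boils down to the DNS settings on the Firewall Policy – a sketch, assuming two DNS server VMs at placeholder addresses:

param location string = resourceGroup().location

resource fwPolicy 'Microsoft.Network/firewallPolicies@2023-05-01' = {
  name: 'fwp-hub'
  location: location
  properties: {
    dnsSettings: {
      servers: [
        '10.0.2.4'    // DNS server VM 1 (e.g. a domain controller)
        '10.0.2.5'    // DNS server VM 2
      ]
      enableProxy: true    // the firewall proxies DNS queries from VNet resources
    }
  }
}

The VNet-side change – setting the firewall’s private IP address as the VNet’s DNS server – is a setting on the virtual network, not on the policy.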

And yes, this works perfectly well with Active Directory Domain Controllers running as DNS servers in a VNet or spoke VNet as long as your NSG rules allow the firewall to be a DNS client.

Azure Virtual WAN ARM – The Resources

In this post, I will explain the types of resources used in Azure Virtual WAN and the nature of their relationships.

Note, I have not included any content on the recently announced preview of third-party NVAs. I have not seen any materials on this yet to base such a post on and, being honest, I don’t have any use-cases for third-party NVAs.

As you can see – there are quite a few resources involved … and some that you won’t see listed at all because of the “appliance-like” nature of the deployment. I have not included any detail on spokes or “branch offices”, which would require further resources. The below diagram is enough to get a hub operational and connected to on-premises locations and spoke virtual networks.

The Virtual WAN – Microsoft.Network/virtualWans

You need at least one Virtual WAN to be deployed. This is what the hub will connect to, and you can connect many hubs to a common Virtual WAN to get automated any-to-any connectivity across the Microsoft physical WAN.

Surprisingly, the resource is deployed to an Azure region and not as a global resource, unlike other global resources such as Traffic Manager or Azure DNS.

The Virtual Hub – Microsoft.Network/virtualHubs

Also known as the hub, the Virtual Hub is deployed once, and once only, per Azure region where you need a hub. This hub replaces the old hub virtual network (plus gateway(s), plus firewall, plus route tables) deployment you might be used to. The hub is deployed as a hidden resource, managed through the Virtual WAN in the Azure Portal or via scripting/ARM.

The hub is associated with the Virtual WAN through a virtualWAN property that references the resource ID of the virtualWans resource.
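
In Bicep, the pair looks something like this sketch – the names and the hub address prefix are placeholders:

param location string = resourceGroup().location

resource vwan 'Microsoft.Network/virtualWans@2023-05-01' = {
  name: 'vwan-prod'
  location: location    // regional, despite the global role of the resource
  properties: {
    type: 'Standard'
  }
}

resource hub 'Microsoft.Network/virtualHubs@2023-05-01' = {
  name: 'hub-westeurope'
  location: location
  properties: {
    addressPrefix: '10.100.0.0/23'    // the hub's internal address space
    virtualWan: {
      id: vwan.id    // associates the hub with the Virtual WAN
    }
  }
}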

In a previous post, I referred to a chicken & egg scenario with the virtualHubs resource. The hub has properties that point to the resource IDs of each deployed gateway:

  • vpnGateway: For site-to-site VPN.
  • expressRouteGateway: For ExpressRoute circuit connectivity.
  • p2sVpnGateway: For end-user/device tunnels.

If you choose to deploy a “Secured Virtual Hub” there will also be a property called azureFirewall that will point to the resource ID of an Azure Firewall with the AZFW_Hub SKU.

Note, the restriction of 1 hub per Azure region does introduce a bottleneck. Under the covers of the platform, there is actually a virtual network. The only clue to this network will be in the peering properties of your spoke virtual networks. A single virtual network can have, today, a maximum of 500 spokes. So that means you will have a maximum of 500 spokes per Azure region.

Routing Tables – Microsoft.Network/virtualHubs/hubRouteTables & Microsoft.Network/virtualHubs/routeTables

These are resources that are used in custom routing, a recently announced GA feature that won’t be live until August 3rd, according to the Azure Portal. These resources control the flow of traffic in your hub and spoke architecture. They are child resources of the virtualHubs resource, so no references to hub resource IDs are required.

Azure Firewall – Microsoft.Network/azureFirewalls

This is an optional resource that is deployed when you want a “Secured Virtual Hub”. Today, this is the only way to put a firewall into the hub, although a new preview program should make it possible for third-parties to join the hub. Alternatively, you can use custom routing to force north-south and east-west traffic through an NVA that is running in a spoke, although that will double peering costs.

The Azure Firewall is deployed with the AZFW_Hub SKU. The firewall is not a hidden resource. To manage the firewall, you must use an Azure Firewall Policy (aka Azure Firewall Manager). The firewall has a property called firewallPolicy that points to the resource ID of a firewallPolicies resource.

Azure Firewall Policy – Microsoft.Network/firewallPolicies

This is a resource that allows you to manage an Azure Firewall, in this case, an AZFW_Hub SKU of Azure Firewall. Although not shown here, you can deploy a parent/child configuration of policies to manage firewall configurations and rules in a global/local way.

VPN Gateway – Microsoft.Network/vpnGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides you with site-to-site connectivity using VPN. The VPN Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

Note that the virtualHubs resource must also point at the resource ID of the VPN gateway resource ID using a property called vpnGateway.

ExpressRoute Gateway – Microsoft.Network/expressRouteGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides you with site-to-site connectivity using ExpressRoute. The ExpressRoute Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

Note that the virtualHubs resource must also point at the resource ID of the ExpressRoute gateway using a property called expressRouteGateway.

Point-to-Site Gateway – Microsoft.Network/p2sVpnGateways

This is one of 3 ways (one, two or all three at once) that you can connect on-premises (branch) sites to the hub and your Azure deployment(s). This gateway provides users/devices with connectivity using VPN tunnels. The Point-to-Site Gateway uses a property called virtualHub to point at the resource ID of the associated hub or virtualHubs resource. This is a hidden resource.

The Point-to-Site Gateway inherits a VPN configuration from a VPN configuration resource based on Microsoft.Network/vpnServerConfigurations, referring to the configuration resource by its resource ID using a property called vpnServerConfiguration.

Note that the virtualHubs resource must also point at the resource ID of the Point-to-Site gateway resource ID using a property called p2sVpnGateway.

VPN Server Configuration – Microsoft.Network/vpnServerConfigurations

This configuration for Point-to-Site VPN gateways can be seen in the Azure WAN and is intended as a shared configuration that is reusable with more than one Point-to-Site VPN Gateway. To be honest, I can see myself using it as a per-region configuration because of some values like DNS servers and RADIUS servers that will probably be placed per-region for performance and resilience reasons. This is a hidden resource.

The following resources were added on 22nd July 2020:

VPN Sites – Microsoft.Network/vpnSites

This resource has a similar purpose to a Local Network Gateway for site-to-site VPN connections; it describes the on-premises location, AKA “branch office”.  A VPN site can be associated with one or many hubs, so it is actually connected to the Virtual WAN resource ID using a property called virtualWan. This is a hidden resource.

An array property called vpnSiteLinks describes possible connections to on-premises firewall devices.

VPN Connections – Microsoft.Network/vpnGateways/vpnConnections

A VPN Connections resource associates a VPN Gateway with the on-premises location that is described by an associated VPN Site. The vpnConnections resource is a child resource of vpnGateways, so there is no standalone resource; the vpnConnections resource takes its name from the parent VPN Gateway, and the resource ID is an extension of the parent VPN Gateway resource ID.

By necessity, there is some complexity with this resource type. The remoteVpnSite property links the vpnConnections resource with the resource ID of a VPN Site resource. An array property, called vpnSiteLinkConnections, is used to connect the gateway to the on-premises location using 1 or 2 connections, each linking from vpnSiteLinkConnections to the resource/property ID of 1 or 2 vpnSiteLinks properties in the VPN Site. With one site link connection, you have a single VPN tunnel to the on-premises location. With 2 link connections, the VPN Gateway will take advantage of its active/active configuration to set up resilient tunnels to the on-premises location.

Virtual Network Connections – Microsoft.Network/virtualHubs/hubVirtualNetworkConnections

The purpose of a hub is to share resources with spoke virtual networks. In the case of the Virtual Hub, those resources are gateways, and maybe a firewall in the case of Secured Virtual Hub. As with a normal VNet-based hub & spoke, VNet peering is used. However, the way that VNet peering is used changes with the Virtual Hub; the deployment is done using the hub/VirtualNetworkConnections child resource, whose parent is the Virtual Hub. Therefore, the name and resource ID are based on the name and resource ID of the Virtual Hub resource.

The deployment is rather simple; you create a Virtual Network Connection in the hub specifying the resource ID of the spoke virtual network, using a property called remoteVirtualNetwork. The underlying resource provider will initiate both sides of the peering connection on your behalf – there is no deployment required in the spoke virtual network resource. The Virtual Network Connection will reference the Hub Route Tables in the hub to configure route association and propagation.
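
A sketch of that connection in Bicep – the hub name is a placeholder, and the route table references assume the hub’s default route table:

param spokeVnetId string    // resource ID of the spoke virtual network

resource hub 'Microsoft.Network/virtualHubs@2023-05-01' existing = {
  name: 'hub-westeurope'
}

resource spokeConnection 'Microsoft.Network/virtualHubs/hubVirtualNetworkConnections@2023-05-01' = {
  parent: hub
  name: 'connection-spoke1'
  properties: {
    remoteVirtualNetwork: {
      id: spokeVnetId    // the platform creates both sides of the peering
    }
    routingConfiguration: {
      associatedRouteTable: {
        id: '${hub.id}/hubRouteTables/defaultRouteTable'
      }
      propagatedRouteTables: {
        ids: [
          {
            id: '${hub.id}/hubRouteTables/defaultRouteTable'
          }
        ]
      }
    }
  }
}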

More Resources

There are more resources that I’ve yet to document, including: