Microsoft Ignite 2019 – Deep Dive on Azure Governance

Observe and Identify Gaps

  • Regulatory compliance requirements
  • News, blogs, industry expectations
  • Best practice guidelines
  • Internal teams’ recommendations
  • Built-in policies and GitHub policies
  • And so on

Authoring Custom Policy

Can I use policy for this?

  • Resource configurations
  • Azure resources and (selectively) objects within the resource
  • Auto-generation of aliases – Aliases abstract API versions.
  • Resource type for compliance state

Resource Property Alias

  • 95% coverage for all resource properties.
  • If there is a Swagger API, then there should be an alias
  • If not – open a support case

Authoring a Custom Policy

4 basic steps:

  • Determine resource properties
  • Find alias
    • Do these first two in the VS Code extension
  • And 3 other steps 😊

Browse resources in VS Code. Find the property alias. Copy/paste into new policy definition.
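To make that concrete, a minimal custom policy definition rule might look like the sketch below. The storage alias shown is a real built-in alias; the rule itself is my own illustration, not one from the session:

```json
{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Storage/storageAccounts"
      },
      {
        "field": "Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly",
        "notEquals": true
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}
```

The "field" value is the alias you copy from VS Code; the "then" effect is what gets applied when the "if" condition matches.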

Test The Policy

Enforcement:

  • PUT & PATCH

Compliance Assessment

  • Property is compliant, non-compliant, or doesn’t exist

Enforcement mode setting (recently introduced):

  • Quick what-if testing (coming, January I think) – test the result before you roll out the remediation.

Policy-as-code Demo

Shows a released DevOps pipeline.

  1. Create Initiative
  2. Create Assignment
  3. Test Assignment
  4. Deploy (Enforcement Mode set to enabled)

https://aka.ms/policyscripts

Assess Compliance

  • Azure Portal compliance experience
  • Policy Insights API for summary and raw data
  • Export compliance data (coming), e.g. Power BI – they are doing usability studies at Ignite this week.

Road Ahead For Azure Policy

  • Regulatory compliance
  • Multi-tenancy support with Azure Lighthouse
  • Authoring and language improvement
  • And more

Policy for Objects within a Resource

Announcing Key Vault preview. Demo shows ability to control child objects in the Key Vault resource.

And something for AKS engine – the slide moved too quickly. Demo shows assessment of pods inside an AKS cluster. Enables control of source images. Trying to deploy an unauthorised image to a pod fails because of the policy.

Organizing Resources with Resource Graph

At scale:

  • Management Group: hierarchy. Define hierarchical organization
  • Tag: Metadata. Apply tags as metadata to logically organize resources into a taxonomy
  • Resource graph: Visibility. Query, explore, and analyse cloud resources at scale

Why Resource Graph

Scale. A query of a large number of resources requires a complex query via ARM. That query fans out to resource providers, and it just doesn’t scale because of performance – available capacity and quota limits.

Resource Graph sends the query to ARM, which then makes ONE call to ARG. ARG is like a big cache of all your resources. Any time there is a change, ARG is notified of that change very quickly.

ARG – What’s New

Resource Group/Subscription Support

  • Stored in ResourceContainers table
    • Resources/subscriptions
    • Resources/subscriptions/resourcegroups
  • Resources is default table for all existing resources

Join Support

Supported flavours:

  • Leftouter
  • Inner
  • Innerunique

New operators:

  • Union
  • mvexpand – expand an array/collection property
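As a sketch of how those pieces combine, the illustrative query below joins the Resources table to the ResourceContainers table and uses mvexpand on an array property (my own example, not one shown in the session):

```kusto
Resources
| where type =~ "microsoft.compute/virtualmachines"
| join kind=leftouter (
    ResourceContainers
    | where type == "microsoft.resources/subscriptions"
    | project subscriptionName = name, subscriptionId
  ) on subscriptionId
| mvexpand nic = properties.networkProfile.networkInterfaces
| project name, subscriptionName, nicId = tostring(nic.id)
```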

Support For Shared Queries

Save the queries into Graph Explorer.

Save query:

  • Private query
  • Shared (Microsoft.resourcegraph/queries ARM resource)
    • Saved to RG
    • Subject to RBAC

Road Ahead For ARG

  • Support for management groups
  • Support for more dimensions
  • Support for more resource properties, e.g. VM power state

Visibility To Resource Changes

Change History went into public preview earlier this year. Built on Resource Graph, which is already constantly informed about changes to resources. They take snapshots, identify the differences, and report on those changes. This is available in all regions and is free because it’s built on already existing functionality in ARG.

What’s New

  • Support for create/delete changes
  • Support for change types
  • Support for property breakdown
  • Support for change category

Road Ahead

  • At scale – ability to query across resource containers
  • Notifications – subscribe to notifications on resources
  • Correlating “who” – Ability to correlate a change with the user or ID that performed the call

Microsoft Ignite 2019  – What’s New In Azure Networking?

Speaker: Yousef Khalidi, CVP Microsoft Azure Networking

Numbers

  • 6 Pbps of capacity in a single region.
  • 30 billion packets/second on the Azure WAN
  • ExpressRoute up to 100 Gbps per circuit
  • 160+ edge locations in addition to the 54 regions, bringing the Azure WAN entry points closer to you
  • FPGA hardware provides jitter-free networking

Satellite Connectivity

ExpressRoute now supports satellites. Handy for remote or mobile locations, ships, planes, remote mines, oil rigs, etc.

Edge Site

External: customer

Internal: Azure WAN

Features:

  • WAN
  • Azure ExpressRoute POP
  • Front Door, CDN, etc (global services)

Functions of Azure Networks

  • Connect & extend
  • Protect
  • Deliver
  • Monitor

Azure Peering Service Preview

Business quality connectivity to Microsoft clouds.

Connectivity Partners:

  • Local and geo peering tech
  • High capacity peers
  • Optimize Internet traffic routing

A bunch of launch connectivity partners. Looking for more carriers to join.

Azure Virtual WAN

“Completing the scenario”.

GA:

  • ExpressRoute
  • Point to site VPN
  • Path selection from branch

Preview:

  • Hub/any-to-any connectivity – use vWAN as your Internet access point from on-prem.
  • Azure Firewall integration

Cisco SD-WAN partnership with Azure WAN and Office 365.

ExpressRoute

GA:

  • Fast Path
  • ExpressRoute Local – no egress charges
  • Continued expansion of ER locations

Preview:

MACsec encryption:

  • Secures physical links at ExpressRoute sites
  • Bring-your-own-key, store keys in Azure Key Vault
  • Available on ER Direct

ExpressRoute for Satellites

GA.

  • Direct private access to Azure.
  • Connect to Azure from anywhere.
  • 3 partners today: Viasat, SES, Intelsat.

From customer point of view, it looks like normal ExpressRoute.

VPN

High throughput VPN: 10 Gbps GA

  • New gateway SKUs
  • Up to 10 Gbps aggregate
  • Up to 10,000 P2S connections
  • IKEv1 + IKEv2 on VpnGw1-5 GA

VPN Gateway packet capture Preview

Custom IKE traffic scenarios (coming soon)

IPv6

  • Dual stacked for max flexibility.
  • Native IPv6 all the way to the VMs.
  • Private IPv6 addresses for VMs and NICs.

Zero-Trust Networking

A journey with Azure Networking featuring:

  • Azure Firewall
  • WAF
  • Azure Private Link
  • Azure DDoS Protection

Private Link Preview

  • Goal is to enable all PaaS services.
  • Built-in data exfiltration protection.
  • Predictable IP for addressing PaaS services.

Azure Firewall Manager

Preview

  • Central deployment and configuration
    • Multiple firewall instances
    • Optimized for devops with hierarchical policies
  • Automated routing
  • Advanced security with 3rd party SECaaS

Roadmap:

  • Virtual network support, split routing

Partnerships to route traffic via Azure WAN to the Internet:

  • Zscaler
  • iboss
  • Check Point coming soon

You route from on-prem via Azure WAN, then to partner service to Internet. However, Office 365 should go directly – MS automatically does that.

Azure Bastion is GA

  • RDP/SSH from Azure Portal without NAT rules.
  • No public IPs required.
  • Supports VMs, VMSS, DevTest Labs.

IMO, still not ready for consumption without local SSH/RDP client support.

Azure WAF

Preview:

  • Microsoft Threat Intelligence
    • Protect apps against automated attacks.
    • Manage good/bad bots with the Azure BotManager rule set
  • Site and URI path specific WAF policies
    • Customise WAF policies at regional WAF for finer-grained protection at each host/listener or URL path level
  • Geo-filtering on regional WAF
    • Enhanced custom rule matching criterion includes filtering by country.

Application Gateway

GA

  • Integration with AKS as ingress controller
  • Azure Key Vault integration
  • Enhanced metrics

Coming soon:

  • Wildcard listener
    • No need to create a listener for each domain

Azure Front Door

GA

  • Single or multi-region app and API acceleration
    • Improve HTTP performance and reduce page load times.
  • Load balancing at the edge and fast-failover
    • Build always-on application experiences that fail-fast (safely)
  • Integrated SSL, WAF and DDoS

Azure CDN

GA:

  • Reduced Azure egress pricing
    • Egress from storage, compute, and media services to Azure CDN from Microsoft is free.

Preview

  • Easy to use and highly customizable rules engine
    • Few-click onboarding
    • Use rules engine to customise CDN.

Internet Analyzer Preview

Easily measure and compare end user experience for your application.

  • Cloud migration
  • CDN and app acceleration
  • Perform A/B measurements

Azure Monitor

GA

  • Traffic Analytics – accelerated processing from hours to minutes.
  • Enhanced troubleshooting.

Preview

  • Network Insights – single health console for the entire cloud network

Multi-Access Edge Compute Demo

There’s an Azure Edge box on stage. It has a SIM and connects via a private LTE connection (MEC). A robot is controlled via the edge box. This is a tech preview at the moment.

BGP with Microsoft Azure Virtual Networks & Firewalls

In this article, I want to explain how important BGP is in Azure networking, even if you do not actually use BGP for routing, and the major role it plays in hub-and-spoke architectures and deployments with a firewall.

What is BGP?

I was never the network guy in an on-premises deployment. Those 3 letters, BGP, were something someone else worried about. But in Azure, the server admin becomes a network admin. Most of my work in Azure is networking now. And that means that the Border Gateway Protocol (BGP) is important to me now.

BGP is a means of propagating routes around a network. It’s a form of advertising or propagation that spreads routes to one or more destinations one hop at a time. If you think about it, BGP is like word-of-mouth.

A network, Subnet A, is a destination. Subnet A advertises a route to itself to a neighbour network, Subnet B. Subnet B advertises to its neighbours, including Subnet C, that it knows how to get to the original subnet, Subnet A. And the propagation continues. A subnet at the far end of the LAN/WAN, Subnet D, knows that there is another subnet far away called Subnet A and that the path to Subnet A is back via the propagating neighbour, Subnet C. Subnet C will then forward the traffic to Subnet B, which in turn sends the traffic to the destination subnet, Subnet A.
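The word-of-mouth analogy can be sketched in code. The toy simulation below (my own illustration, not Azure code) floods a route to Subnet A one hop at a time, and each subnet remembers the neighbour it learned the route from:

```python
# Toy simulation of BGP-style route propagation ("word of mouth"): flood
# a route to the origin subnet one hop at a time; each subnet records the
# neighbour it learned the route from (its next hop back to the origin).
from collections import deque

def propagate(neighbours, origin):
    next_hop = {origin: None}  # the origin reaches itself directly
    queue = deque([origin])
    while queue:
        subnet = queue.popleft()
        for peer in neighbours[subnet]:
            if peer not in next_hop:  # first advertisement heard wins here
                next_hop[peer] = subnet
                queue.append(peer)
    return next_hop

# The Subnet A -> B -> C -> D chain from the example above.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
routes_back = propagate(graph, "A")
# D learns that the path to A is back via C, C via B, and B directly from A.
```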

Azure and BGP

Whether you use BGP in your on-premises network or not, there will be a pretty high percentage chance that you will use BGP in Azure virtual networking – we’ll get to that in a few moments.

If you create a site-to-site VPN connection, you have the option to integrate your on-premises BGP routing with your Azure virtual network(s). If you use ExpressRoute, you must use BGP. In both cases, BGP routes are propagated from on-premises, informing your Azure virtual network gateway of all the on-premises networks that it can route to over that connection.

But BGP Is Used Without BGP

Let’s say that you are deploying a site-to-site VPN connection to Azure and that you do not use BGP in your configuration. Instead, you create a Local Network Gateway in Azure to define your on-premises networks. The virtual network gateway will load those networks from the Local Network Gateway and know to route across the associated VPN tunnel to get to those destinations.

And here’s where things get interesting. Those routes must get advertised around the virtual network.

If a virtual machine in the virtual network needs to talk to on-premises, it needs to know that the route to that on-premises subnet is via the VNet Gateway in the gateway subnet. So, the route gets propagated out from the gateway subnet.

Let’s scale that situation out a bit to a hub & spoke architecture. We have a site-to-site connection with or without BGP being used. The routes to on-premises are in the VNet Gateway and are propagated out to the subnets in the hub VNet that contains the VNet Gateway. And in turn, the routes are advertised to peered virtual networks (spokes) and their subnets. Now a resource on a subnet in a spoke virtual network has a route to an on-premises network – across the peering connection and to the virtual network gateway.

Note: in this scenario, the hub is sharing the VNet gateway via peering, and the spoke is configured in peering to use the remote VNet gateway.

Bi-Directional

Routing is always a 2-way street. If routes only went one way, then a client could talk to a server, but the server would not be able to talk to the client.

If we have BGP enabled VPN or ExpressRoute, then Azure will propagate routes for the spoke subnets back down through peering and to the VNet Gateway. The VNet Gateway will then propagate those routes back to on-premises.

If you do not have BGP VPN (you are statically setting up on-premises routes in the Local Network Gateway) then you will have to add the address space of each spoke subnet to the on-premises VPN appliance(s) so that they know to route via the tunnel to get to the spokes. The simple way to do that is to plan your Azure networking in advance and have a single supernet (a /16, for example) instead of a long list of smaller subnets (/24s, for example) to configure.
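That supernet planning is easy to sanity-check with Python’s standard ipaddress module (an illustrative sketch, with hypothetical address ranges):

```python
# Check that one /16 supernet covers a long list of spoke /24s, so an
# on-premises VPN appliance needs only one route entry instead of 256.
import ipaddress

# Hypothetical spoke subnets: 10.1.0.0/24 through 10.1.255.0/24.
spokes = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(256)]

# Collapse the list into the smallest covering set of networks.
collapsed = list(ipaddress.collapse_addresses(spokes))
# All 256 routes collapse into the single supernet 10.1.0.0/16.
```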

Control & Security

Let’s say that you want to add a firewall to your hub. You want to use this firewall to isolate everything outside of Azure from your hub and spoke architecture, including the on-premises networks. You’ve done some research and found that you need to add a route table and a user-defined route to your hub and spoke subnets, instructing them that the route to on-premises is through the VNet Gateway.

Now you need to do some reading – you need to learn (1) how Azure routing really works (not how you think it works) and (2) how to troubleshoot Azure routing. FYI, I’ve been living in this world non-stop for the last 10 months.

What you will probably have done is configured your spokes with a route to 0.0.0.0/0 via the internal/backend IP address of the firewall. You are assuming that will send all traffic to anywhere via the firewall. Under the covers, though, routes to on-premises are still propagating from the VNet Gateway to all the subnets in your hub and spoke architecture. If on-premises was 192.168.1.0/24 and your spoke machine wanted to route to on-premises, then the Azure network fabric will compare the destination with the routes that it has in a hidden route table – the only place you can see this is in Effective Routes in a VM NIC Azure resource. You have a UDR for 0.0.0.0/0 via the firewall. That’s a 0-bit match for any destinations in 192.168.1.0/24. If that was the only route in the subnet, then that route would be taken. But we are sending a packet to 192.168.1.x and that is a 24-bit match with the propagated route to 192.168.1.0/24. And that’s why the response from the spoke resource will bypass the firewall and go straight to the VNet Gateway (via peering) to get to on-premises. That is not what you expected or wanted!

Note: the eagle-eyed person that understands routing will know that there will be other routes in the subnet, but they are irrelevant in this case and will confuse the explanation.
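The route selection described above is longest-prefix match. A minimal sketch of it (simplified: real Azure routing also breaks ties between user, BGP, and system routes):

```python
# Pick the effective route for a destination the way the paragraphs above
# describe: the most specific (longest) prefix match wins.
import ipaddress

def pick_route(destination, routes):
    dest = ipaddress.ip_address(destination)
    matches = [(prefix, next_hop) for prefix, next_hop in routes
               if dest in ipaddress.ip_network(prefix)]
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)

routes = [
    ("0.0.0.0/0", "firewall"),           # the UDR via the firewall
    ("192.168.1.0/24", "vnet-gateway"),  # the propagated on-premises route
]

prefix, next_hop = pick_route("192.168.1.10", routes)
# The 24-bit match beats the 0-bit match, so traffic bypasses the firewall.
```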

The following works even if you do not use BGP with a site-to-site VPN!

To solve this problem, we can stop propagation – we can edit the route table resources in the internal Azure subnets (or pre-do this in JSON) and disable BGP route propagation. The result of this is that the routes that the VNet Gateway were pushing out to other subnets will stop being propagated. Now if we viewed the effective routes for a spoke subnet, we’d only see a route to the firewall and the firewall is now responsible for forwarding traffic to on-premises networks to the VNet Gateway.

It is important to understand that this disabling of propagation affects the propagation only in 1 direction. Routes from the VNet Gateway will not be propagated to subnets with propagation disabled. However, ALL subnets will still propagate routes to themselves back to the VNet Gateway – we need on-premises to know that the route to these Azure subnets is still via the Gateway.

More work will be required to get the Gateway Subnet to route via the firewall, but that’s a whole other topic! We’re sticking to BGP and propagation here.

The Firewall and BGP Propagation

Let’s make a mistake, shall we? It will be useful to get a better understanding of the features. We shall add a route table to the firewall subnet and disable BGP route propagation. Now the resource in the spoke subnet wants to send something to an on-premises network. The local subnet route table instructs it to send all traffic to external destinations (0.0.0.0/0) via the firewall. The packets hit the firewall. The firewall tries to send that traffic out and … it has only one route (a simplification) which is to send 0.0.0.0/0 to Internet.

By disabling BGP propagation on the firewall subnet, the firewall no longer knows that the route to on-premises networks is via the virtual network gateway. This is one of those scenarios where people claim that their firewall isn’t logging traffic or flows – in reality, the traffic is bypassing the firewall because they haven’t managed their routing.

The firewall must know that the on-premises networks (a) exist and (b) are routed to via the VNet Gateway. Therefore, BGP propagation must be left enabled on the firewall subnet (the frontend one, if you have a split frontend/backend firewall subnet design).

Not Just Firewalls!

I’m not covering it here, but there are architectures where there might be other subnets that must bypass the firewall to get back to on-premises. In those cases, those subnets must also have BGP propagation left enabled – they must know that the on-premises networks exist and that they should route via the VNet Gateway.


Private Connections to Azure PaaS Services

In this post, I’d like to explain a few options you have to get secure/private connections to Azure’s platform-as-a-service offerings.

Express Route – Microsoft Peering

 

ExpressRoute comes in a few forms, but at a basic level, it’s a “WAN” connection to Azure virtual networks via one or more virtual network gateways; customers use this private peering to connect on-premises networks to Azure virtual networks over an SLA-protected private circuit. However, there is another form of peering that you can do over an ExpressRoute circuit called Microsoft peering. This is where you can use your private circuit to connect to Microsoft cloud services that are normally connected to over the public Internet. What you get:

  • Private access to PaaS services from your on-premises networks.
  • Access to an entire service, such as Azure SQL.
  • A wide array of Azure and non-Azure Microsoft cloud services.

FYI, Office 365 is often mentioned here. In theory, you can access Office 365 over Microsoft peering/ExpressRoute. However, the Office 365 group must first grant you permission to do this – the last I checked, you had to have legal proof of a regulatory need for private access to Cloud services. 

Service Endpoint

Imagine that you are running some resources in Azure, such as virtual machines or App Service Environment (ASE); these are virtual network integrated services. Now consider that these services might need to connect to other services such as storage accounts, Azure SQL, or others. Normally, when a VNet connected resource is communicating with, say, Azure SQL, the packets will be routed to “Internet” via the 0.0.0.0/0 default route for the subnet – “Internet” is everywhere outside the virtual network, not necessarily The Internet. The flow will hit the “public” Azure backbone and route to the Azure SQL compute cluster. There are two things about that flow:

  • It is indirect and introduces latency.
  • It traverses a shared network space.

A growing number of services, including storage accounts, Azure SQL, Cosmos DB, and Key Vault, have service endpoints available to them. You can enable a service endpoint anywhere in the route from the VM (or whatever) to “Internet” and the packets will “drop” through the service endpoint to the required Azure service – make sure that any firewall in the service accepts packets from the private subnet IP address of the source (VM or whatever). Now you have a more direct and more private connection to the platform service in Azure from your VNet. What you get:

  • Private access to PaaS services from your Azure virtual networks.
  • Access to an entire service, such as Azure SQL, but you can limit this to a region.

Service Endpoint Trick #1

Did you notice in the previous section on service endpoints that I said:

You can enable a service endpoint anywhere in the route from the VM (or whatever) to “Internet”

Imagine you have a complex network and not everyone enables service endpoints the way that they should. But you manage the firewall, the public IPs, and the routing. Well, my friend, you can force traffic to support Azure platform services via service endpoints. If you have a firewall, then your routes to “Internet” should direct outbound traffic through the firewall. In the firewall (frontend) subnet, you can enable all the Azure service endpoints. Now when packets egress the firewall, they will “drop” through the service endpoints and to the desired Azure platform service, without ever reaching “Internet”.

Service Endpoint Trick #2

You might know that I like Azure Firewall. Here’s a trick that the Azure networking teams shared with me – it’s similar to the above one but is for on-premises clients trying to access Azure platform services.

You’ve got a VPN connection to a complex virtual network architecture in Azure. And at the frontend of this architecture is Azure Firewall, sitting in the AzureFirewallSubnet; in this subnet you enabled all the available service endpoints. Let’s say that someone wants to connect to Azure SQL using Power BI on their on-premises desktop. Normally that traffic will go over the Internet. What you can do is configure name resolution on your network (or PC) for the database to point at the private IP address of the Azure Firewall. Now Power BI will forward traffic to Azure Firewall, which will relay the traffic to Azure SQL via the service endpoint. What you get:

  • Private access to PaaS services from your on-premises or Azure networks.
  • Access to individual instances of a service, such as an Azure SQL server
  • A growing number of Azure-only services that support service endpoints.

Private Link

In this post, I’m focusing on only one of the 3 current scenarios for Private Link, which is currently in unsupported preview in limited US regions only, for limited platform services – in other words, it’s early days.

This approach aims to give a similar solution to the above “Service Endpoint Trick #2” without the use of trickery. You can connect an instance of an Azure platform service to a virtual network using Private Link. That instance will now have a private IP address on the VNet subnet, making it fully routable on your virtual network. The private link gets a globally unique record in the Microsoft-managed privatelink.database.windows.net DNS zone. For example, your Azure SQL Server would now be resolvable to the private IP address of the private link as yourazuresqlsvr.privatelink.database.windows.net. Now your clients, be they in Azure or on-premises, can connect to this DNS name/IP address to connect to this Azure SQL instance. What you get:

  • Private access to PaaS services from your on-premises or Azure networks.
  • Access to individual instances of a service, such as an Azure SQL server.
  • (PREVIEW LIMITATIONS) A limited number of platform services in limited US-only regions.

Creating an Azure Service for Slow Moving Organisations

In this post, I will explain how you can use Azure’s Public IP Prefix feature to pre-create public IP addresses to access Azure services when you are working with big/government organisations that can take weeks to configure a VPN tunnel, an outbound firewall rule, and so on.

In this scenario, I need a predictable IP address so that means I must use the Standard SKU address tier.

The Problem

It normally only takes a few minutes to create a firewall rule, a VPN tunnel, etc. in an on-premises network. But sometimes it seems to take forever! I’ve been in that situation – you’ve set up an environment for the customer to work with, but their on-premises networking team(s) are slow to do anything. And you only wish that you had given them all the details that they needed earlier in the project so their configuration work would end when your weeks of engineering were wrapping up.

But you won’t know the public IP address until you create it. And that is normally only created when you create the virtual network gateway, Azure Firewall, Application Firewall, etc. But what if you had a pool of Azure public IP addresses that were pre-reserved and ready to share with the network team. Maybe they could be used to make early requests for VPN tunnels, firewall rules, and so on? Luckily, we can do that!

Public IP Prefix

An Azure Public IP Prefix is a set of reserved public IP addresses (PIPs). You can create an IP Prefix of a certain size, from /31 (2 addresses) to /24 (256 addresses), in a certain region. The pool of addresses is a contiguous block of predictable addresses. And from that pool, you can create public IP addresses for your Azure resources.
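The prefix-size arithmetic is simple: a /N prefix reserves 2**(32 - N) contiguous addresses. A quick sketch using Python’s ipaddress module (the address block shown is a documentation range, not a real Azure prefix):

```python
# A Public IP Prefix of length N reserves 2**(32 - N) addresses:
# /31 -> 2, /30 -> 4, /24 -> 256.
import ipaddress

def prefix_size(length: int) -> int:
    return 2 ** (32 - length)

sizes = {length: prefix_size(length) for length in (31, 30, 24)}

# A hypothetical /30 block (203.0.113.0/30 is a documentation range).
prefix = ipaddress.ip_network("203.0.113.0/30")
addresses = list(prefix)  # the 4 reserved, contiguous addresses
```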

In my example, I want a Standard tier IP address and this requires a Standard tier Public IP Prefix. Unfortunately, the Azure Portal doesn’t allow for this with Public IP Prefix, so we need some PowerShell. First, we’ll define some reused variables:

Now we will create the Public IP Prefix. Note that the length refers to the subnet mask length. In my example that’s a /30, resulting in a prefix with 4 reserved public IP addresses:

You’ll note above that I used Standard in the command. This creates a pool of static Standard tier public IP addresses. I could have dropped the Standard, and that would have created a pool of static Basic tier IP addresses – you can use the Azure Portal to deploy Basic tier Public IP Prefix and public IP addresses from that prefix. The decision to use Standard tier or Basic tier affects what resources I can deploy with the addresses:

  • Standard: Azure Firewall, zone-redundant virtual network gateways, v2 application gateways/firewalls, standard tier load balancers, etc.
  • Basic static: Basic tier load balancers, v1 application gateways/firewalls, etc.

Note that the non-zone redundant virtual network gateways cannot use static public IP addresses and therefore cannot use Public IP Prefix.

Creating a Public IP Address

Let’s say that I have a project coming up where I need to deploy an Application Firewall and I know the on-premises network team will take weeks to allow outbound access to my new web service. Instead of waiting until I build the application, I can reserve the IP address now, tell the on-premises firewall team to allow it, and then work on my project. Hopefully, by the time I have the site up and running and presented to the Internet by my Application Firewall, they will have created the outbound firewall rule from the company network.

Browse to the Public IP Prefix and make sure that it is in the same region as the new virtual network and virtual network gateway. Open the prefix and check Allocated IP Addresses in the Overview. Make sure that there is free capacity in the reserved block.

Now I can continue to use my variables from above and create a new public IP address from one of the reserved addresses in the Public IP Prefix:

Use the Public IP Address

I now have everything I need to pass onto the on-premises network team in my request. In my example, I am going to create a v2 Application Firewall.

Once I configure the WAF, the on-premises firewall will (hopefully) already have the rule to allow outbound connections to my pre-reserved IP address and, therefore, my new web service.

My Hyper-V Lab Is Recycled And What That Means

It’s a sad day as I say goodbye to a phase of my career. Long-time readers of my writings, attendees of my presentations, or followers of my social media, will know that I spent many years discussing and promoting Windows Server Hyper-V and related tech. This morning, I said goodbye to that.

Back in 2007, I was working as a senior engineer for one of the largest hosting companies in Ireland. I was building a new VMware platform for that company, and a local Microsoft evangelist targeted me to promote Windows Server 2008 and Hyper-V. I was interested in the Windows Server content, but the timing sucked for Hyper-V because it would be another year before it was going to be released, and I needed to get the virtualization platform into production ASAP. I ended up changing jobs the following year to work with a start-up hosting company, and then I saw the opportunity for Hyper-V. And that started the ball rolling on all the Hyper-V content that this site would eventually host over the years.

My first presentation on Hyper-V, at E2EVC (PubForum back then) in Dublin, was a bloodbath. The room was filled with VMware nuts, including one consultant that I had previously worked with, who worked for the biggest Microsoft-sceptic company on the island. My 30-minute session became an hour of me being attacked. But eventually I learned to turn that around.

Eventually I was awarded as a Hyper-V MVP. I can remember my first year in Bellevue/Redmond, having serious jetlag at the MVP Summit. One morning I woke up in the early AMs and wrote a proposal for a new Hyper-V book. One year later, Mastering Hyper-V Deployment (co-authored with Patrick Lownds) was released … written using a Dell Latitude laptop and an eSATA disk.

A few years later, the same process started again, and Windows Server 2012 Hyper-V (co-authored with Patrick Lownds, Hans Vredevoort, and Damian Flynn) was released. I needed a much better lab for this book. I had an old PC that I used as a domain controller and iSCSI target, and I bought two HP tower PCs as Hyper-V hosts with extra RAM, storage, disks, NICs, and a 1 GbE switch – I also had access to a 10 GbE RDMA rig at my (then) employer. A year later … the book was out. I was flattered when some of the Hyper-V team shared some nice comments both privately and on Amazon.com … the Hyper-V team bought copies of the book to educate new-hires!

This blog got lots of regular updates and lots of traffic. Content included System Center, failover clustering, Storage Spaces, Windows networking, etc … all related to Hyper-V. And then things started to change.

I was working in the small/medium enterprise market – the perfect early adopters for Hyper-V. Microsoft’s ambitions for Hyper-V started to change pre-2012. Publicly, they wanted to focus on Fortune 500 customers. Secretly, Microsoft was looking at their future in IT. Increasingly, the new stuff in Hyper-V was focused on customers that I had no dealings with. And then along came Windows Server 2019 … there was so little new for Hyper-V in there that I wondered if virtualization had plateaued. Support for flash storage on the motherboard, etc, are edge things. Getting 16,000,000 IOPS is like saying I have a car with 1,000 BHP – how many people really care? Virtualization had become a commodity, a thing under the covers, that supports other features. These days, you only hear “Hyper-V” when the media are wondering how new dev/security features are being enabled in Windows 10 preview builds.

And several years ago, my job changed. Windows Server, Hyper-V, and System Center were my job. But then Microsoft came in to meet my boss and me and told us “we need you to switch your focus to Azure”. And that was that – I learned Azure from scratch, and today I’m working with large customers on Azure projects. I sometimes joke with people that I get to work with the largest Azure clusters in the world (this is true – Azure was part of Microsoft’s change of focus for Hyper-V) but I don’t worry about the host stuff; instead I architect and deploy what sits on top of the hypervisor, the part that the business/customer actually cares about these days. I don’t care about NICs, BIOS, drivers, and physical disks or worry about flashing lights in a data centre/basement – I design & deploy services.

If you’ve been reading my content here or on Petri.com then you would have noticed the change. Part of the topic change was that there was less content in the Hyper-V world, less content that I could work with, and I was spending 100% of my time in The (real) Cloud.

And that leads me to what happened this morning. As I’ve posted before, I work from home now and part of that was setting up my home office. Two items became redundant: the pair of HP tower PCs that I had set up as a budget Hyper-V cluster that I used as one of my labs for the “Windows Server 2012 Hyper-V” book. I had powered off those machines quite a while ago and they were stacked to the side of my desk, taking up space and gathering dust. This morning, I opened those machines up one last time, removed the OS disk, and put the machines outside to be collected by a recycling company.

A few hours ago, the van came around and collected the machines. They are gone. And that is the end of a significant era in my career – it helped my profile, I learned a lot, I made friends across the world, and it led to a job where I met my wife and started my family. So writing this now, I realise that associating sentimentality with a couple of black tower PCs is stupid, but they played a major role in my life. Their departure marks the end of the Hyper-V part of my career, and I’m well and truly into the subsequent phase, and even considering what’s next after that.

The Secret Sauce That Devs Don’t Want IT Pros to Know About

Honesty time: that title is a bit click-baitish, but the dev community is using a tool that most IT pros don’t know much/anything about, and it can be a real game changer, especially if you write scripts or work with deployment solutions such as Azure Resource Manager (ARM) JSON templates.

Shot Time

As soon as I say “DevOps” you’re already reaching for that X on the top right or saying “Oh, he’s lost it … again”. But that’s what I’m going to talk about: DevOps. Or to be more precise, Azure DevOps.

Methodology

For years, when I’ve thought about DevOps, I’ve thought “buggy software with more frequent releases”. And that certainly can be the case. But DevOps is born out of the realisation that how we have engineered software (or planned almost anything, to be honest) for the past few decades has not been ideal. Typically, we have some start-middle-end waterfall approach that assumes we know the end state. If this is a big (budget) or long (time) project, then getting half way to that planned end-state and realising that the preconception was wrong is very expensive – it leads to headlines! DevOps is a way of saying:

  • We don’t know the end-state
  • We’re going to work on smaller scenarios
  • We will evolve what we create based on those scenarios
  • The more we create, the more we’ll learn about what the end state will be
  • There will be larger milestones, which will be our releases

This is where the project management gurus will come out and say “this is Scrum” or some other codswallop that I do not care about; that’s the minutia for the project management nerds to worry about.

Devs started leaning in this direction years ago. It’s not universal – most devs that I encountered in the past didn’t use the platform tools for DevOps, such as GitHub or Azure DevOps (previously Visual Studio Team Services). But here’s an interesting thing: some businesses are adopting the concepts of “DevOps” for building a business, even if IT isn’t involved, because they realised that some business problems are very like tech problems: big, complex, potentially expensive, and with an unknown end-state.

Why I Started

I got curious about Azure DevOps last year when my friend, Damian Flynn, was talking about it at events. Like me, Damian is an Azure MVP/IT Pro but, unlike me, he stayed in touch with development after college. I tried googling and reading Microsoft Docs, but the content was written in that nasty circular way that TechNet used to be – there was no entry point for a non-dev that I could find.

And then I changed jobs … to work with Damian as it happens. We’ve been working on a product together for the last 7 months. And on day 1, he introduced me to DevOps. I’ll be honest, I was lost at first, but after a few attempts and then actually working with it, I have gotten to grips with it and it gives me a structured way to work, plan, and collaborate on a product that will never have an end-state.

What I’m Working On

I’m not going to tell you exactly what I’m working on but it is Azure and it is IT Pro with no dev stuff … from me, at least. Everything I’ve written or adjusted is PowerShell or Azure JSON. I can work with Damian (who is 2+ hours away by car) on Teams on the same files:

  • Changes are planned as features or tasks in Azure DevOps Boards.
  • Code is stored in an Azure DevOps repo (repository).
  • Major versions are built as branches (changes) of a master copy.
  • Changes to the master copy are peer reviewed when you try to merge a branch.
  • Repos are synchronized to our PCs using Git.
  • VS Code is our JSON and PowerShell editor.
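For fellow IT pros, the day-to-day loop that the list above describes boils down to a handful of Git commands. Here is a minimal local sketch – the file and branch names are invented for illustration, the commented-out clone/push commands show where a real Azure DevOps repo would come in, and it assumes Git 2.28 or later (for `init -b`):

```shell
# In real life you would start by cloning your Azure DevOps repo, e.g.:
#   git clone https://dev.azure.com/<org>/<project>/_git/<repo>
# Here we init a local repo instead so the sketch is self-contained.
mkdir demo-repo && cd demo-repo
git init -q -b master
git config user.name "Demo" && git config user.email "demo@example.com"

# The "master copy" gets an initial commit
echo "Write-Output 'v1'" > deploy.ps1
git add deploy.ps1 && git commit -q -m "Initial script"

# A task from the board becomes a branch of master
git checkout -q -b task/update-script
echo "Write-Output 'v2'" > deploy.ps1
git commit -q -am "Update script for task 123"

# Push the branch and open a pull request in Azure DevOps:
#   git push -u origin task/update-script
# Merging locally here just shows the end result of an approved PR
git checkout -q master
git merge -q task/update-script
```

In practice the final merge happens in Azure DevOps when the pull request is approved; the local merge above is only there to make the sketch complete.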

It might all sound complex … but it really was pretty simple to set up. Now behind the scenes, there is some crazy-mad release “pipelines” stuff that Damian built, and that is far from simple, but not mandatory – don’t tell Damian that I said that!

Confusing Terminology

Azure DevOps inherits terminology from other sources, such as Git. And that is fine for devs in that space, but some of it made me scratch my head because it sounded “the wrong way around”. Here are some of the terms:

  • Repo: A repository is where you store code.
  • Project: A project might have 1 or more repos. Each repo might be for a different product in that project.
  • Boards: A board is where you do the planning. You can create epics, tasks, and issues. Typically, an epic is a major part of a solution, a task is what you need to do to make that work, and an issue is a bug to be fixed.
  • Sprint: In managed projects, sprints are a predefined period of time that you assign people to. Tasks are pulled into the sprint and assigned to people (or pulled by people to themselves) who have available time and suitable skills.
  • Branch: You always have one branch called the master or trunk. This is the “master copy” of the code. Branches can be made from the master. For example, if I have a task, I might create a branch from the master in VS Code to work on that task. Once I am done, I will sync/push that branch back up to Azure DevOps.
  • Pull Request: This is the one that wrecked my head for years. A pull request is when you want to take the changes that are stored in a branch and merge them back into the parent branch. From Git’s or DevOps’ point of view, this is a pull, not a push. So you create a pull request to (a) identify the tasks you did, (b) get someone to review/approve your changes, and (c) merge the branch (changes) back into the parent branch.
  • Nested branch: You can create branches from branches. Master is typically pretty locked down. A number of people might want a more flexible space to work in, so they create a branch of master, maybe for a new version – let’s call this the second-level branch. Each person then creates their own third-level branch of the second-level branch. Now each person can work away and do pull requests into the more flexible second-level branch. And when they are done with that major piece of work, they can do a pull request to merge the second-level branch back into the master or trunk.
  • Release: This is what it sounds like – the “code” is ready for production, in the opinion of the creators.
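The nested-branch idea is easier to see with commands than with words. The sketch below (branch names invented for illustration, Git 2.28+ assumed for `init -b`) builds master, a second-level branch for “version 2”, and a personal third-level branch, then merges the work back up the chain – in Azure DevOps, each of those merges would be a pull request:

```shell
mkdir nested-demo && cd nested-demo
git init -q -b master
git config user.name "Demo" && git config user.email "demo@example.com"

# Master: the locked-down "master copy"
echo "v1" > app.txt
git add app.txt && git commit -q -m "master: version 1"

# Second-level branch: the flexible shared space for the new version
git checkout -q -b release/v2

# Third-level branch: one person's own working branch of release/v2
git checkout -q -b release/v2-aidan
echo "v2 work" > app.txt
git commit -q -am "My piece of the v2 work"

# Pull request #1: the personal branch merges into the second-level branch
git checkout -q release/v2
git merge -q release/v2-aidan

# Pull request #2: when v2 is done, it merges back into master
git checkout -q master
git merge -q release/v2
cat app.txt   # prints "v2 work"
```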

Getting Started

The first two tools that you need are free:

  • Git command line client – you do not need a GitHub account.
  • Visual Studio Code

And then you need Azure DevOps. That’s where the free stuff pretty much stops, and you need to acquire either per-user/plan licensing or get it via MSDN/Visual Studio licensing.
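Once the Git client is installed, a couple of one-time commands set your identity (it gets stamped on every commit), and then you can clone your repo – the names and URL below are placeholders, not a real organisation:

```shell
# One-time setup: tell Git who you are
git config --global user.name "Your Name"
git config --global user.email "you@yourcompany.com"

# Then clone your repo; Azure DevOps repo URLs follow this pattern:
#   git clone https://dev.azure.com/YourOrg/YourProject/_git/YourRepo

# Verify the identity you just set
git config --global user.name
```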

Opinion

I came into this pretty open minded. Damian’s a smart guy and I had long conversations with one of our managers about these kinds of methodologies after he attended Scrum master training.

Some of the stuff in DevOps is nasty. The terminology doesn’t help, but I hope the above helps. Pipelines is still a mystery to me. Microsoft shared a doc to show how to integrate a JSON release via Pipelines and it’s a big ol’ mess of things to be done. I’ll be honest … I don’t go near that stuff.

I don’t think that Damian and I could have collaborated the way we have without DevOps. We’ve written many thousands of lines of code, planned tasks, fought bugs. It’s been done without a project manager – we discuss/record ideas, prioritize them, and then pull (assign to ourselves) the tasks when we have time. At times, we have worked in the same spaces and been able to work as one. And importantly, when it comes to pull requests, we peer review. The methodology has allowed other colleagues to participate and we’re already looking at how we can grow that more in the organization to bring in more skills/experience into the product. Without (Azure) DevOps we could not have done that … certainly storing code on some file storage in the cloud would have been a total disaster and lacked the structure that we have had.

What Impact Will AMD EPYC Processors Have on You?

Microsoft has announced new HBv2, Das_v3, and Eas_v3 virtual machines based on hosts with AMD EPYC processors. What does this mean for you, and when should you use these machines instead of the Intel Xeon alternatives?

A is for AMD

The nomenclature for Azure virtual machines is extensive. It can be confusing for those unfamiliar with the meanings. When I discussed the A-Series, the oldest of the virtual machine series, I would tell people “A is the start of the alphabet” and discuss these low-power machines. The A-Series was originally hosted on physical machines with AMD Opteron processors, a CPU that had lots of cores and required little electricity when compared to the Intel Xeon competition. These days, an A-Series might actually be hosted on hosts with Intel CPUs, but each virtual processor is throttled to offer similar performance to the older hosts.

Microsoft has added the AMD EPYC 7002 family of processors to their range of hosts, powering new machines:

  • HBv2: A high-performance compute machine with high bandwidth between the CPU and RAM.
  • Das_v3 (and Da_v3): A new variation on the Ds_v3 that offers fast disk performance, which is great for database virtual machines.
  • Eas_v3 (and Ea_v3): Basically the Das_v3 with extra memory.

EPYC Versus Xeon

The 7002 or “Rome” family of EPYC processors is AMD’s second generation of this type of processor. From everything I have read, this generation of the processor family firmly returns AMD to the data centre.

I am not a hardware expert, but some things really stand out about the EPYC, which AMD claims takes a revolutionary approach to I/O – and I/O is pretty important for services such as databases (see the Ds_v3/Es_v3 core scenarios). EPYC uses PCIe Gen 4, which offers double the performance of the Gen 3 that Intel still uses. That’s double the bus bandwidth to storage … great for disk performance. The EPYC also offers 45% faster RAM access than the Intel option … hence Microsoft’s choice for the HBv2. If you want to get nerdy, there are fewer NUMA nodes per socket, which reduces context switches in complex RAM-versus-process placement scenarios.

Why AMD Now?

There have been rumours that Microsoft hasn’t been 100% happy with Intel for quite a while. Everything I heard was in the PC market (issues with 4th generation, battery performance, mobility, etc). I have not heard any rumours of discontent between Azure and Intel – in fact, the DC-Series virtual machine exists because of cooperation between the two giant technology corporations on SGX. But a few things are evident:

  • Competition is good
  • Everything you read about AMD’s EPYC makes it sound like a genuine Xeon killer. As AMD says, Xeon is a BMW 3-series and EPYC is a Tesla – I hope the AMD build quality is better than the American-built EV!
  • As is often the case, the AMD processor is more affordable to purchase and to power – both big deals for a hosting/cloud company.

Choosing Between AMD and Xeon

OK, it was already confusing which machine to choose when deploying in Azure … unless you’ve heard me explain the series and specialisation meanings. But now we must choose between AMD and Intel processors!

I was up at 5 am researching, so this next statement is either fuzzy or was dreamt up (I’m not kidding!): it appears that for multi-threaded applications, such as SQL Server, AMD-powered virtual machines are superior. However, even in this age-of-the-cloud, single-threaded applications are still running corporations. In that case, (this is where things might be fuzzy) an Intel Xeon-powered virtual machine might be best. You might think that single-threaded applications are a thing of the past, but I recently witnessed the negative effect on performance of one of those – no matter what virtual hardware was thrown at it.

The final element of the equation will be cost. I have no idea how the cost of the EPYC-powered machines will compare with the Xeon-powered ones. I do know that the AMD processor is cheaper and offers more threads per socket, and it should require less power. That should make it a cheaper machine to run, but higher consumption of IOs per machine might increase the cost to the hosting company (Azure). I guess we’ll know soon enough when the pricing pages are updated.

Migrating Azure Firewall To Availability Zones

Microsoft recently added support for availability zones to Azure Firewall in regions that offer this higher level of SLA. In this post, I will explain how you can convert an existing Azure Firewall to availability zones.

Before We Proceed

There are two things you need to understand:

  1. If you have already deployed and configured Azure Firewall then there is no easy switch to turn on availability zones. What I will be showing is actually a re-creation.
  2. You should do a “dress rehearsal” – test this process and validate the results before you do the actual migration.

The Process

The process goes as follows:

  1. Plan a maintenance window when the Azure Firewall (and dependent communications) will be unavailable for 1 or 2 hours. Really, this should be very quick but, as Scotty told Geordi La Forge, a good engineer overestimates the effort, leaves room for the unexpected, and hopefully looks like a hero if all goes to the unspoken plan.
  2. Freeze configuration changes to the Azure Firewall.
  3. Perform a backup of the Azure Firewall.
  4. Create a test environment in Azure – ideally a dedicated subscription/virtual network(s) minus the Azure Firewall (see the next step).
  5. Modify the JSON file to include support for availability zones.
  6. Restore the Azure Firewall backup as a new firewall in the test environment.
  7. Validate that the new firewall has availability zones and that the rules configuration matches that of the original.
  8. Confirm & wait for the maintenance window.
  9. Delete the Azure Firewall – yes, delete it.
  10. Restore the Azure Firewall from your modified JSON file.
  11. Validate the restore.
  12. Celebrate – you have an Azure Firewall that supports multiple zones in the region.

Some of the Technical Bits

The processes of backing up and restoring the Azure Firewall are covered in my post here.

The backup is a JSON export of the original Azure Firewall, describing how to rebuild and re-configure it exactly as is – without support for availability zones. Open that JSON file and make two changes.

The first change is to make sure that the API for deploying the Azure Firewall is up to date:

The next change is to instruct Azure which availability zones (numbered 1, 2, and 3) you want to use in the region:
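Since I can’t show the screenshots here, the fragment below is a hand-written sketch of what the firewall resource in the exported template ends up looking like after both edits. The API version shown is an assumption based on when zone support appeared – check the current Azure Firewall ARM template reference before you deploy – and the parameter name is illustrative:

```json
{
  "type": "Microsoft.Network/azureFirewalls",
  "apiVersion": "2019-04-01",
  "name": "[parameters('firewallName')]",
  "location": "[resourceGroup().location]",
  "zones": [ "1", "2", "3" ],
  "properties": { "comment": "everything in here stays as it was in the backup" }
}
```

To place the firewall in one specific zone instead, list just that zone, for example `"zones": [ "2" ]`.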

And that’s that. When you deploy the modified JSON, the new Azure Firewall will exist in all three zones.

Note that you can use this method to place an Azure Firewall into a single specific zone.

Costs Versus SLAs

A single-zone Azure Firewall has a 99.95% SLA. Using 2 or 3 zones will increase the SLA to 99.99%. You might argue “what’s the point?”. I’ve witnessed a data centre (actually, it was a single storage cluster) in an Azure region go down. That can have catastrophic results on a service. It’s rare but it’s bad. If you’re building a network where the Azure Firewall is the centre of security, then it becomes mission critical and should, in my opinion, span availability zones, not for the contractual financial protections in an SLA but for protecting mission critical services. That protection comes at a cost – you’ll now incur the micro-costs of data flows between zones in a region. From what I’ve seen so far, that’s a tiny number, and a company that can afford a firewall will easily absorb that extra relatively low cost.