How to Troubleshoot Azure Routing?

This post will explain how routing works in Microsoft Azure, and how to troubleshoot your routing issues with Route Tables, BGP, and User-Defined Routes in your virtual network (VNet) subnets and virtual (firewall) appliances/Azure Firewall.

Software-Defined Networking

Right now, you need to forget VLANs, and how routers, bridges, routing switches, and all that crap works in the physical network. Some theory is good, but the practice … that dies here.

Azure networking is software-defined (VXLAN). When a VM sends a packet out to the network, the Azure Fabric takes over as soon as the packet hits the virtual NIC. That same concept extends to any virtual network-capable Azure service. From your point of view, a memory copy happens from source NIC to destination NIC. Yes; under the covers there is an Azure backbone with a “more physical” implementation but that is irrelevant because you have no influence over it.

So always keep this in mind: network transport in Azure is basically a memory copy. We can, however, influence the routing of that memory copy by adding hops to it.

Understand the Basics

When you create a VNet, it will have 1 or more subnets. By default, each subnet will have system routes. The first ones are simple, and I’ll make it even more simple:

  • Route directly via the default gateway to the destination if it’s in the same supernet, e.g. 10.0.0.0/8
  • Route directly to Internet if it’s in 0.0.0.0/0

By the way, the only way to see system routes is to open a NIC in the subnet and click Effective Routes under Support & Troubleshooting. I have asked for this to be revealed at the subnet level – not all VNet-connected services have NICs!
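If you prefer the command line, here is a minimal sketch of the same check using the Azure CLI – the resource group and NIC names are made up for illustration:

  # Show the effective routes of a NIC (the VM must be running)
  az network nic show-effective-route-table \
    --resource-group myRG \
    --name myvm-nic \
    --output table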

And also, by the way, you cannot ping the subnet default gateway because it is not an appliance; it is a software-defined function that is there to keep the guest OS sane … and probably for us too 😊

When you peer a VNet with another VNet, you do a few things, including:

  • Instructing VXLAN to extend the plumbing between the peered VNets
  • Extending the “VirtualNetwork” NSG security tag to include the peered neighbour
  • Creating a new system route for peering

The result is that VMs in VNet1 will send packets directly to VMs in VNet2 as if they were in the same VNet.

When you create a VNet gateway (let’s leave BGP for later) and create a connection to a local network, you create another (set of) system route(s) for the virtual network gateway. The local address space(s) will be added as destinations that are tunnelled via the gateway. The result is that packets to/from the on-prem network will route directly through the gateway … even across a peered connection, if you have set up the hub/spoke peering connections correctly.

Let’s add BGP to the mix. If I enable ExpressRoute or a BGP-VPN, then my on-prem network will advertise routes to my gateway. These routes will be added to my existing subnets in the gateway’s VNet. The result is that the VNet is told to route to those advertised destinations via the gateway (VPN or ExpressRoute).

If I have peered the gateway’s VNet with other VNets, the default behaviour is that the BGP routes will propagate out. That means that the peered VNets learn about the on-premises destinations that have been advertised to the gateway, and thus know to route to those destinations via the gateway.
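As a rough sketch, this is how that hub/spoke peering could be built with the Azure CLI – the VNet and resource group names are hypothetical, and the hub is the VNet with the gateway:

  # Hub side: allow the spoke to use the hub's VNet gateway
  az network vnet peering create \
    --resource-group myRG \
    --name hub-to-spoke \
    --vnet-name hub-vnet \
    --remote-vnet spoke-vnet \
    --allow-vnet-access \
    --allow-gateway-transit

  # Spoke side: route to on-prem destinations via the hub's gateway
  az network vnet peering create \
    --resource-group myRG \
    --name spoke-to-hub \
    --vnet-name spoke-vnet \
    --remote-vnet hub-vnet \
    --allow-vnet-access \
    --use-remote-gateways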

And let’s stop there for a moment.

Route Priority

We now have 2 kinds of route in play – there will be a third. Let’s say there is a system route for 172.16.0.0/16 that routes to virtual network. In other words, just “find the destination in this VNet”. Now, let’s say BGP advertises a route from on-premises through the gateway that is also for 172.16.0.0/16.

We have two routes for the 172.16.0.0/16 destination:

  • System
  • BGP

Azure looks at routes that clash like above and deactivates one of them. Azure always ranks BGP above System. So, in our case, the System route for 172.16.0.0/16 will be deactivated and no longer used. The BGP route for 172.16.0.0/16 via the VNet gateway will remain active and will be used.

Specificity

Try saying that word 5 times in a row after 5 drinks!

The most specific route will be chosen. In other words, the route with the best match for your destination is selected by the Azure fabric. Let’s say that I have two active routes:

  • Route A: 172.16.0.0/16 via X
  • Route B: 172.16.1.0/24 via Y

Now, let’s say that I want to send a packet to 172.16.1.4. Which route will be chosen? Route A is a 16-bit match (172.16.*.*). Route B is a 24-bit match (172.16.1.*). Route B is a closer match, so it is chosen.

Now add a scenario where you want to send a packet to 172.16.2.4. At this point, the only match is Route A. Route B is not a match at all.

This helps explain an interesting thing that can happen in Azure routing. If you create a generic rule for the 0.0.0.0/0 destination, it will only impact routing to destinations outside of the virtual network – assuming you are using the private address spaces in your VNet. The subnets have system routes for the 3 private address spaces, which will be more specific than 0.0.0.0/0:

  • Route A: 192.168.0.0/16
  • Route B: 172.16.0.0/12
  • Route C: 10.0.0.0/8
  • Route D: 0.0.0.0/0

If your VNet address space is 10.1.0.0/16 and you are trying to send a packet from subnet 1 (10.1.1.0/24) to subnet 2 (10.1.2.0/24), then the generic Route D will always be less specific than the system route, Route C.

Route Tables

A route table resource allows us to manage the routing of a subnet. Good practice is that if you need to manage routing then:

  • Create a route table for the subnet
  • Name the route table after the VNet/subnet
  • Only use a route table with 1 subnet

The first thing to know about route tables is that you can control BGP propagation with them. This is especially useful when:

  • You have peered virtual networks using a hub gateway
  • You want to control how packets get to that gateway and the destination.

The default is that BGP propagation is allowed over a peering connection to the spoke. In the route table (Settings > Configuration) you can disable this propagation so the BGP routes are never copied from the hub network (with the VNet gateway) to the peered spoke VNet’s subnets.
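As a hedged Azure CLI example (names assumed), disabling propagation looks something like this:

  # Create a route table that does not accept propagated BGP routes
  az network route-table create \
    --resource-group myRG \
    --name spoke-vnet-subnet1-rt \
    --disable-bgp-route-propagation true

  # Associate the route table with the spoke subnet
  az network vnet subnet update \
    --resource-group myRG \
    --vnet-name spoke-vnet \
    --name subnet1 \
    --route-table spoke-vnet-subnet1-rt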

The second thing about route tables is that they allow us to create user-defined routes (UDRs).

User-Defined Routes

You can control the flow of packets using user-defined routes. Note that UDRs outrank BGP routes and System Routes:

  1. UDR
  2. BGP routes
  3. System routes

If I have a system or BGP route to get to 192.168.1.0/24 via some unwanted path, I can add a UDR to 192.168.1.0/24 via the desired path. If the two routes are identical destination matches, then my UDR will be active and the BGP/system route will be deactivated.
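Here is a minimal sketch of such a UDR with the Azure CLI, reusing the route table from earlier – the next hop (a hypothetical firewall NVA at 10.0.254.4) is made up:

  # Force traffic to 192.168.1.0/24 through a virtual appliance
  az network route-table route create \
    --resource-group myRG \
    --route-table-name spoke-vnet-subnet1-rt \
    --name to-onprem-192-168-1 \
    --address-prefix 192.168.1.0/24 \
    --next-hop-type VirtualAppliance \
    --next-hop-ip-address 10.0.254.4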

Troubleshooting Tools

The traditional tool you might have used is TRACERT. I’m sorry – it has some use, but it’s really not much more than PING. In the software-defined world, the default gateway isn’t a device with a hop, the peering connection doesn’t have a hop, and TRACERT is not as useful as it would have been on-premises.

The first thing you need is the above knowledge. That really helps with everything else.

Next, make sure it’s your routing that’s the problem and not your NSGs!

Next is the NIC, if you are dealing with virtual machines. Go to Effective Routes and look at what is listed, what is active and what is not.

Network Watcher has a couple of tools you should also look at:

  • Next Hop: This is a pretty simple tool that tells you the next “appliance” that will process packets on the journey to your destination, based on the actual routing discovered.
  • Connection Troubleshoot: You can send a packet from a source (VM NIC or Application Gateway) to a certain destination. The results will map the path taken and the result.

The tools won’t tell you why a routing plan failed, but with the above information, you can troubleshoot a (desired) network path.
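Both tools can also be driven from the Azure CLI. A hedged sketch, with made-up VM and address values (Connection Troubleshoot requires the Network Watcher agent extension on the source VM):

  # Next Hop: what is the next hop from this VM to 172.16.1.4?
  az network watcher show-next-hop \
    --resource-group myRG \
    --vm myvm \
    --source-ip 10.1.1.4 \
    --dest-ip 172.16.1.4

  # Connection Troubleshoot: can the VM reach the destination on TCP 443?
  az network watcher test-connectivity \
    --resource-group myRG \
    --source-resource myvm \
    --dest-address 172.16.1.4 \
    --dest-port 443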

Locking Down Network Access to the Azure Application Gateway/Firewall

In this post, I will explain how you can use a Network Security Group (NSG) to completely lock down network access to the subnet that contains an Azure Web Application Gateway (WAG)/Web Application Firewall (WAF).

The steps are as follows:

  1. Deploy a WAG/WAF to a dedicated subnet.
  2. Create a Network Security Group (NSG) for the subnet.
  3. Associate the NSG with the subnet.
  4. Create an inbound rule to allow TCP 65503-65534 from the Internet service tag to the CIDR address of the WAG/WAF subnet.
  5. Create rules to allow application traffic, such as TCP 443 or TCP 80, from your sources to the CIDR address of the WAG/WAF subnet.
  6. Create a low priority (4000) rule to allow any protocol/port from the AzureLoadBalancer service tag to the CIDR address of the WAG/WAF subnet.
  7. Create a rule with the lowest priority (4096) to Deny All from Any source.

The Scenario

It is easy to stand up a WAG/WAF in Azure and get it up and running. But in the real world, you should lock down network access. In the world of Azure, all network security begins with an NSG. When you deploy a WAG/WAF in the real world, you should create an NSG for the WAG/WAF subnet and restrict traffic to that subnet to just what is required for:

  • Health monitoring of the WAG/WAF
  • Application access from the authorised sources
  • Load balancing of the WAG/WAF instances

Everything else inbound will be blocked.

The NSG

Good NSG practice is as follows:

  1. Tiers of services are placed into their own subnet. Good news – the WAG/WAF requires a dedicated subnet.
  2. You should create an NSG just for the subnet – name the NSG after the VNet-Subnet, and maybe add a prefix or suffix of NSG to the name.
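A minimal Azure CLI sketch of that practice – the VNet, subnet, and resource group names are assumptions:

  # Create the NSG, named after the VNet-Subnet
  az network nsg create \
    --resource-group myRG \
    --name vnet1-wafsubnet-nsg

  # Associate the NSG with the WAG/WAF subnet
  az network vnet subnet update \
    --resource-group myRG \
    --vnet-name vnet1 \
    --name wafsubnet \
    --network-security-group vnet1-wafsubnet-nsg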

Health Monitoring

Azure will need to communicate with the WAG/WAF to determine the health of the backends – I know that this sounds weird, but it is what it is.

Note: You can view the health of your backend pool by opening the WAG/WAF and browsing to Monitoring > Backend Health. Each backend pool member will be listed here. If you have configured the NSG correctly then the pool member status should be “Healthy”, assuming that they are actually healthy. Otherwise, you will get a warning saying:

Unable to retrieve health status data. Check presence of NSG/UDR blocking access to ports 65503-65534 from Internet to Application Gateway.

OK – so you need to open those ports from “Internet”. Two questions arise:

  • Is this secure? Yes – Microsoft states that these ports “are protected (locked down) by Azure certificates. Without proper certificates, external entities, including the customers of those gateways, will not be able to initiate any changes on those endpoints”.
  • What if my WAG/WAF is internal and does not have a public IP address? You will still do this – remember that “Internet” is everything outside the virtual network and peered virtual networks. Azure will communicate with the WAG/WAF via the Azure fabric and you need to allow this communication that comes from an external source.

In my example, my WAF subnet CIDR is 10.0.2.0/24:
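As a hedged Azure CLI sketch of that rule (reusing the assumed names from above):

  # Allow Azure's WAG/WAF management traffic from the Internet tag
  az network nsg rule create \
    --resource-group myRG \
    --nsg-name vnet1-wafsubnet-nsg \
    --name AllowGatewayManagement \
    --priority 100 \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes Internet \
    --source-port-ranges '*' \
    --destination-address-prefixes 10.0.2.0/24 \
    --destination-port-ranges 65503-65534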

Application Traffic

Next, I need to allow application traffic. Remember that the NSG operates at the TCP/UDP level and has no idea of URLs – that’s the job of the WAG/WAF. I will use the NSG to define what TCP ports I am allowing into the WAG/WAF (such as TCP 443) and from what sources.

In my example, the WAF is for internal usage. Clients will connect to applications over a VPN/ExpressRoute connection. Here is a sample rule:
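As a sketch, assuming the on-prem clients live in 192.168.0.0/16 (a made-up range):

  # Allow HTTPS from on-prem clients to the WAG/WAF subnet
  az network nsg rule create \
    --resource-group myRG \
    --nsg-name vnet1-wafsubnet-nsg \
    --name AllowHttpsFromOnPrem \
    --priority 200 \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes 192.168.0.0/16 \
    --source-port-ranges '*' \
    --destination-address-prefixes 10.0.2.0/24 \
    --destination-port-ranges 443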

If this was an Internet-facing WAG or WAF, then the source service tag would be Internet. If other services in Azure need to connect to this WAG or WAF, then I would allow traffic from either Virtual Network or specific source CIDRs/addresses.

The Azure Load Balancer

To be honest, this one caught me out until I reasoned what the cause was. My next rule will deny all other traffic to the WAG/WAF subnet. Without this load balancer rule, the client could not connect to the WAG/WAF. That puzzled me, and searches led me nowhere useful. And then I realized:

  • A WAG/WAF is 1+ instances (2+ in v2), each consuming IP addresses in the subnet.
  • They are presented to clients as a single IP.
  • That single IP must be a load balancer
  • That load balancer needs to probe the load balancer’s own backend pool – which are the instance(s) of the WAG/WAF in this case

You might ask: isn’t there a default rule to allow a load balancer probe? Yes – it has priority 65001. But we will be putting in a deny-all rule at 4096, and that overrides both the 65001 rule and the 65000 rule that allows everything from VirtualNetwork – which includes all subnets in the virtual network and all peered virtual networks.

The rule is simple enough:
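Something like this, as a sketch with the same assumed names:

  # Allow the Azure load balancer to probe the WAG/WAF instances
  az network nsg rule create \
    --resource-group myRG \
    --nsg-name vnet1-wafsubnet-nsg \
    --name AllowAzureLoadBalancer \
    --priority 4000 \
    --direction Inbound \
    --access Allow \
    --protocol '*' \
    --source-address-prefixes AzureLoadBalancer \
    --source-port-ranges '*' \
    --destination-address-prefixes 10.0.2.0/24 \
    --destination-port-ranges '*'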

Deny Everything Else

Now we will override the default NSG rules that allow all communications to the subnet from other subnets in the same VNet or peered VNets. This rule should have the lowest possible user-defined priority, which is 4096:
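Again, as a sketch:

  # Deny everything else inbound to the WAG/WAF subnet
  az network nsg rule create \
    --resource-group myRG \
    --nsg-name vnet1-wafsubnet-nsg \
    --name DenyAll \
    --priority 4096 \
    --direction Inbound \
    --access Deny \
    --protocol '*' \
    --source-address-prefixes '*' \
    --source-port-ranges '*' \
    --destination-address-prefixes 10.0.2.0/24 \
    --destination-port-ranges '*'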

Why am I using the lowest possible priority? This is classic good firewall rule practice. General rules should be low priority, and specific rules should be high priority. The more general, the lower. The more specific, the higher. The most general rule we have in firewalls is “block everything we don’t allow”; in other words, we are creating a white list of exceptions with the previously mentioned rules.

The Results

You should end up with:

  • The health monitoring rule will allow Azure to check your WAG/WAF over a certificate-secured channel.
  • Your application rules will permit specified clients to connect to the WAG/WAF, via a hidden load balancer.
  • The load balancer can probe the WAG/WAF and forward client connections.
  • The low priority deny rule will block all other communications.

Job done!

 

Why Choose the Azure Firewall over a Virtual Firewall Appliance?

In this post, I will explain why you should choose Azure Firewall over third-party firewall network virtual appliances (NVAs) from the likes of Cisco, Palo Alto, Check Point, and so on.

Microsoft’s Opinion

Microsoft has a partner-friendly line on Azure Firewall versus third parties. Microsoft says that third-party solutions offer more than Azure Firewall, and if you want, you can use them side-by-side.

Now that’s out of the way, let me be blunt … like I’d be anything else! 😊

The NVA Promise

At its base, a firewall blocks or allows TCP/UDP/etc. and does NAT. Some firewalls offer a “security bundle” of extra features such as:

  • Malware scanning based on network patterns
  • Download scanning, including zero-days (detonation chamber)
  • Browser URL logging & filtering

But those cool things either make no sense in Azure or are just not available from the NVA vendors in their cloud appliances. So what you are left with is central logging and filtering.

Documentation

With the exception of Palo Alto (their whitepaper for Azure is very good – not perfect) and maybe Check Point, the vendors have pretty awful documentation. I’ve been reading a certain data centre mainstay’s documents this week and they are incomplete and rubbish.

Understanding of Azure

It’s quite clear that some of the vendors are clueless about The Cloud and/or Azure. Every single vendor has written docs about deploying everything into a single VNet – if you can afford NVAs then you are not putting all your VMs into a single VNet (see hub & spoke VNet peering). Some have never heard of availability zones – if you can afford NVAs then you want as high an SLA as you can get. Most do not offer scale-out (active/active clusters) – so a single VM becomes your bottleneck on VM performance (3000 Mbps in a D3_v2). Some don’t even support highly available firewall clusters – so a single VM becomes the single point of failure in your entire cloud network! And their lack of documentation or understanding of VNet peering or route tables in a large cloud deployment is laughable.

The Comparison

So, what I’m getting at is that the third-party NVAs suck. Azure Firewall isn’t perfect either, but it’s a true cloud platform service and it is improving fast – just last night Microsoft announced Threat Intelligence-Based Filtering, and Service Tags Filtering appeared recently. I know more things are on the way too 😊

Here is my breakdown of how Azure Firewall stacks up against firewall NVAs:

                 Azure Firewall               NVA
  Deployment     Platform                     Linux VM + Software
  Licensing      Consumption: instance + GB   Linux VM + Software
  Scaling        Automatic                    Add VMs + Software
  Ownership      Set & monitor                Manage VM / OS / Software
  Layer-7        Logging & filtering          Potentially* deep inspection
  Networking     1 subnet & PIP               1+ subnets & 1 PIP
  Complexity     Simple                       Difficult

I know: you laugh when you hear “Microsoft” and “Firewall” in the same sentence. You think of ISA Server. Azure Firewall is different. This is baked into the fabric of Azure, the strategic future of Microsoft. It is already rapidly improving, and it does more than the third parties.

Heck, what does the third party offer compared to NSGs? NSGs filter TCP/UDP, they can log to a storage account, you can centrally log using Event Hubs, and you can do advanced reporting/analysis using NSG Flow Logs with Azure Monitor Logs (Log Analytics). Azure Firewall takes that another step with a hub deployment, an understanding of HTTP/S, and it is now using machine learning for dynamic threat prevention!

My Opinion

Some people will always prefer a non-Microsoft firewall. But my counter would be, what are you getting that is superior – really? With Azure Firewall, I create a firewall, set my rules, configure my logging, and I’m done. Azure Firewall scales and it is highly available. Logging can be done to storage accounts, event hubs (SIEM), and Azure Monitor Logs. And here’s the best bit … it is SIMPLE to deploy and there is almost no cost of ownership. Compare that to some of the HACK solutions from the NVA vendors and you’d laugh.
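To give a feel for that simplicity, here is a hedged Azure CLI sketch – the names are made up, the firewall commands ship in a CLI extension, and a real deployment also needs a public IP and an IP configuration on the firewall:

  # The firewall commands live in a CLI extension
  az extension add --name azure-firewall

  # Create the firewall (the VNet needs a subnet named AzureFirewallSubnet)
  az network firewall create \
    --resource-group myRG \
    --name myFirewall

  # Allow outbound HTTP/S to a specific FQDN
  az network firewall application-rule create \
    --resource-group myRG \
    --firewall-name myFirewall \
    --collection-name AllowedSites \
    --name AllowContoso \
    --priority 100 \
    --action Allow \
    --protocols Http=80 Https=443 \
    --source-addresses 10.0.0.0/16 \
    --target-fqdns www.contoso.com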

The Azure Firewall was designed for The Cloud. It was designed for the way that Azure works. And it was designed for how we should use The Cloud … at scale. And that scale isn’t just about Mbps, but in terms of backend services and networks. From what I have seen so far, the same cannot be said for firewall NVAs. For me, the decision is easy: Azure Firewall. Every time.

I Have Joined Innofactor

As I posted before, I finished my previous job with MicroWarehouse before Christmas. On January 3rd, I joined Innofactor.

Who Are Innofactor?

Innofactor is an IT consulting and services company that operates in the Nordic countries: Finland (HQ), Denmark, Sweden, and Norway. If you attend conferences or user groups, then there’s a good chance you’ve seen some of their employees speaking; Innofactor has a number of MVPs on their books including (but not limited to) Damian Flynn (also Irish), Olav Tvedt, Alexander Rodland, and Stefan Shorling. Quite simply, Innofactor is the A-Game in the Nordic countries, if not Europe.

I’ve known of Innofactor for quite a while. Lumagate was the company I knew first, but they were acquired by the Finnish company in 2016. My first contact with Lumagate was through Kristian Nese, now a part of AzureCAT in Microsoft. Kristian was the technical reviewer of Private Cloud Computing which I co-wrote (I did a tiny part) with other MVPs, including Damian Flynn. Damian joined Innofactor a few years ago – I’ve known Damian since he first became a Hyper-V MVP many moons ago. More often than not, we are roommates at the MVP summit. I got to know Olav through bringing him to Ireland to run EMS training for MicroWarehouse’s customers. It’s through these connections, and eventually meeting some of the other MVPs that I got to know what Innofactor is and wants to do.

My Role

My title is Principal Consultant, but I’m doing more than just project work. I don’t want to discuss that publicly yet … it’s a strategic thing 🙂 I’ll say this: the project work I have done so far has been quite cool, and the other thing I’m doing gives me an opportunity to work with great people across the Nordic countries. I’ve had a lot of meetings in my first 7 working days and we’ve figured out a lot of stuff that I thought might have taken weeks or months.

I am not moving to Norway. In fact, I spent a few days over the Christmas holidays working on my home office. It’s not finished yet – I want to put in a new desk and do a little bit more painting, but it’s getting there. Yes; I am working from home – the commute is wicked tough 😉 Almost all of my communications and collaboration are done through Microsoft Teams, something one of the directors pushed out before I joined. I was skeptical at first, but it works well. And because the work we do is in The Cloud, we can do it from anywhere.

I said Innofactor was the A-Game. If you’ve seen any of the consultants present, then you know what I’m talking about. The conversations that I’m having on a daily basis are all state-of-the-art. What lies ahead of me will be both challenging and amazing.

Culture Difference

I have worked for American, German and Irish companies, including corporates, finance, and startup/small/medium businesses. My last employer, MicroWarehouse, was very flexible – the MD is very supportive of staff and deals with family issues in a thoughtful manner. But I have never experienced anything like working for a Norwegian company – maybe it’s an Innofactor thing, maybe it’s a Nordic or Scandinavian thing, I don’t know. But they take life-work (notice the order) balance very seriously. There is work, but most of your time is life. Don’t get me wrong, work has to be done and it has to be done well, but there is no doubt in your communications that no one is expecting 18-hour days.

Working from home has huge advantages, especially when you combine it with flexitime. I start work at 8 am most days, earlier on Fridays, and I’m able to finish earlier. I have more time at home with my family, I am here in case something needs to be delivered or done at the house (very useful last week when a new heating furnace was required in an emergency), and I can use my flexitime to deal with things for the kids like dental appointments, sports, or events. It really is life changing.

A hard part of working from home is being disciplined. It is so easy to say “I’ll work in the sitting room and stick on the TV for background noise”. Next thing you know, you’re binge-watching Netflix! I’ve heard home workers offer all kinds of solutions. Some get dressed for work, get in their car, drive around the block and then start work in their home! I think the most important thing to do is to create a workplace.

The small (box) room in our house was set up as an office when we moved in. It required some work over the holidays, but I tidied it up and got it ship-shape for the new job. This is where I work – nowhere else. When I come in here, the door shuts behind me and I am in work mode. I even use a “work laptop”, not my personal PC. That virtual line puts me in work mode. And it works – after 7 working days, today was actually the first time I’ve ever logged into Facebook on this machine.

Future

The scale of work and the types of customers that I am working with are very different from the previous 7 years. Customers are bigger, and they are what we in MicroWarehouse called “end customers”, not partners/resellers. This means I’ll have new things/scenarios to talk about. Already, I’ve got some new talk ideas floating around my head 🙂

Leaving MicroWarehouse

It’s with great sadness that I am announcing that this is my last day at MicroWarehouse, a company that I have enjoyed working with for over 7 years.

I joined MWH in 2011. My role was to work with Irish Microsoft partners, mostly in the small/medium business space, to grow the System Center business. My work with Hyper-V was a gateway to System Center and, at first, this worked … until Microsoft changed the licensing of System Center & enterprise agreements which killed that business for us.

Nearly 5 years ago, my role changed from on-premises to The Cloud, when Microsoft asked us to take the lead on growing Azure business in the breadth market. I logged into Azure for the first time and started learning – for 9 months before I spoke to my first customer. And I’ve been learning every day since, because that is the nature of Azure.

The great thing about working for MicroWarehouse was the people. The company has a family feel about it. We are literally thrown out of the office if we’re still in the building at 17:45! And the MD gives out to us for answering email at night! The company had my back when I went through some bad times, giving me time off to deal with things. I can even say that MWH changed my life, because one day this new hire sat at the desk beside me, and eventually married me 😊

The staff in MWH are the very best at what they do. The sales team totally know their stuff, and no one can match them. Any little tricks I know about licensing come from Rob, Angela, and Nicole. Our core sales team knows every little nook and cranny – if you want to know about Surface or its accessories, they know the lot. The accounts and logistics team are all over everything – they sort out so many problems in Microsoft that customers could never comprehend – they’re the ninjas behind the scenes. And, of course, there’s the Marketing team; every time I’ve been speaking at an MWH event, training course or webinar, they’re the ones running things.

In short, if you are a Microsoft partner operating in Ireland, Northern or Republic of, then there’s no better choice than MicroWarehouse as your distributor or CSP Indirect. We’ve gone up against the best the UK has to offer and they are no match – there’s a reason we crushed the competition! Even though my time here is ending, I hope that MWH continues to thrive.

I’ve also made lots of friends in the Microsoft and MS partner world. My job was to engage with partners and that’s what I did … a lot. I’ve met some impressive people over the years and I’ve learned a lot from our customers too. I’m going to miss that.

I don’t view my departure as me leaving MWH. I didn’t look to leave, I never entered any job hunting process. I was really happy here. Instead, I’m going to something … I’ll share that soon.

Thank you MicroWarehouse. I’ve had an amazing time.

My New Intel NUC PC

I recently purchased an Intel NUC, NUC8i7HNK, to use as my home office PC. Here’s a little bit of information about my experience.

Need To Upgrade

I’ve been using a HP micro-tower for around 6-7 years as my home office PC. It was an i5 with 16 GB RAM, originally purchased as part of a pair to use as a lab kit when I started writing Mastering Hyper-V 2012 R2. After that book, I re-purposed the machine as my home PC and it’s been where many of my articles were written and where I work when I work from home.

When Microsoft introduced a workaround security fix for Meltdown/Spectre, I noticed the slowdown quite a bit. Over the year, the PC has just felt slower and slower. I don’t do anything that unusual with it: I don’t use it for development, and it’s not running Hyper-V – Office, Chrome, Visio, and VS Code are my main tools of the trade. The machine is 6-7 years old, so it was time to upgrade.

Options

Some will ask, “wasn’t the Surface Studio the perfect choice?” No, not for me. The price is crazy, the Studio 1 needs a hard disk replacement, the Studio 2 isn’t available yet, and I want a nice dual monitor setup – I don’t like working with mismatched monitors, and Microsoft doesn’t make additional matching monitors for the Studio.

I did look at Dell/Lenovo/HP, but nothing there suited me. Some were too low spec. Some had the spec but a Surface-like price to go with it. I considered home-builds – most of the PCs I have owned have been either home-built or customised – but I don’t have time for that malarkey. I looked at custom builds, but they are expensive options aimed at gamers, and I don’t have time to play the Xbox games that I already have.

At work, we use Intel NUCs for our training room. They’re small, high spec, and have an acceptable price. So that’s what I went for.

NUC8i7HNK

One of my colleagues showed me some of the new 8th generation NUC models and I opted for the NUC8i7HNK (Amazon USA / Amazon UK). A machine with an i7, Radeon graphics instead of the usual Intel HD, USB C and Thunderbolt, TPM 2.0 (not listed on the Intel site, I found), and oodles of ports. Here’s the front:

[Photo: the front of the NUC]

And here’s the back:

[Photo: the back of the NUC]

Look at that: 2 x HDMI, 2 x mini-DP, USB C, 6 x USB 3.0, 2 x Ethernet, and there’s the Radeon graphics, speaker, built-in mic, and more. It supports 2 x M.2/NVMe disks and 2 x DIMM slots for up to 32 GB RAM.

The machine is quite tidy and small. It comes with a plate allowing you to mount it to the back of a monitor – if the monitor supports mounting.

My Machine

The NUC kits come built, but you have to add your disk and RAM. I went with:

  • Adata SX6000 M.2 SSD, capable of up to 1000 MB/s read and 800 MB/s write.
  • 2 x Adata DDR4 2400 8 GB RAM

I installed Windows 10 1809 in no time, added the half dozen or so required Windows updates, and installed Office 365 from the cloud. A quick blast of Ninite and I had most of the extra bits that I needed. In terms of data, all of my docs are either in OneDrive or Office 365, so there was no data migration. My media (movies & photos) are on a pair of USB 3.0 drives configured with Storage Spaces, so all I’ll have to do is move the drives over. To be honest, the biggest thing I have to do is buy a pair of video cables to replace my old ones!

Going with a smaller machine will clear up a lot of space from under my desk, and help reduce some of the clutter – there’s a lot of clutter to clear!

Cloud Camp 2018 – It’s A Wrap!

Yesterday, Cloud Camp 2018, run by MicroWarehouse and sponsored by Microsoft Surface and Veeam, took place in the Dublin Convention Centre here in Ireland. 4 tracks, 20 (mostly MVP) sessions, 2 keynotes, and hundreds of satisfied attendees. It was great fun – but we’re all a little tired today 🙂

Photo by Gregor Reimling

The message of the day was “change” and that was what I talked about in the opening keynote. In nature, change is inevitable. In IT, if you cannot accept change, you’re pushed aside. Business pressure, security & compliance needs, and the speed of cloud make change happen faster than ever. And that’s why we had 20 expert-led breakout sessions covering Azure IaaS, Azure PaaS, productivity, security, management & governance, Windows Server 2019, and hybrid cloud solutions. The conference ended with renowned Microsoft-watchers Mary Jo Foley and Paul Thurrott discussing what the corporation has been up to and their experiences in covering the Redmond giant.

We had a lot of fun yesterday. Everything ran quite smoothly – credit to John & Glenn in MWH and Hanover Communications.

After the conference, Paul & Mary Jo hosted their Windows Weekly podcast from Dogpatch Labs in the IFSC.

And then we had a small after-party in Urban Brewing next door, where one or two beverages might have been consumed until the wee hours of the morning 🙂

Picture by Gerald Versluis

Thank you to:

  • MicroWarehouse for running this event – Rory for OK-ing it and the team for promoting it.
  • John and Glenn who ran the logistics and made it so smooth
  • Hanover Communications for the PR work
  • All the breakout speakers who travelled from around Ireland/Europe to share their knowledge and experience
  • Kartik, who travelled from India to share what the Azure Backup team is up to
  • Paul & Mary Jo for travelling from the USA to spend some time with us
  • Alex at TWiT for making sure things worked well with the podcast
  • Everyone who attended and made this event possible!

A Twitter competition with the #CloudCamp18 tag was run – a winner will be selected (after the dust settles) for a shiny new Surface Go. At one point the #CloudCamp18 tag was trending #3 for tweets in Dublin. Now I wonder what will happen with #CloudCamp19?

Generation 2 Virtual Machines Make Their First Public Appearance in Microsoft Azure

Microsoft has revealed that the new preview series of confidential computing virtual machines, the DC-Series, which went into public preview overnight, is based on Generation 2 (Gen 2) Hyper-V virtual machines. This is the first time that a non-Generation 1 (Gen 1) VM has been available in Azure.

Note that ASR allows you to migrate/replicate Generation 2 machines into Azure by converting them into Generation 1 at the time of failover.

These confidential compute VMs use hardware features of the Intel chipset to provide secure enclaves to isolate the processing of sensitive data.

The creation process for a DC-Series is a little different than usual – you have to look for Confidential Compute VM Deployment in the Marketplace and then you work through a (legacy blade-based) customised deployment that is not as complete as a normal virtual machine deployment. In the end a machine appears.

I’ve taken a screenshot from a normal Azure VM including a view of Device Manager from Windows Server 2016 with the OS disk.

[Screenshot: Device Manager on a normal (Generation 1) Azure VM]

Note that both the OS disk and the Temp Drive are IDE drives on a Virtual HD ATA controller. This is typical of a Generation 1 virtual machine. Also note the IDE/ATA controller?

Now have a look at a DC-Series machine:

[Screenshot: Device Manager on a DC-Series Azure VM]

Note how the OS disk and the Temp Drive are listed as Microsoft Virtual Disk on SCSI controllers? Ah – definitely a Generation 2 virtual machine! Also do you see the IDE/ATA controller is missing from the device listing? If you expand System Devices you will find that the list is much smaller. For example, the Hyper-V S3 Cap PCI bus video controller (explained here by Didier Van Hoye) of Generation 1 is gone.

Did you Find This Post Useful?

If you found this information useful, then imagine what 2 days of training might mean to you. I’m delivering a 2-day course in Frankfurt on December 3-4, teaching newbies and experienced Azure admins about Azure Infrastructure. There’ll be lots of in-depth information, covering the foundations, best practices, troubleshooting, and advanced configurations. You can learn more here.

Windows Server 2019 Did Not RTM – And Why That Matters

I will start this article by saying there is a lot in Windows Server 2019 to like. There are good reasons to want to upgrade to it or deploy it – if I was still in the on-premises server business I would have been downloading the bits as soon as they were shared.

As you probably know Microsoft has changed the way that they develop software. It’s done in sprints and the goal is to produce software and get it into the hands of customers quickly. It doesn’t matter if it’s Azure, Office 365, Windows 10, or Windows Server, the aim is the same.

This release of Windows Server is the very first to go through this process. When Microsoft announced the general availability of Windows Server 2019 on October 2nd, they shared those bits with everyone at the same time. Everyone – including hardware manufacturers. There was no “release to manufacturing” or RTM.

In the past, Microsoft would do something like this:

  1. Microsoft: Finish core development.
  2. Microsoft: RTM – share the bits privately with the manufacturers.
  3. Microsoft: Continue quality work on the bits.
  4. Manufacturing: Test & update drivers, firmware, and software.
  5. Microsoft & Manufacturing: Test & certify hardware, drivers & firmware for the Windows Server Catalog, aka the hardware compatibility list or HCL.
  6. Microsoft: 1-3 months after RTM, announce general availability or GA
  7. Microsoft: Immediately release a quality update via Windows Update

This year, Microsoft has gone straight to step 6 from the above to get the bits out to the application layer as quickly as possible. The OEMs got the bits the same day that you could have. This means that the Windows Server Catalog, the official listing of all certified hardware, is pretty empty. When I looked on the morning of Oct 3, there was not even an entry for Windows Server 2019 on it! Today (October 4th) there are a handful of certified components and 1 server from an OEM I don’t know:

[Screenshot: Windows Server Catalog results for Windows Server 2019]

So my advice is, sure, go ahead and download the bits to see what Microsoft has done. Try out the new pieces and see what they offer. But hold off on production deployments until your hardware appears on this list.

I want to be clear here – I am not bashing anyone. I want you to have a QUALITY Windows Server experience. Too often in the past, I have seen people blame Windows/Hyper-V for issues when the issues were caused by components – maybe some of you remember the year of blue screens that Emulex caused for blade server customers running Windows Server 2012 R2 because of bad handling of VMQ in their converged NIC driver & firmware?

In fact, if you try out the software-defined features, Network Controller and Storage Spaces Direct (S2D), you will be told that you can’t try them out without opening a free support call to get a registry key – which someone will eventually share online. This is because those teams realize how dependent they are on hardware/driver/firmware quality and don’t want you judging their work by the problems of the hardware. The S2D team thinks the first wave of certified “WSSD” hardware will start arriving in January.

Note: VMware, etc, should be considered as hardware. Don’t go assuming that Windows Server 2019 is certified on it yet – wait for word from your hypervisor’s manufacturer.

Why would Microsoft do this? They want to get their software into application developers’ hands as quickly as possible. Container images based on Windows Server will be smaller than ever before – but they’re probably on the semi-annual channel, so WS2019 doesn’t mean much to them. Really, this is for people running Windows Server in a cloud, to get them the best application platform there is. Don’t start the conspiracy theories – if Microsoft had done the above process, then none of us would be seeing any bits until maybe January! What they’ve effectively done is accelerate public availability while the Windows Server Catalog gets populated.

Have fun playing with the new bits, but be careful!

Microsoft Ignite 2018: Implement Cloud Backup & Disaster Recovery At Scale in Azure

Speakers: Trinadh Kotturu, Senthuran Sivananthan, & Rochak Mittal

Site Recovery At Scale

Senthuran Sivananthan


Real Solutions for Real Problems

Customer example: Finastra.

  1. BCP process: Define RPO/RTO. Document DR failover triggers and approvals.
  2. Access control: Assign clear roles and ownership. Leverage ASR built-in roles for RBAC. Use different Recovery Services vaults for different BUs/tenants – they deployed 1 RSV per app to do this.
  3. Plan your DR site: Leveraged region pairs – useful for matching GRS replication of storage. Site connectivity needs to be planned. Pick the primary/secondary regions to align service availability and quota availability – change the quotas now, not later when you invoke the BCP.
  4. Monitor: Monitor replication health. Track configuration changes in environment – might affect recovery plans or require replication changes.
  5. DR drills: Periodically do test failovers.

Journey to Scale

  • Automation: Do things at scale
  • Azure Policy: Ensure protection
  • Reporting: Holistic view and application breakdown
  • Pre- & Post- Scripts: Lower RTO as much as possible and eliminate human error

Demos – ASR

Rochak gives demos of recent features. Azure Policies are coming soon.


Will assess if VMs are being replicated or not and display non-compliance.

Expanding the monitoring solution.

Demo – Azure Backup & Azure Policy

Trinadh creates an Azure Policy and assigns it to a subscription. He picks the Azure Backup policy definition. He selects a resource group of the vault, selects the vault, and selects the backup policy from the vault. The result is that any VM within the scope of the policy will automatically be backed up to the selected RSV with the selected policy.

Azure Backup & Security

Supports Azure Disk Encryption. KEK and BEK are backed up automatically.

AES 256 protects the backup blobs.

Compliance

  • HIPAA
  • ISO
  • CSA
  • GDPR
  • PCI-DSS
  • Many more

Built-in Roles

Cumulative:

  • Backup Reader: view only
  • Backup Operator: enable backup & restore
  • Backup Contributor: policy management and delete/stop backup

Protect the Roles

PIM can be used to guard the roles – protect against rogue admins.

  • JIT access
  • MFA
  • Multi-user approval

Data Security

  • PIN protection for critical actions, e.g. delete
  • Alert: Notification on critical actions
  • Recovery: Data kept for 14 days after delete. Working on blob soft delete

Backup Center Demo

Being built at the moment. Starting with VMs now but will include all backup items eventually.


All RSVs in the tenant (doh!) managed in a central place.

Aimed at the large enterprise.

They also have Log Analytics monitoring if you like that sort of thing. I’m not a fan of LA – I much prefer Azure Monitor.

Reporting using Power BI

Trinadh demos a Power BI reporting solution that unifies backup data from multiple tenants into a single report.