Understanding the Azure Image Builder Resources

In this post, I will explain the roles of and links/connections between the various resources used by Azure Image Builder.

Background

I enjoy the month of July. My customers, all in the Nordics, are off for the entire month and I am working. This year has been a crazy busy one so far, so there has been almost no time in the lab – noticeable I’m sure by my lack of writing. But this month, if all goes to plan, I will have plenty of time in the lab. As I type, a pipeline is deploying a very large lab for me. While that runs, I’ve been doing some hands on lab work.

Recently I helped develop and use an image building process, based on Packer, to regularly create images for a Citrix farm hosted in Microsoft Azure. It’s a pretty sweet solution that is driven from Azure DevOps and results in a very automated deployment that requires little work to update app versions or add/remove apps. At the time, I quickly evaluated Azure Image Builder (also based on Packer but still in Preview back then) but I thought it was too complicated and would still require the same pieces as our Packer solution. But I did decide to come back to Azure Image Builder when there was time (today) and have another look.

The first mission – figure out the resource complexity (compared to Packer by itself).

The Resources

I believe that one of Microsoft’s failings when documenting these services is their inability to explain the functions of the resources and how they work together. Working primarily in ARM templates, I get to see that stuff (a little). I’ve always felt that understanding the underlying system helps with understanding the solution – it was that way with Hyper-V and that continues with Azure.

Managed Identity – Microsoft.ManagedIdentity/userAssignedIdentities

A Managed Identity will be used by an Image Template to authorise Packer to run the imaging process that you are building. A custom role is assigned to this Managed Identity, granting Packer rights over the resource group that the Shared Image Gallery, Image Definition, and Image Template are stored in.
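If you deploy these pieces with an ARM template, the identity itself is a tiny resource; the interesting part is the role assignment that grants the custom role to it. Below is a minimal sketch – the identity name, API versions, and the custom role ID placeholder are my own illustrative assumptions, not values from a real deployment:

{
  "type": "Microsoft.ManagedIdentity/userAssignedIdentities",
  "apiVersion": "2018-11-30",
  "name": "aibIdentity",
  "location": "[resourceGroup().location]"
},
{
  "type": "Microsoft.Authorization/roleAssignments",
  "apiVersion": "2020-04-01-preview",
  "name": "[guid(resourceGroup().id, 'aibIdentity')]",
  "dependsOn": [
    "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'aibIdentity')]"
  ],
  "properties": {
    "roleDefinitionId": "<resource ID of your custom image-building role>",
    "principalId": "[reference(resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'aibIdentity'), '2018-11-30').principalId]",
    "principalType": "ServicePrincipal"
  }
}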

Shared Image Gallery – Microsoft.Compute/galleries

The Shared Image Gallery is the management resource for images. The only notable attribute in the deployment is the name of the resource which, sadly, is similar to things like Storage Accounts in lacking standardisation with the rest of Microsoft Azure resource naming.

Image Definition – Microsoft.Compute/galleries/images

The Image Definition documents your image as you would like to present it to your “customers”.

The Image Definition is associated with the Shared Image Gallery by naming. If your Shared Image Gallery was named “myGallery” then an image definition called “myImage” would actually be named as “myGallery/myImage”.

The properties document things including (see the sketch after this list):

  • VM generation
  • OS type
  • Generalised or not
  • How you will brand the images built from the Image Definition
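In ARM template form, an Image Definition looks something like the sketch below – the generation, OS settings, and the publisher/offer/SKU "branding" values are illustrative assumptions:

{
  "type": "Microsoft.Compute/galleries/images",
  "apiVersion": "2019-12-01",
  "name": "myGallery/myImage",
  "location": "westeurope",
  "properties": {
    "osType": "Windows",
    "osState": "Generalized",
    "hyperVGeneration": "V2",
    "identifier": {
      "publisher": "MyCompany",
      "offer": "CitrixWorker",
      "sku": "WS2019"
    }
  }
}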

Image Template – Microsoft.VirtualMachineImages/imageTemplates

This is where you will end up spending most of your time while operating the imaging process over time.

The Image Template describes to Packer (hidden by Azure) how it will build your image – a sketch of the full resource follows this list:

  • Identity: Points to the resource ID of the Managed Identity, permitting Packer to sign in as that identity (and receive its rights) when using this Image Template to build an Image Version.
  • Properties:
    • Source: The base image from the Azure Marketplace to start the build with.
    • Customize: The tasks that will be run to customise the image, including PowerShell scripts that can be downloaded – installing software, configuring the OS, patching, and rebooting.
    • Distribute: Here you associate the Image Template with an Image Definition, referencing the resource ID of the desired Image Definition. Every time you run this Image Template, a new Image Version of the Image Definition will be created.
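Put together, a simple Image Template resource looks roughly like this sketch – the base image, script URI, and resource ID placeholders are assumptions for illustration:

{
  "type": "Microsoft.VirtualMachineImages/imageTemplates",
  "apiVersion": "2020-02-14",
  "name": "myImageTemplate",
  "location": "westeurope",
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "<resource ID of the Managed Identity>": {}
    }
  },
  "properties": {
    "source": {
      "type": "PlatformImage",
      "publisher": "MicrosoftWindowsServer",
      "offer": "WindowsServer",
      "sku": "2019-Datacenter",
      "version": "latest"
    },
    "customize": [
      {
        "type": "PowerShell",
        "name": "InstallApps",
        "scriptUri": "<URI of a customisation script>"
      },
      {
        "type": "WindowsRestart"
      }
    ],
    "distribute": [
      {
        "type": "SharedImage",
        "galleryImageId": "<resource ID of the Image Definition>",
        "runOutputName": "myImageRun",
        "replicationRegions": [ "westeurope" ]
      }
    ]
  }
}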

Image Version – Microsoft.Compute/galleries/images/versions

An Image Version, a resource with a messy resource name that will break your naming standards, is created when you build from an Image Template. The name of the Image Version is based on the name of the Image Definition plus an incremental number. If my Image Definition is named “myGallery/myImage” then the Image Version will be named “myGallery/myImage/<unique number>”.

The properties of this resource include a publishing profile, documenting to what regions an image is replicated and how it is stored.
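You do not normally author an Image Version by hand – the Image Template build creates it – but for reference, the shape of the resource and its publishing profile is roughly as follows (a sketch; the version number and regions are assumptions):

{
  "type": "Microsoft.Compute/galleries/images/versions",
  "apiVersion": "2019-12-01",
  "name": "myGallery/myImage/0.24356.34217",
  "location": "westeurope",
  "properties": {
    "publishingProfile": {
      "replicaCount": 1,
      "excludeFromLatest": false,
      "storageAccountType": "Standard_LRS",
      "targetRegions": [
        { "name": "westeurope", "regionalReplicaCount": 1 },
        { "name": "northeurope" }
      ]
    }
  }
}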

What Is Not Covered

Packer will create a resource group and virtual machine (and associated resources) to build the new image. The way that the virtual machine is networked (public IP address by default) can normally be manipulated by the Image Template when using Packer.

Summary

There is a lot more here than with a simple run of Packer. But, Azure Image Builder provides a lot more functionality for making images available to “customers” across an enterprise-scale deployment; that’s really where all the complexity comes from and I guess “releasing” is something that Microsoft knows a lot about.


Building Azure VM Images Using Packer & Azure Files

In this post, I will explain how I am using a freeware package called Packer to create SYSPREPed/generalised templates for Citrix Cloud / Windows Virtual Desktop (WVD) – including installing application/software packages from Azure Files.

My Requirement

Sometimes you need an image that you can quickly deploy. Maybe it’s for a scaled-out or highly-available VM-based application. Maybe it’s for a Citrix/Windows Virtual Desktop worker pool. You just need a golden image that you will update frequently (such as for Windows Updates) and be able to bring online quickly.

One approach is to deploy a Marketplace image into your application and then use some deployment engine to install the software. That might work in some scenarios, but not well (or at all) in WVD or Citrix Cloud scenarios.

A different, and more classic, approach is to build a golden image that has everything installed; then the VM is generalised to create an image file. That image file can be used to create new VMs – this is what Citrix Cloud requires.

Options

You can use classic OS deployment tools as a part of the solution. Some of us will find familiarity in these tools but:

  • These tools are unfamiliar to most staff under the age of 40 – don't waste your time
  • These tools aren’t meant for the cloud – you’ll have to daisy chain lots of moving parts, and that means complex failure/troubleshooting.

Maybe you read about Azure Image Builder? Surely, using a native image building service is the way to go? Unfortunately: no. AIB is a preview, driven by scripting, and it fails by being too complex. But if you dig into AIB, you’ll learn that it is based on a tool called Packer.

Packer

Packer, a free tool from Hashicorp, the people behind Terraform, is a simple command line tool that will allow you to build VM images on a number of platforms, including Azure ARM. The process is simple:

  • You build a JSON file that describes the image building process.
  • You run packer.exe to ingest that JSON file and it builds the image for you on your platform of choice.

And that’s it! You can keep it simple and run Packer on a PC or a VM. You can go crazy and build a DevOps routine around Packer.
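To give a sense of what that JSON file looks like, here is a minimal sketch of an azure-arm build for a Windows image. The service principal values, resource group, image name, and VM size are placeholders/assumptions that you would replace with your own:

{
  "builders": [
    {
      "type": "azure-arm",
      "client_id": "<service principal application ID>",
      "client_secret": "<service principal secret>",
      "tenant_id": "<Azure AD tenant ID>",
      "subscription_id": "<subscription ID>",
      "managed_image_resource_group_name": "myImagesRG",
      "managed_image_name": "myCitrixImage",
      "os_type": "Windows",
      "image_publisher": "MicrosoftWindowsServer",
      "image_offer": "WindowsServer",
      "image_sku": "2019-Datacenter",
      "vm_size": "Standard_D4s_v3",
      "communicator": "winrm",
      "winrm_use_ssl": true,
      "winrm_insecure": true,
      "winrm_timeout": "5m",
      "winrm_username": "packer"
    }
  ],
  "provisioners": [
    {
      "type": "powershell",
      "inline": [ "Write-Output 'customisation steps go here'" ]
    }
  ]
}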

Terminology

There are some terms you will want to know:

  • Builders: These are the types of builds that Packer can do – the platforms that it can build on. Azure ARM is the one I have used, but there’s a more complex/faster Builder for Azure called chroot that uses an existing build VM to build directly into a managed disk. Azure ARM builds a temporary VM, configures the OS, generalises it, and converts it into an image.
  • Provisioners: These are steps in the build process that are used to customise your operating system. In the Windows world, you are going to use the PowerShell provisioner a lot. You’ll find other built-in provisioners for Ansible, Puppet, Chef, Windows Restart and more.
  • Custom/Community Provisioners: You can build additional provisioners, and there is a community of provisioners out there – the Windows Update provisioner used later in this post is one example.

Accursed Examples

If you search for Windows Packer JSON files, you are going to find the same file over and over. I did. Blog posts, PowerPoints, training materials, community events – they all used the same example: deploy Windows, install IIS, capture an image. Seriously, who is ever going to want an image that is that simple?

My Requirement

I wanted to build a golden image, a template, for a Citrix worker pool, running in Azure and managed by Citrix Cloud. The build needs to be monthly, receiving the latest Windows Updates and application upgrades. The solution should be independent of the network and not require any file servers.

Azure Files

The last point is easy to deal with: I put the application packages into Azure Files. Each installation is wrapped in a simple PowerShell script. That means I can enable a PowerShell provisioner to run multiple scripts:

    {
      "type": "powershell",
      "scripts": [
        "install-adobeReader.ps1",
        "install-office365ProPlus.ps1"
      ]
    },
This example requires that the two scripts listed in the array are in the same folder as packer.exe. Each script is run in turn.

Unverified Executables

But what if one of those scripts, like Office, wants to run a .exe file from Azure Files? You will find that the script will stall while a dialog "appears" (to no one) on the build VM stating that "we can't verify this file" and waits for a human (who will never see the dialog) to confirm execution. One might think "run Unblock-File", but that will not work with Azure Files. We need to update HKEY_CURRENT_USER (which will be erased by SYSPREP) to trust EXE files from the FQDN of the Azure Files share. There are two steps to this, which we solve by running another PowerShell provisioner:
    {
      "type": "powershell",
      "scripts": [
        "permit-drive.ps1"
      ]
    },
That script will run two pieces of code. The first will add the FQDN of the Azure Files share to Trusted Sites in Internet Options:

set-location "HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ZoneMap\Domains"
new-item "windows.net"
set-location "windows.net"
new-item "myshare.file.core"
set-location "myshare.file.core"
new-itemproperty . -Name https -Value 2 -Type DWORD

The second piece of code will trust .EXE files:

set-location "HKCU:\Software\Microsoft\Windows\CurrentVersion\Policies"
new-item "Associations"
set-location "Associations"
new-itemproperty . -Name LowRiskFileTypes -Value '.exe' -Type STRING

SYSPREP Stalls

This one wrecked my head. I used an inline PowerShell provisioner to add Windows roles & features:

    {
      "type": "powershell",
      "inline": [
        "while ((Get-Service RdAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "while ((Get-Service WindowsAzureGuestAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "Install-WindowsFeature -Name Server-Media-Foundation,Remote-Assistance,RDS-RD-Server -IncludeAllSubFeature"
      ]
    },
But then the Sysprep task at the end of the JSON file stalled. Later, I realised that I should have done a reboot after adding my roles/features. And for good measure, I also put one in before the Sysprep:
    {
      "type": "windows-restart"
    },
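For completeness, the generalisation step at the very end of the JSON file is normally the widely shared inline Sysprep provisioner from Packer's Azure examples. The sketch below is that standard block, not my exact file – it waits for the Azure agents, runs Sysprep, and then polls until the image state shows the machine is generalised:

    {
      "type": "powershell",
      "inline": [
        "while ((Get-Service RdAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "while ((Get-Service WindowsAzureGuestAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "& $env:SystemRoot\\System32\\Sysprep\\Sysprep.exe /oobe /generalize /quiet /quit",
        "while ($true) { $imageState = Get-ItemProperty HKLM:\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Setup\\State | Select ImageState; if ($imageState.ImageState -ne 'IMAGE_STATE_GENERALIZE_RESEAL_TO_OOBE') { Write-Output $imageState.ImageState; Start-Sleep -s 10 } else { break } }"
      ]
    }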
You might want to run Windows Update – I’d recommend it at the start (to patch the OS) and at the end (to patch Microsoft software and catch any missing OS updates). Grab a copy of the community Windows-Update provisioner and place it in the same folder as Packer.exe. Then add this provisioner to your JSON – I like how you can prevent certain updates with the query:
    {
      "type": "windows-update",
      "search_criteria": "IsInstalled=0",
      "filters": [
        "exclude:$_.Title -like '*Preview*'",
        "include:$true"
      ]
    },

Summary

What I like about Packer is that it is simple. You don’t need to be a genius to make it work. What I don’t like is the lack of original documentation and examples, which means there can be a learning curve when getting started. But once you are up and running, the tool is simple and extensible.

Monitoring & Alerting for Windows Defender in Azure VMs

In this post, I will explain how one can monitor Windows Defender and create incidents for it with Azure VMs.

Background

Windows Defender is built into Windows Server 2016 and Windows Server 2019. It’s free and pretty decent. But it surprises me how many of my customers (all of them) choose Defender over third-party products for their Azure VMs … with no coaching/encouragement from me or my colleagues. There is an integration with the control plane using the antimalware agent extension. But the level of management is somewhere between poor and none. There is a Log Analytics solution, but solutions are deprecated and, last time I checked, it required the workspace to be in per-node pricing mode. So I needed something different to operationalise Windows Defender with Azure VMs.

Data

At work, we always deploy the Log Analytics extension with all VMs – along with the antimalware extension and a bunch of others. We also enable data collection in Azure Security Center. We use a single Log Analytics workspace to enable the correlation of data and easy reporting/management.

I recently found out that a table in Log Analytics called ProtectionStatus contains a “heartbeat” record for Windows Defender. Approximately every hour, a record is stored in this table for every VM running Windows Defender. In there, you’ll find some columns such as:

  • DeviceName: The computer name
  • ThreatStatusRank: A code indicating the health of the device according to defender:
    • 150: Healthy
    • 470: Unknown (no extension/Defender)
    • 350: Quarantined malware
    • 550: Active malware
  • ThreatStatus: A description for the above code
  • ThreatStatusDetails: A longer description
  • And more …

So you can see that you can search this table for malware infection records. First thing, though, is to filter out the machines/records reporting that there is no Defender (Linux machines, for example):

let all_windows_vms = Heartbeat
| where TimeGenerated > now(-7d)
| where OSType == 'Windows'
| summarize makeset(Resource);
ProtectionStatus
| where Resource in (all_windows_vms)
| sort by TimeGenerated desc

The above will find all active Windows VMs that have been reporting to Log Analytics via the extension heartbeat, store their names in a set, and then search the ProtectionStatus table for records from those machines. Now we can extend that search, for example finding all machines in a non-healthy state (anything other than 150):

let all_windows_vms = Heartbeat
| where TimeGenerated > now(-7d)
| where OSType == 'Windows'
| summarize makeset(Resource);
ProtectionStatus
| where Resource in (all_windows_vms)
| where ThreatStatusRank <> 150
| sort by TimeGenerated desc

Testing

All the tech content here will be useless without data. So you’ll need some data! Search for the EICAR test string/file and start “infecting” machines – but be sure to let anyone who is monitoring the environment know first.

Security Center

Security Center will record incidents for you:

You will get email alerts if you have configured notifications in the subscription’s Security Center settings. Make sure the threshold is set to LOW.

If you want an alternative form of alert then you can use a Log Analytics alert (Scheduled Query Alert resource type) based on the below basic query:

SecurityAlert
| where TimeGenerated > now(-5m)
| where VendorName == 'Microsoft Antimalware'

The above query will search for Windows Defender alerts stored in Log Analytics (by Security Center) in the last 5 minutes. If the result count is greater than the threshold of 0 then you can trigger an Azure Monitor Action Group to notify whomever or start whatever task you want.
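If you deploy that alert as code, the resource type is Microsoft.Insights/scheduledQueryRules. The sketch below shows the general shape under the 2018-04-16 API – the rule name, workspace ID, and Action Group ID are placeholders/assumptions:

{
  "type": "Microsoft.Insights/scheduledQueryRules",
  "apiVersion": "2018-04-16",
  "name": "Defender-Malware-Alert",
  "location": "westeurope",
  "properties": {
    "enabled": "true",
    "source": {
      "query": "SecurityAlert | where TimeGenerated > now(-5m) | where VendorName == 'Microsoft Antimalware'",
      "dataSourceId": "<resource ID of the Log Analytics workspace>",
      "queryType": "ResultCount"
    },
    "schedule": {
      "frequencyInMinutes": 5,
      "timeWindowInMinutes": 5
    },
    "action": {
      "odata.type": "Microsoft.WindowsAzure.Management.Monitoring.Alerts.Models.Microsoft.AppInsights.Nexus.DataContracts.Resources.ScheduledQueryRules.AlertingAction",
      "severity": "1",
      "aznsAction": {
        "actionGroup": [ "<resource ID of the Action Group>" ]
      },
      "trigger": {
        "thresholdOperator": "GreaterThan",
        "threshold": 0
      }
    }
  }
}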

Workbooks

Armed with the ability to query the ProtectionStatus table, you can create your own visualisations for easy reporting on Windows Defender across many machines.


The pie chart is made using this query:

let all_windows_vms = Heartbeat
| where TimeGenerated > now(-7d)
| where OSType == 'Windows'
| summarize makeset(Resource);
ProtectionStatus
| where TimeGenerated > now(-7d)
| where Resource in (all_windows_vms)
| where ThreatStatusRank <> 150
| summarize count() by Threat

With some reading and practice, you can make a really fancy workbook.

Azure Sentinel

I have enabled the Entity Behavior preview.

Azure Sentinel is supposed to be the central place to monitor all security events, hunt for issues, and start investigations – the latter thanks to the new Entity Behavior feature. Azure Sentinel is powered by Log Analytics – if you have data in there then you can query that data, correlate it, and do some clever things.

We have a query that can search for malware incidents reported by Windows Defender. What we will do is create a new Analytic Rule that will run every 5 minutes using 5 minutes of data. If the results exceed 0 (threshold greater than 0) then we will create an incident.

let all_windows_vms = Heartbeat
| where TimeGenerated > now(-7d)
| where OSType == 'Windows'
| summarize makeset(Resource);
ProtectionStatus
| where TimeGenerated > now(-5m)
| where Resource in (all_windows_vms)
| where ThreatStatus <> 'No threats detected' or ThreatStatusRank <> 150 or Threat <> ''
| sort by Resource asc
| extend HostCustomEntity = Computer

The last line is used to identify an entity. Optionally, we can associate a Logic App for an automated response. Once that first malware detection is found, an incident is created in Azure Sentinel.

You can do the usual operational stuff with these incidents. Note that this data is recorded and your effectiveness as a security organisation is visible in the Security Efficiency Workbook in Azure Sentinel – even the watchers are watched! If you open an incident, you can click Investigate, which opens a new Investigation screen that leverages the Entity Behavior data. In my case, the computer is the entity.

The break-out dialogs allow me to query Log Analytics to learn more about the machine and its state at the time and the state of Windows Defender. For example, I can see who was logged into the machine at that time and what processes were running. Pretty nice, eh?

The Office – Construction Complete

The construction of Cloud Mechanix global HQ finished yesterday afternoon. The final piece to go in place was the step up to the door. You can really see the slope in the site in the below photo.


If you step in you can see the all-wood finish, with the 1st fitting electrics, ready for the final touches.

And you can see the view from one of the locations where one of the desks will be located.

We had some paving stones from the site clearance, so they are being repurposed to cross the lawn from the house to the front door. A swing set had been placed directly in front of the office – its replacement is going to the far end of the lawn. The grass underneath and the empty anchor spots are bald, so I’ll be re-sowing grass there in the coming days.

So what’s next? Paint and second-fitting electrics are next in the project plan. I will also be looking at how we can extend the house security systems to the office – a combination of a supplier-based monitored alarm system and the Ring cameras. That will allow us to insure the contents of the new office, and then … it’ll be time to move in.

The New Office – Day 3

It’s Monday and day 3 of the construction of Cloud Mechanix Global HQ. On Friday, the carpenters finished the steel roof and installed the wall studs for the cavity wall with insulation, as you can see in the photo.

One of the options we selected in the installation was to have a “plinth” and step installed. The price for this is always estimated on site – the site affects the levelling and height of the foundation, and the height determines the amount of wood and labour required. The quote I was given was €350. The team is planning to finish the interior and additional wood work today. Next up will be the painters and electrical second fitting. We also added the options for paint and painting in the purchase. I can paint, but I suck at edges and I’m slow; I’d rather let a professional do it, plus that means time I can spend doing other things.

Thoughts are now turning to the interior, which will be painted white. Floorboards are being installed and also being painted white. But I am considering installing additional flooring to protect the structural wood. If I do, it’ll be something dark, or even grey, to contrast somewhat with the white walls and ceiling. I already have a white L-shaped Ikea “Bekant” desk. I think I will stick with the Ikea desks – I do like them. The plan is that the L-shaped desks will be placed behind the windows at the front, giving us a nice view up the garden while working. We’ll see where the budget is in a week or so, but I’m partial towards the version of the Bekant that includes powered legs that can raise/lower the desk at the push of a button.

I have an office chair already with a white frame and black cushions. But it was a bad purchase – the piston lasted only a few months and it makes a cracking noise when I adjust my position – that was why I replaced the previous seat! I can see myself buying some Secret Lab chairs so comments on that are welcome – from people who have owned one for more than 1 year.

I have a white 2×4 Ikea “Kallax” unit in the small office in the house at the moment. On the plus side, it’s great for displaying certain things. On the negative side, the closed storage options are too small for the stuff you want to hide – I have 2 large plastic storage boxes hidden in the built-in wardrobe in the office at the moment, filled with things like spare peripherals, cables, battery packs, and so on. I think I want something that is closed storage at the bottom and open shelf storage at the top. I will keep looking.

My desk is quite full at the moment: 3 x USB drives (that I might replace with a NAS), a laptop, an Intel NUC, a KVM switch, a Surface dock, an Epson EcoTank printer, 2 x 27″ monitors, a mic, and so on. Some of it is stuff that you need around but don’t need beside you all the time. Because we have a nearly 6 x 4m space, we have room to add more desk space. So I am thinking of putting in 1, or even 2, additional straight desks at a later time. Things like the printer and networking gear can be relocated there.

Finally, there are times when you want to work, but also want to relax. Maybe you need to read something in a more relaxed position, or just take a break from the screen. I’m thinking that we’ll put a black leather couch in the middle of the office – I’m not sure where yet because I might also move the 43″ TV from the house office into there – nice for watching conferences, etc. The second-hand market is sometimes great for that kind of thing – my mother-in-law is quite skilled at buying/selling on those kinds of websites, so we might have to recruit her assistance. Depending on how that goes and how much space there is, we might add a small coffee table too.

The New Office – Construction Has Begun

It is day 2 of the construction of our new office. The carpenters arrived exactly at 9 am yesterday morning, bang-on time, to begin the assembly and construction of our new garden office – the global headquarters of Cloud Mechanix.

My only concern at the start was whether the site would be level enough for the build. The team warned us early on that the slope would make one side look higher up when it was level, but we knew that and were OK with it. They quickly assembled the wooden frame that would eventually sit on concrete blocks.

Once the 6x4m frame was built, I was called out to pick the final position – allowing just over half a metre from the concrete garden wall so I can paint the outside to maintain the integrity of the wooden construction. Not long after, the wooden frame was in position. Around lunchtime, the planks for the outside wall were being hammered into place. I’d say it took the 3 of them 45 minutes to get all the planks in place. Then the ceiling was put on.

The electricians were called in and asked to confirm how we wanted to do things after they surveyed the site. The one thing I was worried about was how the armoured electrical cable would run from the office to the fuse box at the front of the house – we need to power at least 2 computers + equipment and a 1.5 KW heater (please save comments on €13,000 heating/aircon systems – the office is costing that much!). The ethernet line and electrical cable are entering on the back-right and will run along the garden wall under the capping (which you cannot see in the photos here). Then they will run underground to the front of the house. The ethernet line will enter the house at the same location as the phone/Internet and be just a metre from the broadband router. The electrical cable will cross the front of the house hidden in white trunking (maybe 2 cm wide) attached under the white soffit of the house – nicely camouflaged. The electrical cable will enter above the front door (still camouflaged) and then be under 1 metre from the fusebox – trunking from the front door, along an interior white doorframe – still camouflaged.

The electricians did the first fitting: 5 x dual power sockets, light switch, and 6 x dropdown lights. When leaving, they let me know that they would be back once the interior was finished by the carpenters to do the second fittings – sockets, lights, and the aforementioned cable installations.

The carpenters carried on, adding insulation and sheeting to the roof before the end of the day.

As you can see, there are two big windows, right where the front desks will be – one for me and one for my wife. Cloud Mechanix will have a great view overlooking about 40m of the garden!

Day two (today) began with drizzle. The team (down to two) were bang-on time again. They told me that they expect to require 3 days for the complete assembly and finishing of the wooden structure. Today, the plan was to get the doors and windows done, the studs for the wall insulation & interior installed, and the roof started. As you can see, the doors and windows were in place at lunchtime.

Exterior trims have been added, the slats for the roof (steel roof) are in place, and I can hear steel being cut as I type – they are installing the steel sheet roof that has a black tile effect. Once that is done, the slow bit (as they referred to it) will begin: the interior finishes, the lower trim and the step.

The Office Installation Is About To Start

It’s a while (about 6 weeks) since I wrote that I am getting a new office built in the garden. Today seems like a good time to post an update.

There were 3 birch trees on the site where the office is being installed. I had the trees cut down (we have already replaced them with 4 apple trees and 2 more will be added next year) and that left the stumps. If I had been clever, I would have hired a mini-digger and dug around them. Maybe I would have also hired a stump grinder to chop up the revealed stumps. But no – instead we decided to dig them out by hand. Birch trees have interesting roots. The root ball is about 60-80 cm deep, and roots, sometimes 3x wider (or more) than the trunk, come out and up towards the surface. Those roots break out into a fine mesh that spreads out around the site, holding the soil firmly together, almost like concrete. I had a pick-axe, an axe, a shovel, a trowel, and my bare hands. The pick-axe was literally bouncing off the ground. The roots were like a phone book, and a sharp axe wasn’t cutting; it was breaking a layer or two and bouncing off too. I’ll keep this short – the final stump took over 8 hours to dig out by myself, and ended with me using my hands to scoop out dirt from under the root ball and roll the stump until the tap-root snapped.

Tip: hire the machinery, and don’t do it by hand.

The next thing to deal with was the base. Once the trees were removed, we realised that the site had a slant that we had never noticed before – about 27cm from left to right over 6m. I thought that would require a concrete base. So I put out some feelers to find a builder who would install the base. I got 4 names. Only 3 returned my contact attempts. Of the 3, only 2 bothered to show up – one of those was 2 days late! And of the 2, neither bothered to give me a quote. I guess that builders are busy enough, even with all the COVID disruption that’s been going on. I called the cabin builders to get their opinion – and they said that we didn’t need a base. Their method of foundation installation would handle the minor slope. Perfect!

The electrician was the next thing. After my last post, I had lots of people telling me to go spend another 13,000-20,000 Euros on different kinds of heating systems – me arse! We didn’t even spend that on the house! I got an electrician quote – a guy who came recommended by the cabin builders. I’d tried reaching out to one other electrician but he was as useless at responding as the builders. The recommended guy is booked and will be doing the first (armoured cable from the fusebox) and second fittings, as well as an Ethernet wire installation from our broadband router in the house to the office.

The last piece of the installation will be painting – that’s paid for with the installation and will be done afterward.

So when will all this happen? A lorry dropped off an “Ikea-like” load in our driveway a couple of hours ago. An installation team is due tomorrow at 9am to start the assembly. Hopefully we won’t have to wait too long for the painters and then we can get the alarm/security stuff installed, a new floor surface installed, and purchase some new office furniture.

Started Building A New Home Office

As I have posted before, I work from home. For a number of reasons, my wife and I have decided that I should move the office out of the house and into our garden. So that means that I need a new building in the garden. I spent months reading over the different options and providers. Eventually, I settled on one provider and option.

I’ve chosen to build a “log cabin”, actually a modern wooden building, in our back garden. I’ve chosen all the upgrades for thicker walls, wall/floor/roof insulation, upgraded roof, guttering & drainage, and so on, making it a warm place to work in the winter. If I had to describe it, I’d say it’s Nordic in design. The building is 6 x 4 metres. I went big because it gives me lots of space (better to have more than needed than regret a smaller choice later) and gives my wife the option of working from home too. We paid the deposit last week and are expecting delivery and installation in 5-7 weeks.

The first challenge was where to put the new office. Many of my American friends won’t think that a 6×4 metre building is big, or that space would be a problem. In Europe, that is big, and space is very much a problem. However, I’m lucky, because our back garden was an accident of poor planning by the builder and ended up being ~65 metres long. We were lucky when we bought the house 4 years ago because the previous owner possibly undervalued the site – he didn’t sell through an estate agent. We have lots of space, most of it hilly, but we have space. After some discussions, it was decided to put the office near the house at the bottom of the garden for 3 reasons:

  • Networking convenience
  • Power convenience
  • The view

The view question was interesting. From the rear of the house, the office will be mostly out of view, with the living parts of the house still looking out onto the back garden. The office will have windows and glass doors on the front, overlooking the garden. So depending on my seating angle, I will be looking out over the green view of the back garden. But there are a few issues.

The first was that there were three nearly-20-year-old birch trees in the exact corner that the office will be going into. I hated the thought of tearing down those trees. But the corner is not used – the trees create a haven for flies, so my kids won’t go in there to play, which was my vision for the area. On the other hand, the left side of our garden is lined with a variety of mature trees, we planted 4 more 18 months ago and they are doing well, and both sides are surrounded by trees outside our border wall. So we called up the local handyman who cut down the trees and removed the cuttings – he uses the wood after drying it out.

That was the easy bit! The tree stumps had to be removed next. You can’t just cut/grind a stump down and hope that’s that. Nature is tough. The roots would live on and a new tree could grow through the floor of the office. You have to cut all the roots from the trunk, get under it and lift it out. So out came shovels, an axe and a pickaxe. After 5 hours, 2 of the 3 stumps were removed on Saturday. That was back-breaking work. It turns out that the birch tree grows roots down deep. Those roots are thick fibrous branches that quickly break out into a mesh of fibres that spread out 3cm to 30cm deep, protecting the soil around the tree from tools such as shovels and pickaxes. You have to dig, tear and cut just to get through that first few cm of soil and then you face the roots that an axe will bend, not cut. And when you think you’ve cut the last root that is securing the trunk, you find that there are more. It’s Monday now and I’m facing 1 last stump to remove. It’s only in the last few hours that the stiffness from Saturday has set in – so today will be fun!

Cutting down the trees revealed something that we had not noticed. The site where the office will reside – about 7m wide – is not level. There is a ~30cm slope going from left to right, and a lesser uneven slope from front to back. The office must be installed onto a level site. I evaluated the options – I hoped that I could dig out one side and use the soil to level out the other side. But that would mean digging under the boundary wall and weakening the foundations. There is no option other than to build a concrete base or pad. At my wife’s suggestion, I went onto the local community page on Facebook and asked for builder recommendations. I reached out to 4 builders – 2 are coming today to give me a quote and one is to call me about making an appointment. I’ll need the pad installed ASAP – concrete sets quickly but takes weeks to harden.

And finally … there’s the electrical installation – something that I know that I cannot do. The cabin manufacturers recommended an electrician. 10 double sockets, lights, a 1.5KW storage heater and connection to the fuse box will clock in at a sizeable sum of change, plus VAT (sales tax). We’ll try to get some alternative quotes for that next.

Why A Bastion Host Is Necessary For Remote VM Administration (Including Azure)

This post will explain why you should use a “Bastion Host” or a “Jump Box” to securely remote into Linux (SSH) or Windows (Remote Desktop) virtual machines. And this advice also includes machines that you run in a cloud, such as Microsoft Azure.

For the Fundamentalists on Social Media

Some people are going to make some comments like:

“This is why you should use remote Bash|PowerShell scripting”

Or maybe:

“You should be using Windows Admin Center”.

Windows Admin Center – great! Genuinely. But it does not do everything.

There are still many times when you need to directly log into a machine and do something; that’s real life, and not some blogger’s lab life.

Security Center JIT VM Access?

I was a fan of this feature. That was until they changed how the allow (RDP, SSH, etc.) rules were added to an NSG. In my work, every subnet is micro-segmented. That means that the last user-defined NSG rule is Deny All from * to *. Since JIT VM Access was changed, it moves the last rule (if necessary) and puts the allow-RDP or allow-SSH (or whatever) rule after the DenyAll rule, which is useless. Feedback on this has been ignored.

Why Are SSH and RDP Insecure?

I can’t comment too much on SSH because I’m allergic to penguins. But I can comment on RDP. Over the last few months, I can think of 3 security alerts that have been released about pre-authentication vulnerabilities that have been found in Remote Desktop. What does that mean?

Let’s say that you have a PC on your WAN that is infected by malware that leverages a known or zero-day pre-authentication remote desktop vulnerability. If that PC has the ability to communicate with a remote VM, such as an Azure Windows/Linux VM, via SSH or RDP then that remote machine is vulnerable to a pre-authentication attack. That means that if malware gets onto your network, and that malware scans the network for open TCP 22 or TCP 3389 ports, it will attempt to use the vulnerability to compromise the remote VM. It does not require the user of the PC to SSH or RDP into the remote VM, or to even have any guest OS access! You can put a firewall in front of the remote virtual machines, but it will do no good; it’s still allowing TCP 3389 or TCP 22 directly into the virtual machines and all it will offer is logging of the attack.

A Bastion Host

You might have heard the term “bastion” in the Azure world recently. However, the terms Bastion Host or Jump Box are far from new. They’re an old concept that allows you to isolate valuable machines and services behind a firewall but still have a way to remote into them.

The valuable remote virtual machines are placed behind a firewall. In Azure, that could be a firewall appliance, such as Azure Firewall, and/or Network Security Groups. Now to connect to the remote VMs, you must first remote into the Bastion Host. And from that machine, you will remote further into the network through the isolation of the firewall/NSGs.

But that’s still not perfect, is it? If we do simple SSH or RDP to the Bastion Host, then it is vulnerable to pre-authentication attacks. And that means once that machine is compromised, it can attack further into the remote network. What we need is some kind of transformation.

Remote Desktop Gateway

My preferred solution is to deploy a Remote Desktop Gateway (RDGW) as the bastion host – this does not require RDP licensing for administrative access to the remote virtual machines! The Bastion Host is deployed as one virtual machine or 2+ load-balanced virtual machines that allow in HTTPS connections via firewall/NSG rules. When an administrator/developer/operator needs to log into a remote VM, their Remote Desktop client is configured to connect to this gateway using HTTPS instead of RDP. Once the connection is authenticated by the RDGW, it reverse proxies the connection through to the desired virtual machine, further protected by firewall/NSG rules. Now the malware that is on the WAN cannot probe any machines in the remote network; there is no opening across the network to TCP 3389 or TCP 22. Instead, the only port open for remote connections is HTTPS which requires authentication. And internally, that transforms to connections from the RDGW to the remote VMs via TCP 3389.
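In Azure terms, the NSG on the Bastion Host’s subnet then needs just one inbound allow rule, something like the sketch below (the prefixes and priority are assumptions for illustration); everything else, including TCP 3389 and TCP 22 from the WAN, stays blocked by the Deny All rule:

{
  "name": "Allow-HTTPS-To-RDGW",
  "properties": {
    "priority": 100,
    "direction": "Inbound",
    "access": "Allow",
    "protocol": "Tcp",
    "sourceAddressPrefix": "<on-premises WAN prefix>",
    "sourcePortRange": "*",
    "destinationAddressPrefix": "<RD Gateway subnet prefix>",
    "destinationPortRange": "443"
  }
}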

Some sharp-eyed observers might notice that the recently announced CVE-2020-0609  is a pre-authentication attack on RDGW! Yes, unpatched RDGW deployments are vulnerable, but they are smaller in number and easier to manage patches for than a larger number of other machines. Best practice for any secure network is to limit all external ports. Transforming the protocol in some way, like an RDGW, further reduces the threat of that single opening to a single service that forwards the connection.

If you want to add bells and whistles, you can deploy Network Policy Server(s) to centrally manage RDGW policy and even add multi-factor authentication (MFA) via Azure AD.

This is great for Windows, but what about Linux? I’m told that Guacamole does a nice job there. However, Guacamole is not suitable for recent releases of Windows because of how it must have hardcoded admin credentials for Network Layer Authentication (NLA).

Azure Bastion

Azure Bastion made lots of noise in IT news sites, and on blogs and social media when it went into preview last year, and eventually it went GA at Ignite in November of last year. Azure Bastion is a platform-based RDGW. Today (January 2020), I find it way too limited to use in anything but the simplest of Azure deployments:

  • The remote desktop authentication/connection are both driven via the Azure Portal, which assumes that the person connecting into the guest OS even has rights to the Azure resources.
  • It does not support desktop Remote Desktop/SSH clients.
  • It does not offer MFA support for the guest OS login, only for the Azure Portal login (see above).
  • VNet peering is not supported, limiting Azure Bastion to pretty simple Virtual Network designs.

If Azure Bastion adds VNet peering support, it will become usable for many more customers. If it understands that guest OS logins and Azure resource rights/Azure Portal logins can belong to different people, then it will be ready for mid-large enterprises.


The Worst IT Project I Was Ever A Part Of

This post will discuss a failed project that I was brought into and what I observed and learned from that project. It’s a real scenario that happened years ago, involving an on-premises deployment.

Too Many Chefs Spoil the Broth

Back in 2010, I joined a Dublin-based services company. The stated intention from the MD was that I was to lead a new Microsoft infrastructure consulting team. As it turned out, not a single manager or salesperson in the company believed that there was any work out there in Microsoft infrastructure technology – really! – and it never really got off the ground. But I was brought into one customer, and this post is the story of that engagement.

It was a sunny, cold day when I drove out to the customer’s campus. They are a large state-owned … hmm … transport company. I had enough experience in IT to know that I was going to be dealing with strong personalities and opinions that were not necessarily based on fact. My brief was that I would be attending a meeting with all the participants of a failing Windows Server 2008 R2 Hyper-V and System Center 2008 R2 project. I came in and met a customer representative – the technical lead of the project. He immediately told me that I was to sit in the corner, observe, and not talk to any of the participants from the other services providers. Note that the last word is plural, very plural.

I sat at the far corner of a long boardroom table and in came everyone. There were the customer’s IT managers and tech staff, the storage manufacturer (HP – now HPE) and their partner, the networking manufacturer (Cisco) and their partner, a Microsoft Premier Field Engineer, the consultants that implemented Hyper-V, the consultants that implemented the System Center management of Hyper-V, and probably more. Before I continue, I think that the Hyper-V cluster was something like 6 nodes and maybe 50-100 VMs.

Quickly it became evident that this was the first time that any of the participants in the meeting had talked to each other. I should re-phrase that: this was the first time any of the participants in deploying the next-generation IT infrastructure for running the business had been allowed to talk to each other.

  • A new W2008 R2 Hyper-V cluster was built. Although the customer was adamant that this was not true (it was), a 2-site cluster was built as a single-site cluster. There was a high-latency link between the two sites, with no control of VM placement and no third-site witness.
  • HP P4000 “Lefthand” module-based iSCSI storage was used without any consideration of persistent iSCSI reservations – a common problem in the W2008 R2 era, where the volumes in the SAN would “disappear” from the cluster because the scale-out of NICs/CSVs/SAN nodes went beyond the limits of W2008 R2 – a result of poor understanding of storage performance and Hyper-V architecture.
  • I remember awful problems with backup. DPM was deployed by a consulting firm but configured by a local staff member. He had a nightmare with VSS providers (HP were awful at this) and backup job sizing. It was not helped by the fact that backup was an afterthought in Hyper-V back then – not really resolved until WS2012 when it became software-defined. This, combined with how the P4000 worked, the multi-site cluster that wasn’t, and Redirected IO, caused all sorts of fun.
  • VMs would disappear – yup the security officer insisted that AV was installed on each host and it scanned every folder, including the CSVs. They even resisted change when presented with the MS documentation on scan exceptions that must be configured on Windows Server roles/features, including Hyper-V.

These were just a few of the technical issues; there were many more – inconsistent or missing patching, NIC teaming issues, and so on. I even created a 2-hour presentation based on this project that I (unofficially) called “How to screw up a Hyper-V project”.

My role was to “observe” but I wanted this thing fixed, so I contributed. I remember I spent a lot of time with the MS PFE on the customer site. He was gathering logs on behalf of support and we shared notes. Together we identified many issues/solutions. I remember one day, the customer lead shouted at me and ordered me back to my desk. I was not there “to talk to people but to observe”. The fact that I was one of two people on site that could solve the issues was lost on him.

The customer’s idea of running a project was to divide it up into little boxes and keep everyone from talking to each other. Part of this was how they funded the project – once it went over a certain monetary level it had to be publicly tendered. They had their preferred vendors and they went with them, even if they were not the best people. This created islands of knowledge/expertise and a lack of a vision. The customer thought they could manage this, and they were wrong. Instead, each supplier/vendor did their own thing based on assumptions of what others were doing and based on incorrect information shared by the customer’s technical team. And it all blew up in the customer’s face.

In the end, I heard that the customer blamed the software, the implementors, and everyone else involved in the project but themselves. They scrapped the lot and went with VMware, allegedly.

Lessons Learned

I think that there were three major lessons to be learned from this project. I know that these lessons apply equally today, no matter what sort of IT project you are doing, including on-premises, hybrid, or pure cloud.

The Business

IT enables or breaks the business. That’s something that most boards/owners do not understand. They think of IT as the nerds playing Doom in a basement, with their flashing lights and whirring toys. Obviously, that’s a wrong opinion.

When IT works, it can make the business faster, more agile, and more competitive. New practices, be they operational or planning-related, can change IT, and I’ve even read how SCRUM/Agile concepts can be brought to business planning.

Any significant IT project that will impact the business must start with the business. Someone at the C-Level must own it, be invested in it, and provide the rails or mission statement that directs it. That oversight will force those involved in the project to operate correctly and give them guidance on how to best serve the business.

Architecture

Taking some large-impact IT project and treating it as a point solution will not work. For example, building an entirely new IT infrastructure without considering the impact of or the dependencies on networking is stupid! You cannot just hand off systems to different vendors and wish them bon voyage. There must be a unified vision. This technical vision starts with the previously mentioned business vision that guide-rails the technical design. All components that interconnect and have direct/indirect involvements must be designed as a whole.

Unified Delivery

The worst thing one can do is divvy up IT infrastructure to 5 or 6 vendors and say, you do that, and I will participate in a monthly meeting. That’s not IT! That’s bailing out on your responsibility! IT vendors can play a role, when chosen well. But they need a complete vision to do their job. And if they cannot get that from you, they must be allowed to help you build it. If your IT department’s role is to manage outsourcing contracts and nothing more, you have already failed the business and should just step aside.

A unified delivery must start with internal guidance, sharing the complete vision with all included parties, internal and external, as early as possible. Revealing significant change that you are working on with Vendor A 6 months into a project with Vendor B is a fail. Isolating each of the vendors is a fail. Not giving each vendor clear rules of engagement with orchestrated interaction is a fail. The delivery must be unified under the guidance of the architect who has a complete vision.

Bad IT Starts at The Top

In my years, I’ve done plenty of projects, reviewed many customers’ IT systems, and worked as a part of IT departments. Some of them were completely shocking. A common theme was the CIO/CTO: typically an accountant or finance officer who was handed the role of supervising IT because … well … it’s just IT and they have a budget to manage. Someone who doesn’t understand IT hires/keeps bad IT managers, and bad IT managers hire bad IT staff, make bad IT decisions, and run bad IT projects. As the saying goes, sh&t rolls downhill. When these bad projects are happening to you, and you run IT, then you must look in the mirror and stop pointing the finger elsewhere.

And before you say it, yes, there are crap consultants too 😊