Building Azure VM Images Using Packer & Azure Files

In this post, I will explain how I am using a freeware package called Packer to create SYSPREPed/generalised templates for Citrix Cloud / Windows Virtual Desktop (WVD) – including installing application/software packages from Azure Files.

My Requirement

Sometimes you need an image that you can quickly deploy. Maybe it’s for a scaled-out or highly-available VM-based application. Maybe it’s for a Citrix/Windows Virtual Desktop worker pool. You just need a golden image that you will update frequently (such as for Windows Updates) and be able to bring online quickly.

One approach is to deploy a Marketplace image into your application and then use some deployment engine to install the software. That might work in some scenarios, but not well (or at all) in WVD or Citrix Cloud scenarios.

A different, and more classic, approach is to build a golden image that has everything installed; the VM is then generalised to create an image file. That image file can be used to create new VMs – this is what Citrix Cloud requires.

Options

You can use classic OS deployment tools as a part of the solution. Some of us will find familiarity in these tools but:

  • Don’t waste your time with staff under the age of 40
  • These tools aren’t meant for the cloud – you’ll have to daisy chain lots of moving parts, and that means complex failure/troubleshooting.

Maybe you read about Azure Image Builder? Surely, using a native image building service is the way to go? Unfortunately: no. AIB is a preview, driven by scripting, and it fails by being too complex. But if you dig into AIB, you’ll learn that it is based on a tool called Packer.

Packer

Packer, a free tool from HashiCorp, the people behind Terraform, is a simple command line tool that will allow you to build VM images on a number of platforms, including Azure ARM. The process is simple:

  • You build a JSON file that describes the image building process.
  • You run packer.exe to ingest that JSON file and it builds the image for you on your platform of choice.

And that’s it! You can keep it simple and run Packer on a PC or a VM. You can go crazy and build a DevOps routine around Packer.

Terminology

There are some terms you will want to know:

  • Builders: These are the types of builds that Packer can do – the platforms that it can build on. Azure ARM is the one I have used, but there’s a more complex/faster Builder for Azure called chroot that uses an existing build VM to build directly into a managed disk. Azure ARM builds a temporary VM, configures the OS, generalises it, and converts it into an image.
  • Provisioners: These are steps in the build process that are used to customise your operating system. In the Windows world, you are going to use the PowerShell provisioner a lot. You’ll find other built in provisioners for Ansible, Puppet, Chef, Windows Restart and more.
  • Custom/Community Provisioners: You can build additional provisioners. There is even a community of provisioners.
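
To make the terms concrete, here is a minimal sketch of how a builder and a provisioner might sit together in one template. The field names follow the azure-arm builder as I understand it; the resource group, image name, region, VM size and Marketplace SKU are placeholder assumptions, and the environment variable names used for the service principal are just examples.

```json
{
  "_comment": "Illustrative sketch only: names, region, VM size and SKU are placeholders",
  "variables": {
    "client_id": "{{env `ARM_CLIENT_ID`}}",
    "client_secret": "{{env `ARM_CLIENT_SECRET`}}",
    "tenant_id": "{{env `ARM_TENANT_ID`}}",
    "subscription_id": "{{env `ARM_SUBSCRIPTION_ID`}}"
  },
  "builders": [
    {
      "type": "azure-arm",
      "client_id": "{{user `client_id`}}",
      "client_secret": "{{user `client_secret`}}",
      "tenant_id": "{{user `tenant_id`}}",
      "subscription_id": "{{user `subscription_id`}}",
      "managed_image_resource_group_name": "my-images-rg",
      "managed_image_name": "citrix-worker-golden",
      "os_type": "Windows",
      "image_publisher": "MicrosoftWindowsServer",
      "image_offer": "WindowsServer",
      "image_sku": "2019-Datacenter",
      "communicator": "winrm",
      "winrm_use_ssl": true,
      "winrm_insecure": true,
      "winrm_timeout": "5m",
      "winrm_username": "packer",
      "location": "West Europe",
      "vm_size": "Standard_D2s_v3"
    }
  ],
  "provisioners": [
    {
      "type": "powershell",
      "inline": [
        "Write-Output 'Customise the OS here'"
      ]
    }
  ]
}
```

Saved as, say, image.json, this could be checked with `packer validate image.json` and then built with `packer build image.json`.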

Accursed Examples

If you search for Windows Packer JSON Files, you are going to find the same file over and over. I did. Blog posts, powerpoints, training materials, community events – they all used the same example: Deploy Windows, install IIS, capture an image. Seriously, who is ever going to want an image that is that simple?

My Requirement

I wanted to build a golden image, a template, for a Citrix worker pool, running in Azure and managed by Citrix Cloud. The build needs to be monthly, receiving the latest Windows Updates and application upgrades. The solution should be independent of the network and not require any file servers.

Azure Files

The last point is easy to deal with: I put the application packages into Azure Files. Each installation is wrapped in a simple PowerShell script. That means I can enable a PowerShell provisioner to run multiple scripts:

    {
      "type": "powershell",
      "scripts": [
        "install-adobeReader.ps1",
        "install-office365ProPlus.ps1"
      ]
    },
This example requires that the two scripts listed in the array are in the same folder as packer.exe. Each script is run in turn, sequentially.
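
If your wrapper scripts need the storage account details to reach the share, one option (a sketch, not the only way) is to hand them over through the PowerShell provisioner's environment_vars setting, so that each script can map the share itself before it launches its installer. The variable names, the share name "packages" and the storage_key user variable are all hypothetical here.

```json
{
  "type": "powershell",
  "environment_vars": [
    "SHARE_ACCOUNT=myshare",
    "SHARE_NAME=packages",
    "SHARE_KEY={{user `storage_key`}}"
  ],
  "scripts": [
    "install-adobeReader.ps1",
    "install-office365ProPlus.ps1"
  ]
}
```

Each script could then read $env:SHARE_ACCOUNT, $env:SHARE_NAME and $env:SHARE_KEY rather than hard-coding the account name or key.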

Unverified Executables

But what if one of those scripts, like Office, wants to run a .exe file from Azure Files? You will find that the script will stall while a dialog “appears” (to no one) on the build VM stating that “we can’t verify this file” and waits for a human (who will never see the dialog) to confirm execution. One might think “run Unblock-File” but that will not work with Azure Files. We need to update HKEY_CURRENT_USER (which will be erased by SYSPREP) to trust EXE files from the FQDN of the Azure Files share. There are two steps to this, which we solve by running another PowerShell provisioner:
    {
      "type": "powershell",
      "scripts": [
        "permit-drive.ps1"
      ]
    },
That script will run two pieces of code. The first will add the FQDN of the Azure Files share to Trusted Sites in Internet Options:

set-location "HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ZoneMap\Domains"
new-item "windows.net"
set-location "windows.net"
new-item "myshare.file.core"
set-location "myshare.file.core"
new-itemproperty . -Name https -Value 2 -Type DWORD

The second piece of code will trust .EXE files:

set-location "HKCU:\Software\Microsoft\Windows\CurrentVersion\Policies"
new-item "Associations"
set-location "Associations"
new-itemproperty . -Name LowRiskFileTypes -Value '.exe' -Type STRING

SYSPREP Stalls

This one wrecked my head. I used an inline PowerShell provisioner to add Windows roles & features:

    {
      "type": "powershell",
      "inline": [
        "while ((Get-Service RdAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "while ((Get-Service WindowsAzureGuestAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
        "Install-WindowsFeature -Name Server-Media-Foundation,Remote-Assistance,RDS-RD-Server -IncludeAllSubFeature"
      ]
    },
But then the Sysprep task at the end of the JSON file stalled. Later I realised that I should have done a reboot after adding my roles/features. And for good measure, I also put one in before the Sysprep:
    {
      "type": "windows-restart"
    },
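
Put together, the tail end of the provisioners array might look something like the sketch below. The Sysprep step shown is the one commonly published in Azure Packer samples, not necessarily identical to mine, so treat it as an assumption and check it against your OS version. The install script in the middle simply stands in for whatever application steps you run.

```json
{
  "provisioners": [
    {
      "type": "powershell",
      "inline": [
        "Install-WindowsFeature -Name Server-Media-Foundation,Remote-Assistance,RDS-RD-Server -IncludeAllSubFeature"
      ]
    },
    {
      "type": "windows-restart"
    },
    {
      "type": "powershell",
      "scripts": [
        "install-office365ProPlus.ps1"
      ]
    },
    {
      "type": "windows-restart"
    },
    {
      "type": "powershell",
      "inline": [
        "& $env:SystemRoot\\System32\\Sysprep\\Sysprep.exe /oobe /generalize /quiet /quit",
        "while ($true) { $state = (Get-ItemProperty HKLM:\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Setup\\State).ImageState; if ($state -ne 'IMAGE_STATE_GENERALIZE_RESEAL_TO_OOBE') { Write-Output $state; Start-Sleep -s 10 } else { break } }"
      ]
    }
  ]
}
```

The first restart clears any pending reboot left by the roles/features, and the second gives Sysprep a clean machine to generalise.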
You might want to run Windows Update – I’d recommend it at the start (to patch the OS) and at the end (to patch Microsoft software and catch any missing OS updates). Grab a copy of the community Windows-Update provisioner and place it in the same folder as Packer.exe. Then add this provisioner to your JSON – I like how you can prevent certain updates with the query:
    {
      "type": "windows-update",
      "search_criteria": "IsInstalled=0",
      "filters": [
        "exclude:$_.Title -like '*Preview*'",
        "include:$true"
      ]
    },
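
As a rough sketch of the "start and end" ordering, the update provisioner could simply appear twice, either side of the application installs. This assumes the community windows-update provisioner is installed and that its default search criteria and filters are acceptable; add search_criteria and filters to each entry if you want to control them.

```json
{
  "provisioners": [
    {
      "type": "windows-update"
    },
    {
      "type": "powershell",
      "scripts": [
        "install-adobeReader.ps1",
        "install-office365ProPlus.ps1"
      ]
    },
    {
      "type": "windows-update"
    }
  ]
}
```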

Summary

What I like about Packer is that it is simple. You don’t need to be a genius to make it work. What I don’t like is the lack of original documentation. That means there can be a learning curve to getting started. But once you are working, the tool is simple and extensible.

Azure Bastion For Secure SSH/RDP in Preview

Microsoft has announced a new preview of a platform-based jumpbox called Azure Bastion for providing secure RDP or SSH connections to virtual machines running or hosted in Azure.

Secure Remote Connections

Most people that are using The Cloud are using virtual machines, and one of the great challenges for them is secure remote access. You need RDP or SSH to be able to run these machines in the real world.

Remember: for 99.9% of customers, servers are not cattle, they are sacred cows.

Just opening up RDP or SSH straight through a public IP address is bad – hopefully you have an NSG in place, but even that’s bad. If you enable Standard Tier Security Center, the alerts will let you know how bad pretty quickly. And if the recent scare about the RDP vulnerability didn’t wake you up to this, then maybe you deserve to have someone else’s bot farm or a bitcoin mine running in your network.

There are ways that you can secure things, but they all have their pluses and minuses.

VPN

The real reason that we have point-to-site VPN in the Azure virtual network gateway is to provide an admin entry point to the virtual network.

The clue is in the maximum number of simultaneous connections which is 128, way too low to consider as an end user solution for a Fortune 1000, who Microsoft really do their planning for.

If you have supported end user VPN then you know that it’s right up there with password resets for helpdesk ticket numbers, even with IT people like developers. Don’t go here – it won’t end well.

Just-in-Time VM Access

JIT VM Access is a feature of Security Center Standard Tier. It modifies your NSG rules to deny managed protocols such as RDP/SSH (the deny rules are stupidly made as low priority so they don’t override any allow rules!).

When you need to remote onto a VM, an NSG rule is added for a managed amount of time to allow remote access via the selected protocol from a specific source IP address.
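
For illustration, a temporary allow rule of that kind might look roughly like this as an NSG security rule in ARM template form. The rule name, priority and both IP addresses are made-up examples, not what Security Center actually generates.

```json
{
  "name": "Allow-RDP-From-Admin-Temp",
  "properties": {
    "protocol": "Tcp",
    "sourcePortRange": "*",
    "destinationPortRange": "3389",
    "sourceAddressPrefix": "203.0.113.10",
    "destinationAddressPrefix": "10.0.1.4",
    "access": "Allow",
    "priority": 100,
    "direction": "Inbound"
  }
}
```

The important parts are the single source address and the single destination port: the rule opens RDP for one admin location only, and it is removed again when the approved time window ends.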

So, if it’s all set up right, you deny remote access to virtual machines most of the time. But you will open direct access. And the way JIT VM Access manages the rules now is wonky, so I would not trust it.

An RDP Jumpbox

This is an old method – a single virtual machine, or maybe a few of them, are made available for direct access. They are isolated into a dedicated subnet. You remote into a jumpbox, and from there, you remote into one of your application/data virtual machines.

Unfortunately, it’s still straight RDP/SSH into a machine that is directly accessible on the Internet. So in the remoting protocol vulnerability scenario, you are still vulnerable at the application layer. You could combine JIT VM Access, but now normal daily operations are going to be a drag and I guarantee you that people will invest time to undermine network security. Also, you are limited to 2 RDS connections per jumpbox without investing in a larger RDS (machines + licensing) solution.

Guacamole

This one is relatively new to me. At first it looked awesome. It’s an HTTPS-based service that allows you to proxy into Linux or Windows virtual machines via RDP or SSH.

All looked good until you started running Windows Server 2016 or later in your virtual machines and you needed NLA for secure connections via RDP. Then it all fell apart. The solution requires you to either disable NLA in the guest OS (boo!) or to hard code a username/password with local logon rights for your guest OSs into the Guacamole server (double-boo!).

Azure Bastion

In case you don’t know this, a bastion host is another name for a jumpbox – an isolated machine that you bounce through. In this case, Bastion is a service that is accessible via the Azure Portal. You sign into the portal, click Connect and use the Bastion service to connect to a Linux or Windows virtual machine via SSH/RDP in the Portal. The virtual machine does not require a public IP address or a “NAT rule”, but it’s still SSH/RDP.

On the downside:

  • There’s no multi-factor authentication (MFA)
  • It requires that you sign into the Azure Portal – many people running in the guest OS might not even have those rights!
  • VNet peering is not supported – so larger enterprises are ruled out here … no one in their right mind will deploy 500 bastion hosts (one per VNet) in a large enterprise.

Microsoft did say that these things will be worked on, but when? After GA, which based on the time of year I guess will be just before/after Ignite in early November?

In my opinion, Bastion is the right idea, but more of the backlog should have been included in the minimum viable product.

A Gateway to a Better Solution

If you are a Citrix or an RDS person then you’ve been screaming for the last 5 minutes, because you’ve been using something for years that most people still don’t know is possible. Both Citrix and RDS have the concept of an SSL gateway.

In the case of RDS, we can deploy one or more (load balanced) Windows Server virtual machines with the RDS Gateway role. If we combine that with NPS and Azure AD, we can also add MFA. With a simple tweak to the Remote Desktop Connection client (MSTSC.EXE), we can RDP to a Windows machine behind the RDS Gateway. The connection from the client to the gateway is pre-authenticated, X.509 certificate protected, HTTPS traffic encapsulating the RDP stream. That connection terminates at the RDS Gateway, which then forwards it as RDP to the desired Windows Server virtual machine behind it.

Unlike the previous jumpbox solution:

  • This can be a low-end machine, such as a B-Series.
  • It can scale out using a load balancer
  • Many people can relay through a single jumpbox machine.
  • You won’t need RDS licensing at all, not even to scale out to more than 2 users per gateway machine.

So – there’s no SSH here. So Linux is a problem.

Opinion

We don’t really have a complete solution right now. Azure Bastion probably will be the best one in the long-run, but it has so many missing features that I couldn’t consider it now. For Windows, an RDS Gateway is probably best, and for Linux, a Guacamole server might be best.

What do you think?