This post is my set of notes from the shielded VMs session recording (original here) from Microsoft Ignite 2016. The presenters were:
- Dean Wells, Principal Program Manager, Microsoft
- Terry Storey, Enterprise Technologist, Dell
- Kenny Lowe, Head of Emerging Technologies, Brightsolid
This is a “how to” presentation, apparently. It actually turned out to be high level information, instead of a Level 300 session, with about 30 minutes of advertising in it. There was some good information (some nice insider stuff by Dean), but it wasn’t a Level 300 or “how to” session.
What The Heck Is A Shielded VM?
A new tech to protect VMs from the infrastructure and administrators. Maybe there’s a rogue admin, or maybe an admin has had their credentials compromised by malware. And a rogue admin can easily copy/mount VM disks.
- Virtual TPM & BitLocker: The customer/tenant can encrypt the disks of a VM, and the key is secured in a virtual TPM. The host admin has no access/control. This prevents non-customers from mounting a VHD/X. Optionally, we can secure the VM RAM while running or migrating.
- Host Guardian Service: The HGS is a small dedicated cluster/domain that controls which hosts a VM can run on. A small subset of trusted admins run the HGS. This prevents anyone from trying to run a VM on a non-authorized host.
- Trusted architecture: The host architecture is secure and trusted. UEFI is required for secure boot.
Shielded VM Requirements
WS2016 Datacenter edition hosts only. A host must be trusted to get the OK from the HGS to start a shielded VM.
The Host Guardian Service (HGS)
An HA service that runs, ideally, in a 3-node cluster – this is not a solution for a small business! In production, this should use an HSM to store secrets. For PoC or demo/testing, you can run an “admin trusted” model without an HSM. The HGS gives keys to known/trusted/healthy hosts for starting shielded VMs.
Two Types of Shielding
- Shielded: Fully protected. The VM is a complete black box to the admin unless the tenant gives the admin guest credentials for remote desktop/SSH.
- Encryption Supported: Some level of protection – it does allow Hyper-V Console and PowerShell Direct.
- Deploy & manage the HGS and the solution using SCVMM 2016 – You can build/manage HGS using PowerShell. OpenStack supports shielded virtual machines.
- Azure Pack can be used.
- Active Directory is not required, but you can use it – required for some configurations.
Kenny (a customer) takes over. He talks for 10 minutes about his company. Terry (Dell) takes over – this is a 9 minute long Dell advert. Back to Kenny again.
Changes to Backup
The infrastructure admins cannot do guest-level backups – they can only back up whole VMs – and they cannot restore files from those backed-up VMs. If you need file/application level backup, then the tenant/customer needs to deploy backup in the guest OS. IMO, a secure cloud-based backup solution with cloud-based management would be ideal – this backup should be to another cloud because backing up to the local cloud makes no sense in this scenario where we don’t trust the local cloud admins.
This is a critical piece of infrastructure – Kenny runs it on a 4-node stretch cluster. If your hosting cloud grows, re-evaluate the scale of your HGS.
Dean kicks in here: There isn’t that much traffic going on, but that all depends on your host numbers:
- A host goes through attestation when it starts to verify health. That health certificate lasts for 8 hours.
- The host presents the health cert to the HGS when it needs a key to start a shielded VM.
- Live Migration will require the destination host to present its health cert to the HGS to get a key for an incoming shielded VM.
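To make that flow concrete, here is a toy Python model of the three steps Dean described. Everything in it is hypothetical (the `ToyHgs` class, the host names, the string "keys"); it is not the HGS API, just a sketch of the logic, assuming the 8-hour certificate lifetime quoted in the session.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

CERT_LIFETIME = timedelta(hours=8)  # lifetime quoted in the session


@dataclass(frozen=True)
class HealthCert:
    host: str
    expires: datetime


class ToyHgs:
    """Hypothetical stand-in for the HGS; not a real API."""

    def __init__(self, trusted_hosts):
        self.trusted_hosts = set(trusted_hosts)

    def attest(self, host, now):
        # Attestation: only known/trusted hosts receive a health certificate.
        if host not in self.trusted_hosts:
            return None
        return HealthCert(host=host, expires=now + CERT_LIFETIME)

    def release_key(self, cert, vm, now):
        # Key release: the host must present an unexpired health certificate.
        if cert is None or cert.host not in self.trusted_hosts:
            return None
        if now >= cert.expires:
            return None
        return f"key-for-{vm}"  # placeholder string, not a real key


def demo():
    hgs = ToyHgs(trusted_hosts={"host-a", "host-b"})
    boot = datetime(2016, 9, 30, 8, 0)

    # Host A attests at start-up, then asks for a key to start vm1.
    cert_a = hgs.attest("host-a", boot)
    started = hgs.release_key(cert_a, "vm1", boot + timedelta(hours=1))

    # Live Migration: the destination host presents its own certificate.
    cert_b = hgs.attest("host-b", boot)
    migrated = hgs.release_key(cert_b, "vm1", boot + timedelta(hours=2))

    # After 8 hours the old certificate no longer gets a key.
    expired = hgs.release_key(cert_a, "vm1", boot + timedelta(hours=9))

    # An unknown host never gets a certificate in the first place.
    rogue = hgs.attest("host-x", boot)
    return started, migrated, expired, rogue
```

Calling `demo()` shows the point about traffic: the HGS is only contacted when a host attests (roughly every 8 hours in this model) and when a shielded VM starts or arrives via Live Migration.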
MSFT doesn’t have at-scale production numbers for HGS (few have deployed HGS in production at this time), but he thinks a 3-node cluster will struggle at scale. I guess 3 nodes is the minimum so that you still have HA during a maintenance cycle – this is critical infrastructure.
Back to Kenny. You can deploy the HGS into an existing domain or a new one. It needs to be a highly trusted and secured domain, with very little admin access. Best practice: you deploy the HGS into its own tiny forest, with very few admins. I like that Kenny did this on a stretch cluster – it’s a critical resource.
Get-HgsTrace is a handy cmdlet to run during deployment to help you troubleshoot the deployment.
Disable SMB1 in the HGS infrastructure.
Very good points here. The customer won’t understand the implications of the security you are giving them.
- BitLocker: They need to protect the key (cloud admin cannot) – consider MBAM.
- Backup: The cloud admin cannot/should not back up files/databases/etc from the guest OS. The customer should back up to somewhere else if they want this level of granularity.
The concept here is that you don’t throw away a “broken” fully shielded VM. Instead, you move the VM into another shielded VM (owned by the customer) that is running nested Hyper-V, reduce the shielding to encryption supported, console into the VM and do your work.
Dean: There are a series of scripts. The owner key of the VM (which only the customer has) is the only thing that can be used to reduce the shielding level of the VM. Otherwise, you download the shielding policy, use the key (on premises) to reduce the shielding, and upload/apply it to the VM.
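The rule Dean describes (only the owner key can reduce the shielding level) can be sketched in a few lines of Python. This is a toy model only: `ToyVm` and the byte-string "key" are hypothetical, and the real mechanism relies on certificates and the scripts Dean mentions rather than a shared secret.

```python
import hmac

SHIELDED = "shielded"
ENCRYPTION_SUPPORTED = "encryption-supported"


class ToyVm:
    """Hypothetical model of a shielded VM; not a Hyper-V API."""

    def __init__(self, owner_key: bytes):
        self._owner_key = owner_key  # held only by the customer
        self.level = SHIELDED

    def reduce_shielding(self, presented_key: bytes) -> bool:
        # Only the owner key may lower the shielding level.
        if not hmac.compare_digest(presented_key, self._owner_key):
            return False
        self.level = ENCRYPTION_SUPPORTED
        return True


vm = ToyVm(owner_key=b"tenant-secret")
admin_attempt = vm.reduce_shielding(b"admin-guess")      # rejected
tenant_attempt = vm.reduce_shielding(b"tenant-secret")   # accepted
```

In this sketch the cloud admin's attempt leaves the VM fully shielded, while the tenant's attempt drops it to encryption supported so that the console can be used for repair work.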
Dean: Microsoft is working on adding support for shielded VMs to Azure.
There’s a video to advertise Kenny’s company. Terry from Dell does another 10 minutes of advertising.
Back to Dean to summarize and wrap up.