How Much Memory Does My Hyper-V Host Require?

If you are trying to figure out how much RAM you have left for virtual machines, then this is the post for you.

When Microsoft launched Dynamic Memory with W2008 R2 SP1, we were introduced to the concept of a host reserve (nothing to do with the SCVMM concept of the same name); the hypervisor would keep a certain amount of memory for the Management OS, and everything else was fair game for the VMs. Back then, the host reserve was a configurable entry in the registry (MemoryReserve [DWORD] in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization). Things changed with WS2012, when we were told that Hyper-V would look after the reserve and that we should stay away from it. That means we don’t know how much memory is left for VMs; I could guess roughly, but I had no hard facts.

And then I saw a KB article from about a month ago that deals with a scenario where it appears that a host has free memory but VMs still cannot start.

There are two interesting pieces of information in that post. The first is how to check how much RAM is actually available for VMs. Do not use Task Manager or similar metrics. Instead, use PerfMon and check Hyper-V Dynamic Memory Balancer\Available Memory (instance: System Balancer). This metric shows how much memory is available for starting virtual machines.
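If you prefer PowerShell to the PerfMon UI, Get-Counter can read the same counter. A minimal sketch (the counter path and "System Balancer" instance name are taken from the KB article; run it on the host itself):

```powershell
# Read the Dynamic Memory Balancer's available memory (in MB).
$counter = '\Hyper-V Dynamic Memory Balancer(System Balancer)\Available Memory'
(Get-Counter -Counter $counter).CounterSamples |
    Select-Object -Property InstanceName, CookedValue
```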

The second fact is the size of the host reserve, which is based on the amount of physical RAM in the host. The following table is an approximation of the results of the algorithm:

[Image: table approximating the host reserve for various amounts of physical RAM]

Microsoft goes on to give an example. You have a host with 16 GB RAM:

  • The Management OS uses 2 GB.
  • The host reserve is up to 2.5 GB.
  • That leaves you with 11.5 GB RAM for VMs.

So think about it:

  1. You log into the host with 16 GB RAM, and fire up Task Manager.
  2. There you see maybe 13.5 GB RAM free.
  3. You create a VM with 13 GB RAM, but it won’t start, because the Management OS uses 2 GB and the host reserve is between 2 and 2.5 GB, leaving you with 11.5-12 GB RAM for VMs.
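The arithmetic is simple enough to sketch in PowerShell (the 2 GB and 2.5 GB figures are the approximations from Microsoft's example, not exact values):

```powershell
# Rough "RAM left for VMs" math from the 16 GB example.
$physicalGB   = 16
$managementGB = 2     # approximate Management OS usage
$reserveGB    = 2.5   # upper end of the host reserve for a 16 GB host
$physicalGB - $managementGB - $reserveGB   # 11.5 GB available for VMs
```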

Configuring Windows Server Containers To Use DHCP Instead Of NAT

Read on if you want to learn how to connect Windows Server containers to an external virtual switch so that you don’t use NAT, and the containers talk directly to the LAN via DHCP-assigned addresses. You’ll also see why a DHCP-enabled container fails to get an address and ends up with a 169.254.x.x APIPA IPv4 configuration.

If you use Microsoft’s setup scripts for Windows Server 2016 (WS2016) Technical Preview 3 (TPv3), the default configuration for container networking is that each VM host will have a virtual switch (in the VM), connected to the VM’s vNIC. The virtual switch works in NAT mode, and uses a private network range to dynamically address containers that connect to it. This setup requires each container to have NAT rules on the VM host so that external clients can connect to the services running in the containers. That … could be messy. On one hand, it allows for huge network scalability (with tens of thousands of possible ports per VM host); on the other, it could be a nightmare to orchestrate.

What if you wanted your containers to talk directly on the LAN? In other words: no NAT. Yes, your containers can do this, and it’s known as a DHCP configuration – your containers are stateless, so it’s pointless assigning them static IP addresses; instead, the containers get their addressing from DHCP services on the LAN.

Remember that there are two scripts that we can run to set up a VM host.

  • Method 1: You download New-ContainerHost.ps1 and run it. This downloads a bunch of stuff, creates a VM host, and then runs Install-ContainerHost.ps1. By default, this will configure the VM host with NAT networking.
  • Method 2: You create your own VM, download and run Install-ContainerHost.ps1. By default, you’ll get NAT networking.

But …

Install-ContainerHost.ps1 includes the option for a flag:

[Image: the parameters of Install-ContainerHost.ps1, including the -UseDHCP flag]

If you use method 2 then you could run Install-ContainerHost in the new VM host with the -UseDHCP flag set to $true; the behaviour of the script will change. By default it creates the VM host’s virtual switch in NAT mode. But enabling this flag creates an external virtual switch.
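For example (a sketch, assuming you've copied the script into the new VM host; the `-UseDHCP:$true` form works whether the parameter is declared as a switch or a boolean):

```powershell
# Run inside the VM host. -UseDHCP makes the script create an
# external virtual switch instead of a NAT-mode one.
.\Install-ContainerHost.ps1 -UseDHCP:$true
```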

In my lab, I like to create my VM hosts using New-ContainerHost because it’s very quick (thanks to the use of differencing disks) and automates the entire setup. But New-ContainerHost doesn’t include the option for UseDHCP. You could edit the call to Install-ContainerHost from within New-ContainerHost, but I do it another way.

Instead I edit Install-ContainerHost. One small change will do the trick. Not far from the top is where the parameters are set as script variables. Look for a line that reads:

$UseDHCP,

Modify this line so it reads:

$UseDHCP = $true,

[Image: the edited $UseDHCP parameter in Install-ContainerHost.ps1]

Now every time I either run Install-ContainerHost or New-ContainerHost I’ll get the DHCP networking configuration instead of NATing.

So try this to create/configure a VM host, create a container, use Enter-PSSession to connect to the container, run IPConfig and … voilà, you’ll have no DHCP address. Say what?

I was stumped. I tried it again. Nothing. I asked for help, and by the time I got home I had a tip from one of the folks in Redmond. It proved to be my “I’m a moron” moment of the day. If I’d thought about it: DHCP is all about broadcasts and MAC addresses. I have a single VLAN set up in the lab, so broadcasts weren’t the issue. What’s going on with MACs? A VM host has a MAC for itself. And then each container on the VM host that connects to the virtual switch has its own MAC address … but the network sees only one interface. Have you figured it out yet?

By default, Hyper-V has MAC spoofing disabled on every virtual NIC – a virtual NIC can only have 1 MAC address. What I needed to do was, at the host level, run the following to enable MAC spoofing on the VM host’s virtual NIC:

Get-VMNetworkAdapter -VMName containers3 | Set-VMNetworkAdapter -MacAddressSpoofing On

Now everything works!

Windows Server Containers – “Enter-PSSession : The term ‘Measure-Object’ Is Not Recognized”

If you’ve been working with Windows Server Containers in Windows Server 2016 (WS2016) Technical Preview 3 (TPv3) then you’ve probably experienced something like this:

  1. You create a new container
  2. Then start the container
  3. And try to create a PowerShell session into the container using Enter-PSSession

And then there’s lots of red on the screen:

enter-pssession : The term ‘Measure-Object’ is not recognized as the name of a cmdlet, function, script file, or
operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try
again.
At line:1 char:1
+ enter-pssession -ContainerId $container.ContainerId
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : ObjectNotFound: (Measure-Object:String) [Enter-PSSession], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException

[Image: the Enter-PSSession 'Measure-Object' error]

Strangely, I have not been able to recreate this with Invoke-Command, so it appears to be unique to how Enter-PSSession sets up a session in the container.

So how do you solve the issue? It’s simple – you rushed from starting your container to trying to log into it. Wait a few seconds and then try again.
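In a script, that simply means putting a delay between starting the container and connecting to it (a sketch; the 10-second delay is arbitrary, so adjust it to suit):

```powershell
Start-Container $Container
Start-Sleep -Seconds 10   # let the container finish booting
Enter-PSSession -ContainerId $Container.ContainerId -RunAsAdministrator
```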

Logging Into Windows Server Containers

How do you log into a container to install software? Ah … you don’t actually log into a container because a container is not a virtual machine. Confusing? Slightly!

What you actually do is remotely execute commands inside of a container; this is actually something like PowerShell Direct, a new feature in Windows Server 2016 (WS2016).

There are two ways to run commands inside of a container.

Which Container?

In Technical Preview 3 (TPv3), the methods we will use to execute commands inside of a container don’t use the name of the container; instead they use a unique container ID. This is because containers can have duplicate names – I really don’t like that!

So, if you want to know which container you’re targeting, then do something along the lines of the following examples to store the container ID. The first creates a new container and stores the resulting container’s metadata in a variable called $Container.

$Container = New-Container -Name TestContainer -ContainerImageName WindowsServerCore

Note that I didn’t connect this container to a virtual switch!

The following example retrieves a container, assuming that it has a unique name.

$Container = Get-Container TestContainer

Invoke-Command

If you want to fire a single command into a container then Invoke-Command is the cmdlet to use. This method sends a single instruction into a container. This can be a command or a script block. Here’s a script block example:

Invoke-Command -ContainerID $Container.ContainerId -RunAsAdministrator -ScriptBlock { New-Item -Path C:\RemoteTest -ItemType Directory }

Note how I’m using the ContainerID attribute of $Container to identify the container.

The nice thing about Invoke-Command is that it is not interactive; the command remotely runs the script block without an interactive login. That makes Invoke-Command perfect for scripting; you write a script that deploys a container, starts it, does some stuff inside of the container, and then configures networking in the VM host. Lots of nice automation, there!
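Sketched out, such an automation script might look like this (the container name and the script block's contents are placeholders):

```powershell
# Deploy, start, and configure a container with no interactive login.
$Container = New-Container -Name Web1 -ContainerImageName WindowsServerCore
Start-Container $Container
Start-Sleep -Seconds 10   # give the container time to boot

# Do some stuff inside the container.
Invoke-Command -ContainerId $Container.ContainerId -RunAsAdministrator -ScriptBlock {
    New-Item -Path C:\WebContent -ItemType Directory
}
```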

Enter-PSSession

If you want an interactive session with a container then Enter-PSSession is the way to go. Using this cmdlet you get a PowerShell session in the container where you can run commands and see the results. This is great for once-off stuff and troubleshooting, but it’s no good for automation/scripting.

Enter-PSSession -ContainerID $Container.ContainerId -RunAsAdministrator

Warning – In TPv3 we’ve seen that rushing into running this cmdlet after creating your new container can lead to an error. Wait a few seconds before trying to connect to the container.

No Network Required!

These methods use something like PowerShell Direct, a new feature in WS2016 – it’s actually PowerShell via a named pipe. The above example deliberately created a container that has no networking. I can still run commands inside of the container or get an interactive PowerShell session inside of the container without connectivity – I just need to be able to get onto the VM host.

Creating & Deploying Windows Server Containers Using NAT and PowerShell

This post will show you how to use PowerShell to deploy Windows Server Containers using Windows Server 2016 (WS2016) Technical Preview 3 (TPv3).

Note: I wanted to show you how to deploy IIS, but I found that IIS would only work on my first container, and fail on the others.

This example will deploy multiple containers running nginx web server on the same VM host. NAT will be used to network the containers using a private IP range on the VM host’s internal virtual switch.

Note: The VM host is created at this point, with a working NATing virtual switch that has an IP range of 192.168.250.0/24, with 192.168.250.1 assigned to the VM host.

Create the nginx Container Image

The beauty of containers is that you create a set of reusable container images that have a parent-child relationship. The images are stored in a flat file repository.

Note: In TPv3, the repository is local on the VM host. Microsoft will add a shared repository feature in later releases of WS2016.

Log into the VM host (which runs Server Core) and launch PowerShell:

PowerShell

In this example I will create a new container using the default WindowsServerCore container OS image. Note that I capture the instance of the new container in $Container; this allows me to easily reference the container and its attributes in later cmdlets:

$Container = New-Container -Name nginx -ContainerImageName WindowsServerCore -SwitchName "Virtual Switch"

The container is linked to the virtual switch in the VM host called “Virtual Switch”. This virtual switch is associated with the VM’s sole virtual NIC, and sharing is enabled to allow the VM to also have network connectivity. The switch is enabled for NATing, meaning that containers that connect to the switch will have an IP of 192.168.250.x (in my setup). More on this stuff later.

Start the new container:

Start-Container $Container

Wait 30 seconds for the container to boot up and then remote into it:

Enter-PSSession -ContainerId $Container.ContainerId -RunAsAdministrator

I would normally use IIS here, but I had trouble with IIS in Windows Server Containers (TPv3). So instead, I’m going to deploy nginx web server. Run the following to download the installer (zip file):

WGet -Uri 'http://nginx.org/download/nginx-1.9.3.zip' -OutFile "c:\nginx-1.9.3.zip"

The next command expands the zip file to c:\nginx-1.9.3\:

Expand-Archive -Path C:\nginx-1.9.3.zip -DestinationPath c:\ -Force

There isn’t really an installer. nginx exists as an executable that can be run, which you’ll see later. The service “install” is done, so now we’ll exit from the remote session:

Exit

We now have a golden container that we want to capture. To do this, we must first shut down the container:

Stop-Container $Container

Now we create a new reusable container image called nginx:

New-ContainerImage -Container $Container -Publisher AFinn -Name nginx -Version 1.0

The process only captures the differences between the original container (created from the WindowsServerCore container OS image) and where the machine is now. The new container image will be linked to the image that created the container. So, if I create a container image called nginx, it will have a parent of WindowsServerCore.

[Image: the nginx container image in the repository, with WindowsServerCore as its parent]

I’m done with the nginx container so I’ll remove it:

Remove-Container $Container -Force

Deploying A Service Using A Container

The beauty of containers is how quick it is to deploy a new service. We can deploy a new nginx web server by simply deploying a new container from the nginx container image. All dependencies, WindowsServerCore in this case, will also be automatically deployed in the container.

Actually, “deploy” is the wrong word. In fact, a link is created to the images in the repository, and only changes are saved with the container. So, if I add web content to a new nginx container, the container itself holds just that content; it uses the service data from the nginx container image in the repository, and the OS bits from the container OS image and the VM host.

Let’s deploy a new container with nginx. Once again I will store the resulting object in a variable for later use:

$Web2 = New-Container -Name Web2 -ContainerImageName nginx -SwitchName "Virtual Switch"

Then we start the container:

Start-Container $Web2

Wait 30 seconds before you attempt to remote into the container:

Enter-PSSession -ContainerId $Web2.ContainerId -RunAsAdministrator

Now I browse into the extracted nginx folder:

cd c:\nginx-1.9.3\

And then I start up the web service:

start nginx

Yes, I could have figured out how to autostart nginx in the original template container. Let’s move on …

I want to confirm that nginx is running, so I check what ports are listening using:

NetStat -AN

I then retrieve the IP of the container:

IPConfig

Remember that the container lives in the NAT network of the virtual switch. In my lab, the LAN is 172.16.0.0/16. My VM host has 192.168.250.0/24 configured (Install-ContainerHost.ps1) as the NAT range. In this case, the new container, Web2 has an IP of 192.168.250.2.

I then exit the remote session:

Exit

There are two steps left to allow HTTP traffic to the web service in the container. First, we need to create a NAT rule. The container communicates on the LAN via the IP of the VM host. We need a rule that says that any TCP traffic on a selected port (TCP 82 here) will be forwarded to TCP 80 of the container (192.168.250.2). Run this on the VM host:

Add-NetNatStaticMapping -NatName "ContainerNat" -Protocol TCP -ExternalIPAddress 0.0.0.0 -InternalIPAddress 192.168.250.2 -InternalPort 80 -ExternalPort 82

Finally, I need to create a firewall rule in the VM host to allow inbound TCP 82 traffic:

New-NetFirewallRule -Name "TCP82" -DisplayName "HTTP on TCP/82" -Protocol tcp -LocalPort 82 -Action Allow -Enabled True

Now if I open up a browser on the LAN, I should be able to browse to the web service in the container. My VM host has an IP of 172.16.250.27 so I browse to http://172.16.250.27:82/ and the default nginx page appears.

Deploy More of the Service

OK, we got one web server up. The beauty of containers is that you can quickly deploy lots of identical services. Let’s do that again. The next snippet of code will deploy an additional nginx container, start it, wait 30 seconds, and then log into it via session remoting:

$Web3 = New-Container -Name Web3 -ContainerImageName nginx -SwitchName "Virtual Switch"

Start-Container $Web3

Sleep 30

Enter-PSSession -ContainerId $Web3.ContainerId -RunAsAdministrator

I then start nginx, verify that it’s running, and get the NAT IP of the container (192.168.250.3).

cd c:\nginx-1.9.3\

start nginx

NetStat -AN

IPconfig

exit

Now I can create a NAT mapping for the container in the networking of the VM host. In this case we will forward traffic on TCP 83 to TCP 80 on 192.168.250.3 (the container):

Add-NetNatStaticMapping -NatName "ContainerNat" -Protocol TCP -ExternalIPAddress 0.0.0.0 -InternalIPAddress 192.168.250.3 -InternalPort 80 -ExternalPort 83

And then we open up a firewall rule on the VM host to allow inbound traffic on TCP 83:

New-NetFirewallRule -Name "TCP83" -DisplayName "HTTP on TCP/83" -Protocol tcp -LocalPort 83 -Action Allow -Enabled True

Now I can browse to identical but independent nginx web services at http://172.16.250.27:82/ and http://172.16.250.27:83/, all accomplished with very little work and a tiny footprint. One might be production. One might be test. I could fire up another for development. And there’s nothing stopping me firing up more to troubleshoot, branch code, test upgrades, and more, getting a quick and identical deployment every time that I can dump in seconds:

Remove-Container $Web2, $Web3

If you have apps that are suitable (stateless and no AD requirement) then containers could be very cool.

“Install-WindowsFeature : An unexpected error has occurred” Error When You Run Install-WindowsFeature In A Windows Server Container

This is one of those issues that makes me question a lot of the step-by-step blog posts on Windows Server Containers that are out there – plenty of people were quick to publish guides on containers without mentioning this issue, which I always encounter; I suspect there’s a lot of copy/pasting from Microsoft sites, with little actual testing, in the rush to be first to publish. It’s clear that many bloggers didn’t try to install anything in a container that required administrator rights, because UAC blocks those actions. In my case, it was installing IIS in a Windows Server 2016 (WS2016) Technical Preview 3 (TPv3) container.

In my lab, I created a new container and then logged in using the following (I had already populated $Container by returning the container object into the variable):

Enter-PSSession -ContainerId $Container.ContainerId -RunAsAdministrator

And then I tried to install some role/feature, such as IIS using Install-WindowsFeature:

Install-WindowsFeature -Name Web-Server

I logged in using -RunAsAdministrator, so I should have no issues with UAC, right? Wrong! The installation fails as follows:

Install-WindowsFeature : An unexpected error has occurred. The system cannot find the file specified.  Error: 0x80070002
+ CategoryInfo          : InvalidResult: (@{Vhd=; Credent…Name=localhost}:PSObject) [Install-WindowsFeature], Exception
+ FullyQualifiedErrorId : RegistryKey_OpenSubKey_Failed,Microsoft.Windows.ServerManager.Commands.AddWindowsFeatureCommand

[Image: the Install-WindowsFeature error]

What’s the solution? When you are remoted into the container you need to raise your administrator privileges to counter UAC. You can do this as follows, after you log into the container:

Start-Process Powershell.exe -Verb runAs

Run Install-WindowsFeature now and it will complete.

[Image: Install-WindowsFeature completing successfully]

Sorted!

Note: I have found in my testing that IIS behaves poorly in TPv3. This might be why Microsoft’s getting started guides on MSDN use nginx web server instead of IIS! I’ve confirmed that nginx works perfectly well.

Why Are My Windows Server Containers Not On The Network?

I find containers are easy to create, and it’s pretty simple to build a library of container images. At least, that’s what I found when I got to play with containers for the first time on a pre-built Windows Server 2016 (WS2016) Technical Preview 3 (TPv3) lab. But I started playing with containers in my own lab in the last few days, and I had some issues; the thing I had never done was create a host – a VM host to be precise (a Hyper-V VM that will host many containers) – by myself. In this post I’ll explain how, by default, my containers were not networked, and how I fixed it. This post was written for the TPv3 release, and Microsoft might fix things in later releases, but you might find some troubleshooting info of help here.

Some Theory

I guess that most people will deploy Windows Server containers in virtual machines. If you work in the Hyper-V world then you’ll use Hyper-V VMs. In this timeframe the documented process for creating a VM host is to download and run a script called New-ContainerHost.PS1. You can get that by running:

wget -uri https://aka.ms/newcontainerhost -OutFile New-ContainerHost.ps1

You’ll get a script that you download, and then you’re told – by Microsoft, and by every other blog that copied & pasted without testing – to run:

.\New-ContainerHost.ps1 -VmName <NewContainerHostVMName> -Password <NewContainerHostVMPassword>

What happens then?

  • A bunch of stuff is downloaded in a compressed file, including a 12 GB VHD called WindowsServer_en-us_TP3_Container_VHD.vhd.
  • The VHD is mounted and some files are dropped into it, including Install-ContainerHost.ps1
  • A new VM is created. The C: drive is a differencing VHD that uses the downloaded VHD as the parent
  • The VM is booted.
  • When the VM is running, Install-ContainerHost is run, and the environment is created in the VM.
  • Part of this is the creation of a virtual switch inside the VM. Here’s where things can go wrong by default.
  • The script completes and it’s time to get going.

What’s the VM switch inside a VM all about? It’s not just your regular old VM switch. It’s a NATing switch. The idea here is that containers that will run inside of the VM will operate on a private address space. The containers connect to the VM switch which provides the NATing functionality. The VM switch is connected to the vNIC in the VM. The guest OS of the VM is connected to the network via a regular old switch sharing process (a management OS vNIC in the guest OS).
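You can inspect this plumbing from inside the VM host's guest OS (a sketch using the in-box cmdlets; the exact names in the output depend on how the setup script ran):

```powershell
# Run inside the VM host.
Get-VMSwitch              # the NAT-mode virtual switch the script created
Get-NetNat                # the NAT object and its internal IP prefix
Get-NetNatStaticMapping   # any port-forwarding rules added for containers
```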

What Goes Wrong?

Let’s assume that you have read some blogs that were published very quickly on the topic of containers and you’ve copied and pasted the setup of a new VM host. I tried that. Let’s see what happened … there were two issues that left me with network-disconnected containers:

Disconnected VM NIC

Almost every example I saw of New-ContainerHost fails to include one necessary step: specify the name of a virtual switch on the host to connect the VM to. You can do this after the fact, but I prefer to connect the VM straight away. The following example adds a flag to specify which virtual switch to connect the VM to. I’ve also added a flag to skip the installation of Docker.

.\New-ContainerHost.ps1 -VmName <newVMName> -Password <NewVMPassword> -SkipDocker -SwitchName <PhysicalHostSwitch>

This issue is easy enough to diagnose – your VM’s guest OS can’t get a DHCP address so you connect the VM’s vNIC to the host’s virtual switch.

New-NetNat Fails

This is the sticky issue because it deals with new stuff. New-NetNat will create:

… a Network Address Translation (NAT) object that translates an internal network address to an external network address. NAT modifies IP address and port information in packet headers.

Fab! Except it kept failing in my lab with this error:

New-NetNat : No matching interface was found for prefix (null).

[Image: the New-NetNat error]

This wrecked my head. I was about to give up on Containers when it hit me. I’d already tried building my own VM, and I had downloaded and run a script called Install-ContainerHost in a VM to enable Containers. I logged into my VM and there I found Install-ContainerHost on the root of C:. I copied it from the VM (running Server Core) to another machine with a UI and edited it using ISE. I searched for 172.16.0.0 and found a bunch of parameter-related stuff. A variable called $NATSubnetPrefix was set to “172.16.0.0/12”.

There was the issue. My lab’s network address is 172.16.0.0/16; this wasn’t going to work. I needed a different range to use behind the NATing virtual switch in the container VM host. I edited the variable to define a network address for NATing of “192.168.250.0/24”:
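The edit itself is one line in the parameter block of Install-ContainerHost.ps1 (the values shown are from my lab; pick any range that doesn't overlap your LAN):

```powershell
# Before: $NATSubnetPrefix = "172.16.0.0/12"  (clashed with my 172.16.0.0/16 LAN)
$NATSubnetPrefix = "192.168.250.0/24"
```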

[Image: the edited $NATSubnetPrefix variable in Install-ContainerHost.ps1]


I removed the VM switch and then re-ran Install-ContainerHost in the VM. The script ran perfectly. Let’s say the VM had an address of 172.16.250.40. I logged in and created a new container (the container OS image is on the C: drive). I used Enter-PSSession to log into the container and saw that the container had an IP of 192.168.250.2. This was NATed via the virtual switch in the VM, which in turn is connected to the top-of-rack switch via the physical host’s virtual switch.

Sorted. At least, that was the fix for a broken new container VM host. How do I solve this long term?

I can tell you that mounting the downloaded WindowsServer_en-us_TP3_Container_VHD.vhd and editing Install-ContainerHost there won’t work. Microsoft appears to download it fresh every time into the differencing disk.

The solution is to get a copy of Install-ContainerHost.PS1 (from the VHD) and save it onto your host or an accessible network location. Then you run New-ContainerHost with -ScriptPath to specify your own copy of Install-ContainerHost. Here’s an example where I saved my edited copy (with the new NAT network address) of Install-ContainerHost on CSV1:

.\New-ContainerHost.ps1 -VmName NewContainerVM -Password P@ssw0rd -SkipDocker -SwitchName SetSwitch -ScriptPath "C:\ClusterStorage\CSV1\Install-ContainerHost.ps1"

That runs perfectly, and no hacks are required to get containers to talk on the network. I then successfully deployed IIS in a container, enabled a NATing rule, and verified that the new site was accessible on the LAN.

MS15-105 – Vulnerability in Windows Hyper-V Could Allow Security Feature Bypass

Microsoft released a security hotfix for Hyper-V last night. They describe it as:

This security update resolves a vulnerability in Microsoft Windows. The vulnerability could allow security feature bypass if an attacker runs a specially crafted application that could cause Windows Hyper-V to incorrectly apply access control list (ACL) configuration settings. Customers who have not enabled the Hyper-V role are not affected.

This security update is rated Important for all supported editions of Windows 8.1 for x64-based Systems, Windows Server 2012 R2, and Windows 10 for x64-based Systems. For more information, see the Affected Software section.

The security update addresses the vulnerability by correcting how Hyper-V applies ACL configuration settings. For more information about the vulnerability, see the Vulnerability Information section.

KB3091287 does not go into any more detail.

CVE-2015-2534 simply says:

Hyper-V in Microsoft Windows 8.1, Windows Server 2012 R2, and Windows 10 improperly processes ACL settings, which allows local users to bypass intended network-traffic restrictions via a crafted application, aka “Hyper-V Security Feature Bypass Vulnerability.”

Affected OSs are:

  • Windows 10
  • Windows 8.1
  • Windows Server 2012 R2

No Windows 8 or WS2012 – that makes me wonder if this is something to do with Extended Port ACLs.

Credit: Patrick Lownds (MVP) for tweeting the link.

ReFS Accelerated VHDX Operations

One of the interesting new features in Windows Server 2016 (WS2016) is ReFS Accelerated VHDX Operations (which also work with VHD). This feature is not ODX (VAAI for you VMware-bods), but it offers the same sort of benefits for VHD/X operations. In other words: faster creation and copying of VHDX files, particularly fixed VHDX files.

Reminder: while Microsoft continually tells us that dynamic VHD/Xs are just as fast as fixed VHD/X files, we know from experience that the fixed alternative gives better application performance. Even some of Microsoft’s product groups refuse to support dynamic VHD/X files. The benefit of dynamic disks is that they start out as a small file that is extended as required, while fixed VHDX files take up their full space immediately. The big problem with fixed VHD/X files is that they take an age to create or extend, because they must be zeroed out.

Those of you with a nice SAN have seen how ODX can speed up VHD/X operations, but the Microsoft world is moving (somewhat) to SMB 3.0 storage where there is no SAN for hardware offloading.

This is why Microsoft has added Accelerated VHDX Operations to ReFS. If you format your CSVs with ReFS then ReFS will speed up the creation and extension of the files for you. How much? Well this is why I built a test rig!

The back-end storage is a pair of physical servers that are SAS (6 Gb) connected to a shared DataON DNS-1640 JBOD with tiered storage (SSD and HDD); I built a WS2016 TPv3 Scale-Out File Server with 2 tiered virtual disks (64 KB interleave) using this gear. Each virtual disk is a CSV in the SOFS cluster. CSV1 is formatted with ReFS and CSV2 is formatted with NTFS, 64 KB allocation unit size on both. Each CSV has a file share, named after the CSV.

I had another WS2016 TPv3 physical server configured as a Hyper-V host. I used Switch Embedded Teaming to aggregate a pair of iWARP NICs (RDMA/SMB Direct, each offering 10 GbE connectivity to the SOFS) and created a pair of virtual NICs in the host for SMB Multichannel.

I ran a script on the host to create fixed VHDX files against each share on the SOFS, measuring the time required for each disk. The disks created are of the following sizes:

  • 1 GB
  • 10 GB
  • 100 GB
  • 500 GB
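The measurement script was essentially a loop like the following (a sketch with made-up share names; Measure-Command wraps New-VHD, which is where the zeroing cost shows up on NTFS):

```powershell
# Time fixed-VHDX creation against each SOFS share.
$shares = '\\SOFS1\CSV1', '\\SOFS1\CSV2'   # ReFS and NTFS shares
$sizes  = 1GB, 10GB, 100GB, 500GB

foreach ($share in $shares) {
    foreach ($size in $sizes) {
        $path = Join-Path $share "Test$($size / 1GB)GB.vhdx"
        $time = Measure-Command { New-VHD -Path $path -Fixed -SizeBytes $size }
        "{0} {1} GB: {2:N1} seconds" -f $share, ($size / 1GB), $time.TotalSeconds
        Remove-Item -Path $path
    }
}
```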

Using the share on the NTFS-formatted CSV, I had the following results:

[Image: fixed VHDX creation times on the NTFS-formatted CSV]

A 500 GB VHDX file, nothing that unusual for most of us, took 40 minutes to create. Imagine you work for an IT service provider (which could be a hosting company or an IT department) and the customer (which can be your employer) says that they need a VM with a 500 GB disk to deal with an opportunity or a growing database. Are you going to say “let me get back to you in an hour”? Hmm … an hour might sound good to some but for the customer it’s pretty rubbish.

Let’s change it up. The next results are from using the share on the ReFS volume:

[Image: fixed VHDX creation times on the ReFS-formatted CSV]

Whoah! Creating a 500 GB fixed VHDX now takes 13 seconds instead of 40 minutes. The CSVs are almost identical; the only difference is that one is formatted with ReFS (fast VHD/X operations) and the other is NTFS (unenhanced). Didier Van Hoye has also done some testing using direct CSV volumes (no SMB 3.0), comparing Compellent ODX and ReFS. What the heck is going on here?

The zeroing-out process that is done while creating a fixed VHDX has been converted into a metadata operation – this is how some SANs optimize the same process using ODX. So instead of writing zeroes out to the disk file, ReFS updates metadata which effectively says “nothing to see here” to anything (such as Hyper-V) that reads those parts of the VHD/X.

Accelerated VHDX Operations also works in other subtle ways. Merging a checkpoint is now done without moving data around on the disk – another metadata operation. This means that merges should be quicker and use fewer IOPS. This is nice because:

  • Production Checkpoints (on by default) will lead to more checkpoint usage in DevOps
  • Backup uses checkpoints and this will make backups less disruptive

Does this feature totally replace ODX? No, I don’t think it does. Didier’s testing proves that ReFS’s metadata operation is even faster than the incredible performance of ODX on a Compellent. But, the SAN offers more. ReFS is limited to operations inside a single volume. Say you want to move storage from one LUN to another? Or maybe you want to provision a new VM from a VMM library? ODX can help in those scenarios, but ReFS cannot. I cannot say yet if the two technologies will be compatible (and stable together) at the time of GA (I suspect that they will, but SAN OEMs will have the biggest impact here!) and offer the best of both worlds.

This stuff is cool and it works without configuration out of the box!

Starting Lab Work With WS2016 TPv3

You might have assumed that I’ve had Windows Server 2016 (WS2016) running in my lab since TPv1 was launched. Well, that would have been nice but although I spend more time in a lab than most, I didn’t have time/resources. All I had time to play with was a virtual S2D SOFS using VMs and VHDX files.

What resources I had needed to be allocated to WS2012 R2, because that’s what’s used by most people. For my writing on Petri.com, I’ve stayed mostly with WS2012 R2 because WS2016 is still too fluid.

My day job has been 95% Azure since January of last year so that’s consumed a lot of time. Any hybrid stuff I’ve been doing has required a GA OS so that’s why I’ve had so much WS2012 R2.

But in the last few weeks (sandwiching some vacation time) I’ve been deploying WS2016 in the lab. Right now I have:

  • Some VMs running Windows 10 with RSAT and a WS2016 DC
  • A SOFS running WS2016 with a DataON DNS-1640
  • A pair of Hyper-V hosts using the SOFS (SMB 3.x) and StarWind (iSCSI) for storage

There’s plenty of fun stuff to start looking at. Things I want to play with are Network Controller and Containers. I’ve already had a play with Switch Embedded Teaming – it’s pretty easy to set up. More to come!
