Ignite 2015–What’s New in Windows Server Hyper-V

Speakers: Ben Armstrong & Sarah Cooley

This is a detailed look at what you can do with Hyper-V in the Windows Server 2016 TPv2 build, with 14 demos. It is not a complete overview of everything in the release – just what you can realistically do in labs with the current build. A lot of the features are also in Windows 10.

Nano Server

Cloud-first refactoring. Hyper-V and storage are the two key IaaS scenarios for Nano Server.

Containers

Hyper-V can be used to deploy containers. This is not covered in this session – there was another session by Taylor Brown on it. Not in this build – coming in the future.

Making Cloud Great

This is how the Hyper-V team thinks: everything from Azure, public, private and small “clouds”.

Virtual Machine Protection:

Trust in the cloud is biggest blocker to adoption. Want customers to know that their data is safe.

A virtual TPM can be injected into a VM. Now we can enable BitLocker in the VM and protect data from anyone outside of the VM. I can run a VM on someone else's infrastructure and they cannot see or use my data.

Secure Boot is enabled for Linux guests. The virtual firmware can verify that kernel-mode code is uncompromised. Secure Boot has been available for Windows guest OSs since WS2012 R2.

Shielded VMs

Virtual TPM is a part of this story. This is a System Center & Hyper-V orchestrated solution for highly secure VMs. Shielded VMs can only run in fabrics that are designated as owners of that VM.

Distributed Storage QoS

See my previous post.

Host Resource Protection

Dynamically detect VMs that are not “playing well” and reduce their resource allocation. Comes from Azure. Lots of people deploy VMs and do everything they can to break out and attack Azure. No one has ever broken out, but their attempts eat up a lot of resources. HRP detects “patterns of access”, e.g. loading kernel code that attacks the system, to reduce their resource usage. A status will appear to say that HRP has been enabled on this VM.

Storage and Cluster Resiliency

What happens when the network has a brief glitch between cluster nodes? This can cause more harm than good by failing over and booting up the VMs again – can take longer than waiting out the issue.

Virtual Machine Cluster Resiliency:

  • The cluster doesn't jump to failover after an immediate timeout.
  • The node goes into an isolated state and the VM goes unmonitored.
  • If the node returns in under 4 minutes (default), it rejoins and the VM goes back to the running state.
  • If a host is flapping, the host is put into quarantine. All VMs will be live migrated off of the node to prevent issues.
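These behaviours are tunable via cluster common properties in the preview builds. A hedged sketch (property names as seen in TPv2 and subject to change; run on a cluster node):

```powershell
# Inspect and adjust VM/cluster resiliency settings on a WS2016 cluster
$cluster = Get-Cluster
$cluster.ResiliencyDefaultPeriod   # seconds a VM runs unmonitored while the node is isolated (default 240 = 4 minutes)
$cluster.QuarantineDuration        # seconds a flapping node stays quarantined

# Example: keep the default 4-minute isolation window
$cluster.ResiliencyDefaultPeriod = 240
```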

Storage Resiliency:

  • If the storage disappears, the VM is paused ahead of a timeout to prevent a crash.
  • Once the storage system resumes, the VM un-pauses and I/O continues.

Shared VHDX

Makes it easy to do guest clustering. But WS2012 R2 is v1.0 tech – you can't use other virtualization features with it, e.g. backup, online resize.

In TPv2, starting to return features:

  • Host-based, no agent in the guest, backup of guest clusters with shared VHDX.
  • You will also be able to do online resizing of the shared VHDX.
  • The shared drive has its own hardware category when you Add Hardware in VM settings. The underlying mechanism is exactly the same; this just makes the feature more obvious.

VHDS is the new file extension for shared VHDX files.
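For reference, attaching a shared VHDX to guest-cluster nodes in WS2012 R2 uses the -SupportPersistentReservations switch. A hedged sketch (VM names and the CSV path are hypothetical):

```powershell
# Attach the same VHDX on a cluster shared volume to two guest-cluster nodes,
# marking it as shared (WS2012 R2 syntax)
Add-VMHardDiskDrive -VMName GuestNode1 -Path C:\ClusterStorage\Volume1\Shared.vhdx -SupportPersistentReservations
Add-VMHardDiskDrive -VMName GuestNode2 -Path C:\ClusterStorage\Volume1\Shared.vhdx -SupportPersistentReservations
```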

Hyper-V Replica & Hot-Add

By default, a newly added disk won't be replicated. Set-VMReplication -VMName VM01 -ReplicatedDisks (Get-VMHardDiskDrive -VMName VM01) will add a disk to the replica set.

Behind the scenes there is an initial copy happening for the new disk while replication continues for the original disks.
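The end-to-end flow might look like this (the VM name and disk path are hypothetical):

```powershell
# Hot-add a new data disk to a replicated, running VM
Add-VMHardDiskDrive -VMName VM01 -ControllerType SCSI -Path D:\VMs\VM01\Data2.vhdx

# Add the new disk to the replica set; the initial copy of the new disk
# runs in the background while the original disks keep replicating
Set-VMReplication -VMName VM01 -ReplicatedDisks (Get-VMHardDiskDrive -VMName VM01)
```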

Runtime Memory Resize

You can:

  • Resize the memory of a VM with static RAM while it is running.
  • See the memory demand of static RAM VMs – useful for deciding how to resize.
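A hedged sketch of both operations (the VM name is hypothetical; run on the host):

```powershell
# Resize static RAM on a running VM (works in the WS2016 preview)
Set-VMMemory -VMName VM01 -StartupBytes 8GB

# Check the memory demand the host reports, even for a static RAM VM
(Get-VM -Name VM01).MemoryDemand
```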

Hot Add/Remove Network Adapters

This can be done with Generation 2 VMs.
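A minimal sketch (the VM, switch and adapter names are hypothetical):

```powershell
# Hot-add a NIC to a running Generation 2 VM
Add-VMNetworkAdapter -VMName VM01 -SwitchName External -Name "App-NIC"

# ... and hot-remove it again, also while the VM is running
Remove-VMNetworkAdapter -VMName VM01 -Name "App-NIC"
```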

Rolling Cluster Upgrade

No need to build a new cluster to deploy a new OS. You rebuild one host at a time inside the cluster. VMs can fail over and live migrate throughout. You need WS2012 R2 to start off. Once done, you upgrade the version of the cluster to use new features. Until you do that, you can also roll back a cluster from WS2016 to WS2012 R2.
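After every node has been rebuilt, the final version bump can be sketched as follows (cmdlet name as it appears in the WS2016 preview):

```powershell
# Check the current cluster functional level before committing
Get-Cluster | Select-Object Name, ClusterFunctionalLevel

# One-way door: after this, rollback to WS2012 R2 is no longer possible
Update-ClusterFunctionalLevel
```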

New VM Upgrade Process

Previous versions of Hyper-V automatically upgraded a VM once it was running on a new version of Hyper-V. This has changed.

There is now a concept of a VM configuration version. It is not upgraded automatically – done manually. This is necessary to allow rollback from Cluster Rolling Upgrade.

Version 5.0 is the configuration version of WS2012 R2; earlier releases had their own configuration versions too. The configuration version was always there for internal usage, and was not displayed to users. In TPv2, new VMs are version 6.2.

A VM with v5.0 works with that host’s features. A v5.0 VM on WS2016 runs with compatibility for WS2012 R2 Hyper-V. No new features are supplied to that VM. Process for manually upgrading:

  1. Shutdown the VM
  2. Upgrade the VM config version via UI or PoSH
  3. Boot up again – now you get the v6.2 features.
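The steps above can be sketched in PowerShell (the VM name is hypothetical; Update-VMVersion is the cmdlet in current preview builds):

```powershell
# Check the configuration version (5.0 = WS2012 R2 compatibility mode)
Get-VM -Name VM01 | Select-Object Name, Version

# Manual, one-way upgrade to the host's latest configuration version
Stop-VM -Name VM01
Update-VMVersion -Name VM01
Start-VM -Name VM01
```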

Production Checkpoints

Uses VSS in the guest OS instead of saved state to create checkpoint. Restoring a production checkpoint is just like restoring a system backup. S/W inside of the guest OS, like Exchange or SQL Server, understand what to do when they are “restored from backup”, e.g. replay logs, etc.

Now this is a “supported in production” way to checkpoint production VMs that should reduce support calls.
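Assuming the TPv2 cmdlet parameters, switching a VM to production checkpoints might look like this (names are hypothetical):

```powershell
# Use production (VSS-based) checkpoints for this VM;
# Hyper-V falls back to a standard checkpoint if the guest can't do VSS
Set-VM -Name VM01 -CheckpointType Production

# Take a checkpoint - inside the guest this looks like a backup being taken
Checkpoint-VM -Name VM01 -SnapshotName "Pre-upgrade"
```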

PowerShell Direct

You can run cmdlets against the guest OS via the VMBus. Easier administration – no need for network access.
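A minimal sketch, run on the Hyper-V host itself (the VM name is hypothetical; you authenticate with guest OS credentials, and no network or WinRM path to the VM is needed):

```powershell
# Credentials for an account inside the guest OS
$cred = Get-Credential

# Run a command in the guest over the VMBus
Invoke-Command -VMName VM01 -Credential $cred -ScriptBlock { Get-Service }

# Or open an interactive session
Enter-PSSession -VMName VM01 -Credential $cred
```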

ReFS Accelerated VHDX Operations

Instant disk creation and checkpoint merging. Ben created a 5TB fixed VHDX w/o ODX and it took 22 hours.

Demo: creating a 1 GB fixed disk on a non-accelerated volume (on the same physical disks) takes 71 seconds; on ReFS it takes 4.77 seconds. A 50 GB disk takes 3.9 seconds on ReFS.

Does a merge on a non-accelerated volume and it takes 68 seconds; the same files on ReFS take 6.9 seconds. This has a huge impact on backup of large volumes – file-based backup uses checkpoints and merge. There is zero data copy involved.
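The timing comparison is easy to reproduce with Measure-Command (paths are hypothetical; one volume formatted NTFS, the other ReFS):

```powershell
# Time fixed-disk creation on an NTFS volume vs an accelerated ReFS volume
Measure-Command { New-VHD -Path D:\NTFSVol\test.vhdx -Fixed -SizeBytes 50GB }
Measure-Command { New-VHD -Path E:\ReFSVol\test.vhdx -Fixed -SizeBytes 50GB }
```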

Hyper-V Manager and PoSh Improvements

  • Support for alternate credentials
  • Connecting via IP address
  • Connecting via WinRM

There’s a demo to completely configure IIS and deploy/start a website from an admin machine without logging into the VM, using PowerShell Direct with no n/w access.

Cross-Version Management

You can manage WS2012 and WS2012 R2 hosts with Hyper-V Manager. There are two versions of the Hyper-V PowerShell module: 1.1 and 2.0.

Integration Services

The Insert Integration Services Setup Disk option is gone from the UI. It did not scale. VM drivers are updated via Windows Update (critical updates). Updates go to VMs on the correct version of Hyper-V.

Hyper-V Backup

File-based backup and built-in change tracking. No longer dependent on h/w snapshots, but able to use them if they are there.

VM Configuration Changes

New configuration file format. Moving to binary format away from XML for performance efficiency when you have thousands of VMs. New file extensions:

  • VMCX: VM configuration data
  • VMRS: VM runtime state data

This one was done for Azure, and trickles down to us. Also solves the problem of people editing the XML which was unsupported. Everything can be done via PowerShell anyway.

Hyper-V Cluster Management

A new under-the-covers administration model that abstracts the cluster. You can manage a cluster like a single host. You don’t need to worry about cluster resource and groups to configure VMs anymore.

Updated Power Management

Connected Standby works.

RemoteFX

The OpenGL 4.4 and OpenCL 1.1 APIs are supported.

Ignite 2015–Hyper-V Storage Performance with Storage Quality of Service

I am live blogging this session so hit refresh to see more.

Speakers: Senthil Rajaram and Jose Barreto.

This session is based on what’s in TPv2. There is a year of development and FEEDBACK left, so things can change. If you don’t like something … tell Microsoft.

Storage Performance

  1. You need to measure to shape
  2. Storage control allows shaping
  3. Monitoring allows you to see the results – do you need to make changes?

Rules

  • Maximum Allowed: Easy – apply a cap.
  • Minimum Guaranteed: Not easy. It’s a comparative value to other flows. How do you do fair sharing? A centralized policy controller avoids the need for complex distributed solutions.

The Features in WS2012 R2

There are two views of performance:

  • From the VM: what the customer sees – using perfmon in the guest OS
  • From the host: What the admin sees – using the Hyper-V metrics

VM Metrics allow performance data to move with a VM: ((Get-VM -Name VM01) | Measure-VM).HardDiskMetrics. It's Hyper-V Resource Metering, enabled with Enable-VMResourceMetering.

Normalized IOPS

  • Counted in 8K blocks – everything is a multiple of 8K.
  • Smaller than 8K counts as 1
  • More than 8K is counted in multiples, e.g. 9K = 2.

This is just an accounting trick. Microsoft is not splitting/aggregating IOs.
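The accounting rule is simple enough to sketch as a hypothetical helper (this is not a Microsoft cmdlet, just an illustration of the arithmetic):

```powershell
# Count an IO in normalized 8 KB units:
# anything under 8K counts as 1; larger IOs count in 8K multiples, rounded up
function Get-NormalizedIOCount {
    param([long]$IOSizeBytes)
    [math]::Max(1, [math]::Ceiling($IOSizeBytes / 8KB))
}

Get-NormalizedIOCount 4KB    # 1
Get-NormalizedIOCount 9KB    # 2 (9K spans two 8K units)
Get-NormalizedIOCount 64KB   # 8
```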

Used by:

  • Hyper-V Storage Performance Counters
  • Hyper-V VM Metrics (HardDiskMetrics)
  • Hyper-V Storage QoS

Storage QoS in WS2012 R2

Features:

  • Metrics – per VM and VHD
  • Maximum IOPS per VHD
  • Minimum IOPS per VHD – alerts only

Benefits:

  • Mitigate impact of noisy neighbours
  • Alerts when minimum IOPS are not achieved

In WS2012 R2 it is a long and complicated process to diagnose storage performance issues.

Windows Server 2016 QoS Introduction

Moving from managing IOPS on the host/VM to managing IOPS on the storage system.

A simple storage QoS system that is included in the base bits. You should be able to observe performance for the entire set of VMs. Metrics are automatically collected, and you can use them even if you are not using QoS. There is no need to log into every node using the storage subsystem to see performance metrics. You can create policies per VM, VHD, service or tenant, and manage it with PoSH or VMM.

This is a SOFS solution. One of the SOFS nodes is elected as the policy manager – a HA role. All of the nodes in the cluster share performance data, and the PM is the “thinker”.

  1. Measure current capacity at the compute layer.
  2. Measure current capacity at the storage layer.
  3. Use an algorithm at the policy manager to meet policies.
  4. Adjust limits and enforce them at the compute layer.

In TP2, this cycle is done every 4 seconds. Why? Storage and workloads are constantly changing. Disks are added and removed. Caching makes “total IOPS” impossible to calculate. The workloads change … a SQL DB gets a new index, or someone starts a backup. Continuous adjustment is required.

Monitoring

On by default. You can query the PM to get a summary of what's going on right now.

Available data returned by a PoSH object:

  • VHD path
  • VM Name
  • VM Host name
  • VM IOPS
  • VM latency
  • Storage node name
  • Storage node IOPS
  • Storage node latency

Get-StorageQoSFlow – performance of all VMs using this file server/SOFS

Get-StorageQosVolume – performance of each volume on this file server/SOFS

There are initiator metrics (the VM's perspective) and storage metrics. Things like caching can cause differences between initiator and storage metrics.

Get-StorageQosFlow | Sort InitiatorIOPS | FT InitiatorName, InitiatorIOPS, InitiatorLatency

Working not with peaks/troughs but with averages over 5 minutes. The Storage QoS metrics, averaged over the last 5 minutes, are rarely going to match the live metrics in perfmon.

You can use this data: export to CSV, open in Excel pivot tables

Deploying Policies

Three elements in a policy:

  • Max: hard cap
  • Min: Guaranteed allocation if required
  • Type: Single or Multi-instance

You create policies in one place and deploy the policies.

Single instance: an allocation of IOPS that is shared by a group of VMs. Multi-instance: a performance tier – every VM gets the same allocation, e.g. max IOPS = 100 and each VM gets that.

Storage QoS works with Shared VHDX

Active/Active: Allocation split based on load. Active/Passive: Single VM can use full allocation.

This solution works with Live Migration.

Deployment with VMM

You can create and apply policies in VMM 2016. Create in Fabric > Storage > QoS Policies. Deploy in VM Properties > Hardware Configuration > <disk> > Advanced. You can deploy via a template.

PowerShell

$Policy = New-StorageQosPolicy -CimSession FS1 -Name sdjfdjsf -PolicyType MultiInstance -MaximumIOPS 200

Get-VM –Name VM01 | Get-VMHardDiskDrive | Set-VMHardDiskDrive –QosPolicy $Policy

Get-StorageQoSPolicy –Name sdfsdfds | Get-StorageQoSFlow … see data on those flows affected by this policy. Pulls data from the PM.

Demo

The way they enforce max IOPS is to inject latency in that VM’s storage. This reduces IOPS.

Designing Policies

  • No policy: no shaping. You’re just going to observe uncontrolled performance. Each VM gets at least 1 IOPS
  • Minimum Only: A machine will get at least 200 IOPS, IF it needs it. VM can burst. Not for hosters!!! Don’t set false expectations of maximum performance.
  • Maximum only: Price banding by hosters or limiting a noisy neighbour.
  • Minimum < Maximum, e.g. between 100-200: Minimum SLA and limited max.
  • Min = Max: VM has a set level of performance, as in Azure.

Note that VMs do not use min IOPS if they don’t have the workload for it. It’s a min SLA.

Storage Health Monitoring

If the total Min of all disks/VMs exceeds what the storage system can deliver then:

  • QoS does its best to fair-share based on proportion.
  • Raises an alert.

In WS2016 there is one place to get alerts for SOFS, called Storage Health Monitoring. It's a new service on the SOFS cluster. You'll get alerts on JBOD fans, disk issues, QoS, etc. The alerts are only there while the issue is there, i.e. if the problem goes away then the alert goes away. There is no history.

Get-StorageSubSystem *cluster* | Debug-StorageSubSystem

You can register triggers to automate certain actions.

Right now, we spend 10x more than we need to in order to ensure VM performance. Storage QoS reduces spend by using a needle to fix issues instead of a sledgehammer. We can use intelligence to solve performance issues instead of a bank account.

In a Hyper-V converged solution, the PM and rate limiters live on the same tier. Apparently there will be support for SANs – I'm unclear on this design.

Ignite 2015 – Platform Vision & Strategy Network Overview

Speakers: Yousef Khaladi, Rajeev Nagar, Bala Rajagopalan

I could not get into the full session on server virtualization strategy – meanwhile larger rooms were 20% occupied. I guess having the largest business in Microsoft doesn’t get you a decent room. There are lots of complaints about room organization here. We could also do with a few signs and some food.

Yousef Khaladi – Azure Networking

He’s going to talk about the backbone. Features:

  • Hyper-scale
  • Enterprise grade
  • Hybrid

There are 19 regions – more than AWS and Google combined. There are 85 IXP points, 4,400+ connections to 1,695 networks. There are 1.4 million miles of fiber in Azure; the North American fiber alone could wrap around the world 4 times. Microsoft has 15 billion dollars in cloud investment. Note: in Ireland, the Azure connection comes in through Derry.

Azure has automated provisioning with integrated process with L3 at all layers. It has automated monitoring and remediation with low human involvement.

They have moved intelligence from locked in switch vendors to the SDN stack. They use software load balancers in the fabric.

Layered support:

  1. DDOS
  2. ACLs
  3. Virtual network isolation
  4. NSG
  5. VM firewall

Network security groups (NSGs):

  • Network ACLs that can be assigned to subnets or VMs
  • 5-tuple rules
  • Enables DMZ subnets
  • Updated independent of VMs

Build an n-tier application in a single virtual network and isolate the public front end using NSGs.

ExpressRoute:

  • Now supports Office 365 and Skype for Business
  • The Premium Add-on adds virtual network global connectivity, up to 10,000 routes (instead of 4000) and up to 100 connected virtual networks

Cloud Inspired Infrastructure

It takes time to deploy a service on your own infrastructure. The processes are there as a caution against breaking already complicated infrastructure. You can change this with SDN.

Today’s solution first: Lots of concepts and pretty pictures. Not much to report.

New Stuff

VXLAN is coming to Microsoft SDN. They are taking convergence a step further. RDMA storage NICs can be converged and also used for tenant traffic. There will be a software load balancer. There will be a control layer in WS2016 called a network controller. This is taken from Azure. There is a distributed load balancer and software load balancer in the fabric.

IPAM can handle multiple AD forests. IPAM adds DNS management across multiple forests.

Back to RDMA – if you're using RDMA then you cannot converge it on WS2012 R2. That means you have to deploy extra NICs for VMs. In WS2016, you can enable RDMA on management OS vNICs. This means you can converge those NICs for VM and host traffic.
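A hedged sketch of the WS2016 converged pattern (the switch and adapter names are hypothetical):

```powershell
# Create a host vNIC on the converged VM switch and enable RDMA on it,
# so storage (SMB Direct) and VM traffic can share the same physical NICs
Add-VMNetworkAdapter -ManagementOS -SwitchName ConvergedSwitch -Name "SMB1"
Enable-NetAdapterRdma -Name "vEthernet (SMB1)"

# Verify RDMA is active on the vNIC
Get-NetAdapterRdma
```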

TrafficDirect moves interrupt handling from the parent partition to the virtual switch, where it can be handled more efficiently. In a stress test, he doubled traffic into a VM, to over 3 million packets per second.

Summary

The networking of Azure is coming to on-premises in WS2016 and the Azure Stack. This SDN frees you from the inflexibility of legacy systems. We get additional functionality that will increase security and HA, while reducing costs.

All The Details On My Two Ignite Sessions

Thanks (I think!!!) to John at MicroWarehouse (my employer) for sticking this on the company website:

image

I think he even Photoshop-slimmed me 🙂

Here’s the details of both my sessions:

The Hidden Treasures of Windows Server 2012 R2 Hyper-V

  • When: 5:00PM – 6:15PM, Tuesday, May 5th
  • Where: E451A
  • Session code: BRK3506

My first session is a 75-minute level 300 session focusing on lesser-known features of the version of Hyper-V that you can deploy now, which leaves you in the best position to upgrade to vNext. Don't worry if you've seen my TEE14 session; this one is 50% different, with some very useful stuff that I've never presented on or blogged about before.

It’s one thing to hear about and see a great demo of a Hyper-V feature. But how do you put them into practice? This session takes you through some of those lesser-known elements of Hyper-V that have made for great demonstrations, introduces you to some of the lesser-known features, and shows you best practices, how to increase serviceability and uptime, and design/usage tips for making the most of your investment in Hyper-V.

 

End-to-End Azure Site Recovery Solutions for Small & Medium Enterprises

  • When: 12:05PM – 12:25PM, Thursday, May 7th
  • Where: EXPO: Lounge C Theater
  • Session Code: THR0903

My second session is 20 minutes on Azure DR solutions for SMEs in the community theatre. I’ve done lots of lab and proof-of-concept work with ASR in the SME space and this presentation focuses on the stuff that no one talks about – it’s easy to replicate VMs, but what about establishing services, accessing failed over VMs, and more?!?!?

In this session I will share some tips and lessons that I have learned from working with Azure Site Recovery services to provide a complete disaster recovery solution in Azure for Hyper-V virtual machines in a small/medium enterprise.

Microsoft News – 23 April 2015

I’ve been really busy either preparing training, delivering training, on customer sites, or prepping my two sessions for Ignite. Here’s the roundup of recent Microsoft news for infrastructure IT pros:

Hyper-V

Windows Server

Windows 10

Azure

Office 365

Intune

Miscellaneous

Altaro Webinar Recording and Slides – What’s New in Hyper-V vNext

I recently co-presented a webinar by Altaro with Rick Claus (Microsoft) and Andrew Syrewicze (MVP) on what's coming in the next version of Windows Server Hyper-V. Altaro has a recording of the webinar online. That page will be updated soon with a written Q&A from the session; we had A LOT of questions and Altaro asked me to write out responses, which I did last Friday night. You can also download a PDF copy of the slides from the session.

Thank you to everyone that joined us. We had a great number of people tuned in – I was stunned when the folks at Altaro broke down the numbers. Hopefully, I’ll see some of you tomorrow night in the webinar I am co-presenting for StarWind on using ODX or VAAI to enhance storage performance for Hyper-V or vSphere respectively.

Survey Results – What UI Option Do You Use For Hyper-V Hosts?

Thank you to the 424 (!) people who answered the survey that I started late on Friday afternoon and finished today (Tuesday morning). I asked one question:

What kind of UI installation do you use on Hyper-V hosts?

  • The FREE Hyper-V Server 2012 R2
  • Full UI
  • MinShell
  • Core

Before I get to the results …

The Survey

Some other MVPs and I used to do a much bigger annual survey. The work required of us was massive, and the number of questions put people off. I kept this very simple. There were no "why"s or further breakdowns of information. This led to a bigger sample size.

The Sample

We got a pretty big sample size from all around the world, with results from the EU, USA and Canada, eastern Europe, Asia, Africa, the south Pacific, and south America. That’s amazing! Thank you to everyone who helped spread the word. We got a great sample in a very short period of time.

image

However (there’s always one of these with surveys!), I recognize that the sample is skewed. Anyone, like you, who reads a blog like this, follows influencers on social media, or regularly attends something like a TechNet/Ignite/community IT pro events is not a regular IT pro. You are more educated and are not 100% representative of the wider audience. I suspect that more of you are using non-Full UI options (Hyper-V Server, MinShell or Core) than in the wider market.

Also, some of you who answered this question are consultants or have more complex deployments with a mixture of installations. I asked you to submit your most common answer. So a consultant that selects X might have 15 customers with X, 5 with Y and 2 with Z.

The Results

So, here are the results:

image

 

70% of the overall sample chose the full UI for the management OS of their Hyper-V hosts. If we discount the choice of Hyper-V Server (they went that way for specific economic reasons and had no choice of UI) then the result changes.

Of those who had a choice of UI when deploying their hosts, 79% went with the Full UI, 5.5% went with MinShell, and 15% went with Server Core. These numbers aren’t much different to what we saw with W2008 R2, with the addition of MinShell taking share from Server Core. Despite everything Microsoft says, customers have chosen easier management and troubleshooting by leaving the UI on their hosts.

image

Is there a specific country bias? The biggest response came from the USA (111):

  • Core: 19.79%
  • MinShell: 4.17%
  • Full UI: 76.04%

In the USA, we find more people than average (but still a small minority) using Core and MinShell. Next I compared this to Great Britain, Germany, Austria, Ireland, The Netherlands, Sweden, Belgium, Denmark, Norway, Slovenia, France and Poland (not an entire European sample but a pretty large one from the top 20 responding countries, coming in at a total of 196 responses):

  • Core: 13.78%
  • MinShell: 4.08%
  • Full UI: 82.14%

It is very clear. The market has spoken and the market has said:

  • We like that we have the option to deploy Core or MinShell
  • But most of us want a Full UI

Those of you who selected Hyper-V Server did not waste your time. There are very specific and useful scenarios for this freely licensed product. And Microsoft loves to hear that their work in maintaining this SKU has a value in the market. To be honest, I expect this number (10.59%) to gradually grow over time as those without Software Assurance choose to opt into new Hyper-V features without upgrading their guest OS licensing.

My Opinion

I have had one opinion on this matter since I first tried a Core install for Hyper-V during the beta of Windows Server 2008. I would only ever deploy a Full UI. If (and it's a huge IF) I managed a HUGE cloud with HA infrastructure, then I would deploy Nano Server on vNext. But in every other scenario, I would always choose a Full UI.

The arguments for Core are:

  • Smaller installation: Who cares if it’s 6GB or 16 GB? I can’t buy SD cards that small anymore, let alone hard disks!!!
  • Smaller attack footprint: You deserve all the bad that can happen if you read email or browse from your hosts.
  • Fewer patches: Only people who don’t work in the real world count patches. We in the real world count reboots, and there are no reductions. To be honest, this is irrelevant with Cluster Aware Updating (CAU).
  • More CPU: I’ve yet to see a host in person where CPU is over 33% average utilisation.
  • Less RAM: A few MB saved on a host with at least 64 GB of RAM (and it's rare that I see as little as that anymore) isn't going to be much benefit.
  • You should use PowerShell: Try using 3rd party management or troubleshooting isolated hosts with PowerShell. Even Microsoft support cannot do this.
  • Use System Center: Oh, child! You don’t get out much.
  • It stops admins from doing X: You’ve got other problems that need to be solved.
  • You can add the UI back: This person has not patched a Core install over several months and actually tried to re-add the UI – it is not reliable.

In my experience, and that of most people, servers are not cattle; they are not pets either; no – they are sacred cows (thank you for finding a good ending to that phrase, Didier). We cannot afford to just rebuild servers when things go wrong. They need to be rescued and problems need to be fixed. Right now, the vast majority of problems I hear about are network card driver and firmware related. Try solving those with PowerShell or remote management. You need to be on the machine to solve these issues, and you need a full UI. The unreliable HCL for Windows Server has led to awful customer experiences on Broadcom (VMQ enabled and faulty) and Emulex NICs (it took nearly 12 months to acknowledge the VMQ issue on FCoE NICs).

Owning a host is like owning a car. Those who live in the mainstream have a better experience. Things work better. Those who try to find cheaper alternatives, dare to be different, find other sources … they’re the ones who call for roadside assistance more. I see this even in the Hyper-V MVP community … those who dare to be on the ragged edge of everything are the ones having all the issues. Those who stay a little more mainstream, even with the latest tech, are the ones who have a reliable infrastructure and can spend more time focusing on getting more value out of their systems.

Another survey will be coming soon. Please feel free to comment your opinions on the above and what you might like to see in a survey. Remember, surveys need closed answers with few options. Open questions are 100% useless in a survey.

What about Application Servers?

That’s the subject of my next survey.

Using This Data

Please feel free to use the results of the survey if:

  • You link back to this post
  • You may use 1 small quote from this post

My Take On Windows Nano Server & Hyper-V Containers

Microsoft made two significant announcements yesterday, further innovating their platform for cloud deployments.

Hyper-V Containers

Last year Microsoft announced a partnership with Docker, a leader in application containerization. The concept is similar to Server App-V, the now deprecated service virtualization solution from Microsoft. Instead of having 1 OS per app, containers allow you to deploy multiple applications per OS. The OS is shared, and sets of binaries and libraries are shared between similar/common apps.

Hypervisor versus application containers

These containers can be deployed on a physical machine's OS or within the guest OS of a virtual machine. Right now, you can deploy Docker app containers onto Ubuntu VMs in Azure, managed from Windows.

Why would you do this? Because app containers are FAST to deploy. Mark Russinovich demonstrated a WordPress install being deployed in a second at TechEd last year. That's incredible! How long does it take you to deploy a VM? File copies are quick enough, especially over SMB 3.0 with SMB Direct and Multichannel, but OS specialisation and updates take quite a while, even with enhancements. And Azure is actually quite slow, compared to a modern Hyper-V install, at deploying VMs.

Microsoft use the phrase “at the speed of business” when discussing containers. They want devs and devops to be able to deploy applications quickly, without the need to wait for an OS. And it doesn’t hurt, either, that there are fewer OSs to manage, patch, and break.

Microsoft also announced, as part of the Docker partnership, that Windows Server vNext will offer Windows Server Containers – app containers native to Windows Server, all manageable via the Microsoft and Docker open source stack.

But there is a problem with containers; they share a common OS, and sets of libraries and binaries. Anyone who understands virtualization will know that this creates a vulnerability gateway … a means to a “breakout”. If one application container is successfully compromised then the OS is vulnerable. And that is a nice foothold for any attacker, especially when you are talking about publicly facing containers, such as those that might be in a public cloud.

And this is why Microsoft has offered a second container option in Windows Server vNext, based on the security boundaries of their hypervisor, Hyper-V.

Windows Server vNext offers Windows Containers and Hyper-V Containers

Hyper-V provides secure isolation for running each container, using the security of the hypervisor to create a boundary between each container. How this is accomplished has not been discussed publicly yet. We do know that Hyper-V containers will share the same management as Windows Server containers and that applications will be compatible with both.

Nano Server

It’s been a little while since a Microsoft employee leaked some details of Nano Server. There was a lot of speculation about Nano, most of which was wrong. Nano is a result of Microsoft’s, and their customers’, experiences in cloud computing:

  • Infrastructure and compute
  • Application hosting

Customers in these true cloud scenarios need a smaller operating system, and that is what Nano gives them. The OS goes beyond Server Core. It's not just Windows without the UI; it is Windows without the I (interface). There is no logon prompt and no remote desktop. This is a headless server installation option that requires remote management via:

  • WMI
  • PowerShell
  • Desired State Configuration (DSC) – you deploy the OS and it configures itself from a template you host
  • RSAT (probably)
  • System Center (probably)

Microsoft also removed:

  • 32 bit support (WOW64) so Nano will run just 64-bit code
  • MSI meaning that you need a new way to deploy applications … hmm … where did we hear about that very recently *cough*
  • A number of default Server Core components

Nano is a stripped-down OS, truly incapable of doing anything until you add the functionality you need.

The intended scenarios for Nano usage are in the cloud:

  • Hyper-V compute and storage (Scale-Out File Server)
  • “Born-in-the-cloud” applications, such as Windows Server containers and Hyper-V containers

In theory, a stripped down OS should speed up deployment, make install footprints smaller (we need non-OEM SD card installation support, Microsoft), reduce reboot times, reduce patching (pointless if I reboot just once per month), and reduce the number of bugs and zero day vulnerabilities.

Nano Server sounds exciting, right? But is it another Server Core? Core was exciting back in W2008. A lot of us tried it, and today, Core is used in a teeny tiny number of installs, despite some folks in Redmond thinking that (a) it’s the best install type and (b) it’s what customers are doing. They were and still are wrong. Core was a failure because:

  • Admins are not prepared to use it
  • The need to have on-console access

We have the ability to add/remove the UI in WS2012, but that system is broken once you have applied all your updates. Not good.

As for troubleshooting, Microsoft says to treat your servers like cattle, not like pets. Hah! How many of you have all your applications running across dozens of load balanced servers? Even big enterprise deploys applications the same way as an SME: on one to a handful of valuable machines that cannot be lost. How can you really troubleshoot headless machines that are having networking issues?

On the compute/storage stack, almost every issue I see on Windows Server and Hyper-V is related to failures in certified drivers and firmwares, e.g. Emulex VMQ. Am I really expected to deploy a headless OS onto hardware where the HCL certification has the value of a bucket with a hole in it? If I was to deploy Nano, even in cloud-scale installations, then I would need a super-HCL that stress tests all of the hardware enhancements. And I would want ALL of those hardware offloads turned OFF by default so that I can verify functionality for myself, because clearly, neither Microsoft’s HCL testers nor the OEMs are capable of even the most basic test right now.

Summary

In my opinion, the entry of containers into Windows Server and Hyper-V is a huge deal for larger customers and cloud service providers. This is true innovation. As for Nano, I can see the potential for cloud-scale deployments, but I cannot trust the troubleshooting-incapable installation option until Microsoft gives the OEMs a serious beating around the head and turns off hardware offloads by default.

Microsoft News – 8 April 2015

There’s a lot of stuff happening now. The Windows Server vNext Preview expires on April 15th and Microsoft is promising a fix … the next preview isn’t out until May (maybe with Ignite on?). There’s rumours of Windows vNext vNext. And there’s talk of open sourcing Windows – which I would hate. Here’s the rest of what’s going on:

Hyper-V

Windows Server

Windows Client

Azure