Speaker: Ben Armstrong
Almost everyone in the room is using Hyper-V. A large number are also using VMware. About 1/3 are using public cloud, and maybe 20% are doing hybrid cloud.
Microsoft believes that hybrid cloud is the endpoint – seamless movement between on-premises and the public cloud.
Hyper-V scales. Azure runs on stock Hyper-V. It required a lot of work for WS2012, but it’s stock Hyper-V, and that’s over 1 million servers running Hyper-V. If 1 in 10,000 installs shows a bug, and you run a hypervisor on that many hosts, deploying 500 million VMs per day, then even rare bugs surface quickly – the product gets tested heavily. We benefit from this with our on-premises deployments.
What has Microsoft learned from Azure? Standardize your build – keep the hosts simple and standardized. Don’t vary. Change does not scale.
Private Cloud Improvements
- Large scale VMs and clusters
- Accelerated live migration
- Dynamic memory with hot add
- Comprehensive host and guest clustering support
- Rolling upgrades
- Mixed mode cluster support
- VM compute resiliency
- Cluster-aware updating
- Broad Linux distro support
- In-guest vRSS support
- Hot-add and online resize of virtual disk storage
- Live backup
- Comprehensive management
Hybrid Cloud is about extending your data centre, not replacing it. In the MSFT Cloud OS, that’s Hyper-V, with SysCtr/WAP for private cloud, and Azure/partner hosting clouds for public cloud. MSFT makes it seamless.
Right now, only Microsoft is listed as a leader in 4 categories of hybrid cloud computing by Gartner.
Linux and Windows parity on Hyper-V
Run Linux without compromises on a single host: Hyper-V. You don’t have to partition hosts. A single UI for managing Linux: backup, monitoring, capacity planning, etc. All too often, the Linux people want to run their own virtualization, and it makes no sense. It’s a waste of time, effort, and, importantly, money.
Yes, Hyper-V is supported in OpenStack. And it’s supported in something called Vagrant. Microsoft has been working closely with them.
Only company offering on-premises IaaS, public IaaS, public PaaS, and public SaaS.
People are running more VMs on:
- More hardware
- Less hardware
Hmm! How we scale is different now. Half a rack can run thousands of VMs. And in hyper-scale clouds, you see lower density for cost effectiveness and performance SLAs. In private cloud, we focus on smaller clusters.
Virtualization is now assumed. Physical is no longer the default.
Workload mobility is assumed: People expect Live Migration or vMotion.
Secure isolation is assumed. Customers in different VMs expect that they are secure from other tenants’ VMs.
Hardware failure fault tolerance is assumed.
“I am the fabric administrator”. This is a new job title for the person who runs virtualization, network, and storage. What happens inside the VMs is not their worry. MSFT is hearing from businesses that they want fabric admins to have no access to data in the VMs; there is no solution for that today. In contradiction to this, that person used to be the domain admin that fixed everything. But now, it’s not uncommon that they don’t have sign-in credentials for the tenants’ VMs and cannot provide support.
Cluster Rolling Upgrades
Hyper-V upgrades are frequent. Downtime is hated by admins and tenants alike. Admins want to hide the fact that an upgrade is happening. This new process allows mixed-mode clusters and Live Migration, so you can rebuild nodes in a cluster with a new OS and live migrate VMs around without anyone noticing. Yes: you keep the cluster – it’s a host rebuild within the cluster, not the cluster migration of the past.
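The rolling-upgrade flow can be sketched in PowerShell. This is my sketch, not something shown in the session: the node name is hypothetical, and I’m assuming the Windows Server vNext FailoverClusters cmdlets, where `Update-ClusterFunctionalLevel` commits the cluster to the new version once every node is upgraded.

```powershell
# Drain roles off one node at a time, rebuild it with the new OS,
# then rejoin it to the same cluster (node name is illustrative).
Suspend-ClusterNode -Name "HV-Node1" -Drain      # live migrates VMs away
# ... rebuild HV-Node1 with the new Windows Server build, re-add it to the cluster ...
Resume-ClusterNode -Name "HV-Node1" -Failback Immediate

# While nodes run mixed OS versions, the cluster stays at the old functional level.
# Once ALL nodes are upgraded, commit the upgrade - this step is one-way:
Update-ClusterFunctionalLevel
```

Repeat the drain/rebuild/resume loop for each node; tenants only ever see live migrations, never downtime.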
Hyper-V failures are nearly always caused by hardware, drivers, or firmware from OEMs. This is a big area of investment for Microsoft, including transient failures.
I know that this has been a focus point for Ben. Hyper-V is decoupling VM backup from the underlying storage. File based backup is the way forward, with efficient change tracking for backup. Provides reliability, scale, and performance. This session is on right now (Taylor Brown) so watch the recording in 24 hours.
Many more changes
- Delayed VM upgrade
- New IC servicing model
- Secure boot for Linux Generation 2 VMs
- Distributed Storage QoS
- Resilient VM Configuration
- And more.
Demo: Compute Resiliency
Clustering saves people over and over. But clustering is complex and it can break. Often caused by a transitory error, such as a cable being unplugged, etc. When there is a heartbeat failure, then you get a 30 second outage while VMs are failed over, and then there’s a wait time for the VMs to boot.
Ben demos with 3 nodes. A script kills the cluster service on one of the nodes. In 2012 R2, the cluster would panic and do a failover. In vNext, the server is marked as isolated – there’s a problem. VMs are still “running” but marked as unmanaged. A failover won’t happen immediately, in case the node comes back online. The wait time is 4 minutes by default, but it is configurable. This behaviour only applies to running VMs.
Another new feature is quarantine. When a host is frequently going in and out of the isolated state, it will be quarantined – it’s a disruptive server that causes a lot of churn. VMs are migrated off (green quarantine) and then it’s moved into red quarantine. Now it’s persona non grata (no new workloads placed there) until you resolve the intermittent issue. There is a quarantine duration, so a host can come out of quarantine automatically.
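For reference, the knobs behind this behaviour appear to be cluster common properties. The sketch below assumes the property names Microsoft has described for this release (`ResiliencyDefaultPeriod`, `QuarantineThreshold`, `QuarantineDuration`); the defaults in the comments are my understanding, so verify them on your build.

```powershell
$c = Get-Cluster
$c.ResiliencyLevel          # isolation policy for transient failures
$c.ResiliencyDefaultPeriod  # seconds a node may stay isolated (default 240 = the 4 min above)
$c.QuarantineThreshold      # isolation events before a node is quarantined (default 3)
$c.QuarantineDuration       # seconds a node stays quarantined (default 7200 = 2 hours)

# Bring a node out of quarantine manually instead of waiting it out:
Start-ClusterNode -Name "HV-Node1" -ClearQuarantine
```

The node name is illustrative; tune the periods per cluster rather than per node.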
Microsoft Were The First to Do Lots in Virtualization
- Hardware-assisted live migration for blazing performance.
- SR-IOV with Live Migration
- Fibre Channel in VMs with Live Migration.
- TRIM and UNMAP
Is VMware really the market leader and innovator?
Ben goes into Q&A.
Question: Is Hyper-V Manager going away? No. Emphatically. It’s used even by the happiest SysCtr and fabric controller admins, especially when things go wrong.
That’s a wrap!