Building Continuously Available Systems with Hyper-V

The speaker is Brian Dewey of Microsoft.

If you came to this post because it is about Hyper-V, then I really urge you to read the other “Building Continuously …” session notes that I have taken.  They all build to this session (it was a track of sessions).

Continuously available: software and hardware are designed to support transparent failover without service or data loss.

Continuous Availability Improvements

Live Migration: Move a running VM between hosts with zero downtime.  We can now live migrate VMs inside a cluster as well as between clusters.
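
For reference, this can be driven from PowerShell.  A minimal sketch, assuming the Hyper-V cmdlets that shipped after this preview (names were still in flux at the time); the VM and host names are made up:

  # Live migrate a running VM to another node with zero downtime.
  # "SQL01" and "HV-HOST2" are hypothetical names.
  Move-VM -Name "SQL01" -DestinationHost "HV-HOST2"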

Live Storage Migration: Move a VM's storage between locations with no downtime.  First the VHDs and configuration files are copied from the source to the destination while the VM keeps running.  I/O is then mirrored to both locations, and it stays in that state while the live migration of the VM state happens.  Once everything is in sync, the VM runs from the destination location with I/O going only there.  If anything fails during the workflow, nothing is lost and the VM resumes on the source location.  This gives more flexibility for maintenance, host migrations, etc.
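
The storage-only move has its own cmdlet in the same wave of PowerShell support.  Another minimal sketch with made-up names:

  # Move a VM's VHDs and configuration to a new path while it keeps running.
  # I/O is mirrored to both locations until the copy is in sync.
  Move-VMStorage -VMName "SQL01" -DestinationStoragePath "D:\VMs\SQL01"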

Guest clustering: high-end storage customers can now use virtual Fibre Channel HBAs to create failover clusters out of VMs.  This allows a legacy service running in a VM to become highly available, so a VM's OS can be maintained or fail with no service downtime.
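
Assuming the cmdlets that eventually shipped (names may have changed since the preview), attaching a virtual Fibre Channel HBA to a VM looks roughly like this; "ProductionSAN" is a hypothetical virtual SAN that must already be defined on the host:

  # Add a synthetic fibre channel HBA to the VM so the guest can see
  # SAN LUNs directly and participate in a guest failover cluster.
  Add-VMFibreChannelHba -VMName "ClusterNode1" -SanName "ProductionSAN"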

Hyper-V Replica: Maintain a warm-standby disaster recovery site with asynchronous replication.  At a high level: configure any running VM to replicate to a remote host, and perform an initial replication of all content.  Once that is done, Hyper-V tracks changes to the VM.  The changes are shipped on a scheduled basis to the remote location to update it.  This is optimised for high-latency WANs and DR.  The initial replication can be huge, so it can be done out of band using a USB drive.  There is loose coupling of source and destination: use certificates to replicate to a Hyper-V host in a different AD forest or company, e.g. a hosting company.  It is a “warm standby” because the administrator initiates the failover – this might be a job for PowerShell or System Center Orchestrator to bring up lots of VMs in a specific order.
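
A sketch of that workflow in PowerShell, assuming the replica cmdlets that eventually shipped; the host names and certificate thumbprint are placeholders:

  # Enable certificate-based replication to a host in another forest/company.
  Enable-VMReplication -VMName "SQL01" -ReplicaServerName "dr1.hoster.com" `
      -ReplicaServerPort 443 -AuthenticationType Certificate `
      -CertificateThumbprint "0a1b2c..."

  # Initial replication can go over the wire, or out of band to a USB drive:
  Start-VMInitialReplication -VMName "SQL01" -DestinationPath "E:\SeedData"

  # Failover is administrator-initiated (warm standby); run on the replica:
  Start-VMFailover -VMName "SQL01"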

Consolidation Magnifies the Effect of Failure

Virtualisation puts more eggs in one basket: fewer servers and fewer storage systems.

How to Build the Right Solution with Hyper-V?

Continuously available Hyper-V systems require shared storage.  Windows Server 2008 R2 requires a SAN.  Windows 8 now adds remote file servers, Storage Spaces, and clustered PCI RAID to the mix.

VHDX

  • Supports up to 16 TB, which all but eliminates the need to use inflexible passthrough disks for scalability
  • Aligns to megabyte boundaries for large sector disk performance
  • Customers can embed metadata in VHDX files – server applications are likely to do this.
  • VHDX will be the default format going forward.  It is not supported on anything earlier than the Windows 8 Developer Preview release.

Offloads

  • ODX: Offloaded Data Transfer, where the SAN does the copy work directly instead of involving the slower server.  See previous notes on the ODX token.  Note that ODX makes creation of a VHDX happen more quickly, so ODX is about more than just data transfer.
  • Trim: freed-up space in a disk can be returned to the storage system – thin provisioning (see the quick check after this list).
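
As a quick aside, you can check whether delete notifications (the trim/unmap plumbing) are enabled on a host from an elevated prompt; this is standard fsutil rather than anything new in Windows 8:

  # 0 = delete notifications (trim) enabled, 1 = disabled.
  fsutil behavior query DisableDeleteNotify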

Demo:

He creates a large VHDX and it is created in a few seconds.  It is not dynamic; it is a fully allocated, zeroed-out disk.  ODX makes this possible.
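
Reproducing the demo is a one-liner, assuming the New-VHD cmdlet from the eventual release; the path and size are arbitrary:

  # Create a fixed (fully allocated, zeroed) VHDX.  On ODX-capable storage
  # the zeroing is offloaded to the array, so even a large disk is fast.
  New-VHD -Path "C:\VMs\Big.vhdx" -SizeBytes 2TB -Fixed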

Hyper-V and SMB

File share storage of VMs is now supported.  You get Live Migration and planned/unplanned failover, and you can cluster the file server for HA and scalability.  Cross-cluster LM requires remote file shares, even if only transiently.  Requirements (a minimal share setup sketch follows the list):

  • SMB 2.2
  • Remote VSS for host based backup
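
As promised above, a sketch of creating a VM share on the file server, assuming the SMB cmdlets in this wave.  Note that it is the Hyper-V hosts' computer accounts (the trailing $) that get access, and the NTFS ACLs need to match; share, domain, and host names are made up:

  # Create the share and grant the Hyper-V hosts' machine accounts full control.
  New-SmbShare -Name "VMFolder" -Path "C:\Shares\VMFolder" `
      -FullAccess "CONTOSO\HV-HOST1$", "CONTOSO\HV-HOST2$"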

Storage Spaces

See previous notes.  It provides thin provisioning and resiliency.  Mirror and parity spaces deliver resilience to physical storage failures.
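A thinly provisioned mirror space can be stood up in a couple of lines, assuming the storage cmdlets that shipped with this wave; the friendly names and size are placeholders:

  # Pool the available physical disks, then carve a resilient, thin space.
  New-StoragePool -FriendlyName "Pool1" `
      -StorageSubSystemFriendlyName "Storage Spaces*" `
      -PhysicalDisks (Get-PhysicalDisk -CanPool $true)
  New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "MirrorSpace" `
      -ResiliencySettingName Mirror -ProvisioningType Thin -Size 10TB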

PCI RAID

Resiliency to node failure as LUN is switched to the failover node.  Resiliency to disk failure through RAID.

Continuously Available Networking

NIC teaming is in the box for network path fault tolerance.  NIC teaming works in the root and in the guest VM (two NICs, connecting to two virtual switches, each on a different pNIC).
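
The in-box teaming is driven by the new NetLbfo cmdlets; a minimal sketch with made-up adapter names:

  # Team two physical NICs in the root partition for path fault tolerance.
  New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1", "NIC2" `
      -TeamingMode SwitchIndependent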

Scalable Networking

Get concurrent live migrations with 10 GbE.  Hyper-V can use RDMA in the parent partition for efficient file access.  Hyper-V hosts can use network offloads.  Hyper-V can utilise SR-IOV on capable NICs to optimize VM networking. 

Note: SR-IOV bypasses the virtual switch, so any extensions or configurations you’d have on a virtual switch are no longer applicable.
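
Enabling SR-IOV is a property of the virtual switch at creation time plus a per-VM weight, assuming the cmdlets that eventually shipped; the names here are hypothetical:

  # The switch must be created with IOV enabled; it cannot be turned on later.
  New-VMSwitch -Name "IovSwitch" -NetAdapterName "NIC1" -EnableIov $true
  # Give the VM's adapter an IOV weight > 0 to request a virtual function.
  Set-VMNetworkAdapter -VMName "SQL01" -IovWeight 100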

Note: I’m sure Cisco’s extension offers an SR-IOV option.

Modern Server Hardware

  • Going from up to 64 logical processors to up to 160 LPs.
  • Physical NUMA topology is projected into the guest.  This is a big issue once a guest has more than a few vCPUs on a multi-CPU host (see the sketch after this list).
  • Fault containment: hardware memory errors are confined to the affected virtual machine.  This is a feature of some modern processors.  If an error happens in physical RAM that is used only by one VM, then only that VM needs to shut down.
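
You can inspect the host's NUMA topology and size a VM's processor count from PowerShell, assuming the Hyper-V module that eventually shipped; this is purely illustrative:

  # Show the host's physical NUMA nodes (what gets projected into guests).
  Get-VMHostNumaNode
  # Give the VM enough vCPUs that NUMA placement actually matters.
  Set-VMProcessor -VMName "SQL01" -Count 16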

Jose Barreto comes up to do a demo.  Two Hyper-V hosts, each with one Ethernet NIC and one InfiniBand NIC.  One switch of each type connects them to two file servers, each with Ethernet and InfiniBand on the front end.  Each file server has two SAS HBAs meshed to two JBODs.

The Hyper-V hosts use \\<cluster-name>\<share> to access VM files on the file share, not \\<server-name>\<share>.  The file servers are using storage pools.  Instead of an IQN or WWN, we grant permission on the file shares to the Hyper-V hosts’ computer accounts.  The cluster has no cluster storage: it is all file shares.  In the HA VM’s properties, you can see the VHDX is stored in \\<cluster-name>\VMFolder.  That share is on a volume that is in a Storage Space.  He’s pumping 2.6 Gbps of data throughput to the VHDX from within the VM, using high-speed NICs and RDMA with multiple connections.
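
A sketch of how you could verify those paths from a Hyper-V host, assuming the SMB cmdlets in this wave:

  # List active SMB connections from this host, then show the multichannel
  # interfaces in use and whether they are RSS/RDMA capable.
  Get-SmbConnection
  Get-SmbMultichannelConnection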

Next up: a demo of transparent failover of the file share on the clustered file servers, while huge throughput is happening.  We see a dip in I/O because it is being cached.  After the failover, the cluster witness tells the client to redirect, so there is no timeout and no cache purge, and I/O continues as normal with no loss.
