Modern data centres are centralizing more and more resources. Take a Microsoft cloud deployment, for example, such as what Microsoft does with CPS or what you can build yourself with Windows Server (and maybe System Center). A chunk of a rack can contain over a petabyte of raw storage in the form of a Scale-Out File Server (SOFS), and the rest of the rack is either hosts or TOR networking. This type of storage consolidation gives us a challenge: how do we ensure that each guest service gets the storage IOPS that it requires?
From a service provider's perspective:
- How do we provide storage performance SLAs?
- How do we price-band storage performance (pay more to get more IOPS)?
Up to now with Hyper-V, you needed a SAN (such as Tintri) to do some magic on the back end. WS2012 R2 Hyper-V added a crude storage QoS method (maximum rule only) that was enforced at the host and not at the storage. So:
- There was no minimum or SLA-type rule, only a cap.
- QoS rules were not distributed, so host X had no way to account for what hosts A-W were doing to the shared storage system.
Windows Server vNext is adding Distributed Storage QoS, which works through a partnership between Hyper-V hosts and a SOFS. Yes: you need a SOFS – but remember that a SOFS can be 2-8 clustered Windows Servers that are sharing a SAN via SMB 3.0 (no Storage Spaces in that design).
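To see why host-local caps are not enough, here is a minimal sketch of the arithmetic. The host count, VM count, per-VM cap, and back-end capacity are made-up figures for illustration only; they are not taken from any real deployment.

```python
def aggregate_demand(host_vm_caps):
    """Sum the worst-case IOPS that host-local caps still allow across all hosts."""
    return sum(cap for caps in host_vm_caps.values() for cap in caps)


# Hypothetical figures: 4 hosts, each capping 10 VMs at 500 IOPS.
hosts = {f"host{i}": [500] * 10 for i in range(1, 5)}
backend_capacity = 15_000  # assumed IOPS that the shared storage can sustain

demand = aggregate_demand(hosts)
oversubscribed = demand > backend_capacity
print(demand, oversubscribed)
```

Every VM here stays within its own cap, yet the hosts together can still ask the shared storage for more than it can deliver, because no single host sees the total.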
Note: the hosts use a new protocol called MS-SQOS (based on SMB 3.0 transport) to partner with the SOFS.
Distributed Storage QoS is actually driven from the SOFS. There are multiple benefits from this:
- Centralized monitoring (enabled by default on the SOFS)
- Centralized policy management
- Unified view of all storage requirements of all hosts/clusters connecting to this SOFS
Policy is created on the SOFS using PowerShell (System Center vNext will add management and monitoring support for Storage QoS), based on your monitoring or service plans. An IO Scheduler runs on each SOFS node, and the policy manager data is distributed. The Policy Manager (an HA cluster resource on the SOFS cluster) pushes policy (via MS-SQOS) up to the Hyper-V hosts, where Rate Limiters restrict the IOPS of virtual machines or virtual hard disks.
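Microsoft has not said how the host-side Rate Limiters work internally, so treat the following purely as an illustration of the general idea. It is a generic token-bucket limiter, a common rate-limiting technique, and the class name, parameters, and one-token-per-IO accounting are all my own assumptions.

```python
class TokenBucket:
    """Generic token-bucket limiter (illustrative, not the Hyper-V implementation).

    One token is spent per IO; tokens refill at max_iops per second.
    """

    def __init__(self, max_iops, burst=None):
        self.rate = float(max_iops)
        # Assumption: the burst allowance defaults to one second of IO.
        self.capacity = float(burst if burst is not None else max_iops)
        self.tokens = self.capacity
        self.last = 0.0

    def allow(self, now, ios=1):
        """Return True if `ios` operations may be issued at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= ios:
            self.tokens -= ios
            return True
        return False
```

A limiter like this would sit in front of a virtual hard disk: an IO that finds the bucket empty has to wait until enough time has passed for tokens to refill.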
There are two kinds of QoS policy that you can create:
- Single-Instance: The resources of the rule are shared between the targeted VMs. This might suit a cluster/service or a tenant, e.g. a tenant gets 500 IOPS that must be shared by all of their VMs.
- Multi-Instance: Each VM/disk gets the same rule, e.g. each targeted VM gets a maximum of 500 IOPS. This is good for creating VM performance tiers, e.g. bronze, silver, and gold, with each tier offering a different level of performance for an individual VM.
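The difference between the two policy types can be sketched in a few lines. This is a toy model, not the real scheduler: the function name and the "single"/"multi" labels are hypothetical, and the max-min fair-share split used for the shared limit is my assumption, since the way Storage QoS divides a Single-Instance limit between VMs is not described here.

```python
def apply_policy(policy_type, max_iops, demands):
    """Return the IOPS each VM is allowed, given its demand.

    policy_type: "multi"  -> every VM is capped individually at max_iops
                 "single" -> all VMs share one max_iops limit
    """
    if policy_type == "multi":
        return {vm: min(demand, max_iops) for vm, demand in demands.items()}
    if policy_type == "single":
        # Assumption: max-min fair sharing. Small demands are met in full and
        # whatever is left over is split evenly among the remaining VMs.
        allowed = {}
        remaining = float(max_iops)
        pending = sorted(demands.items(), key=lambda item: item[1])
        while pending:
            share = remaining / len(pending)
            vm, demand = pending.pop(0)
            allowed[vm] = min(demand, share)
            remaining -= allowed[vm]
        return allowed
    raise ValueError(f"unknown policy type: {policy_type}")


demands = {"vm1": 100, "vm2": 300, "vm3": 400}
print(apply_policy("multi", 500, demands))
print(apply_policy("single", 500, demands))
```

With a 500 IOPS Multi-Instance rule, all three VMs run unthrottled because none of them individually exceeds 500. With a 500 IOPS Single-Instance rule, the three together may not exceed 500, so the busier VMs are held back.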
You can create child policies. Maybe you set a maximum for a tenant. Then you create a sub-policy that is assigned to a VM within the limits of the parent policy.
Note that some of this feature comes from the Predictable Data Centers effort by Microsoft Research in Cambridge, UK.
Hyper-V storage PM, Patrick Lang, presented the topic of Distributed Storage QoS at TechEd Europe 2014.