Storage Spaces

DataON Gets Over 1 Million IOPS using Storage Spaces With A 2U JBOD

I work for a European distributor of DataON storage. When Storage Spaces was released with WS2012, DataON was one of the two leading implementers, and to this day, despite the efforts of HP and Dell, I think DataON gives the best balance of:

Performance
Price
Stability
Up-to-date solutions

A few months ago, DataON sent us a document on some benchmark work that was done with their new 12 Gb SAS JBOD. Here are some of the details of the test and the results.

Hardware

DNS-2640D (1 tray) with 24 x 2.5” disk slots
Servers with 2x E5-2660v3 CPUs, 32 GB RAM, 2 x LSI 9300-8e SAS adapters, and 2 x SSDs for the OS – They actually used the server blades from the CiB-9224, but this could have been a DL380 or a Dell R7x0
Windows Server 2012 R2, Build 9600
MPIO configured for Least Blocks (LB) policy
24 x 400GB HGST 12G SSD

A single pool was created. Virtual disks were created as follows:

Test Results

IOMeter was run against the aggregate storage in a number of different scenarios. The results are below:

The headline number is 1.1 million 4K reads per second. But even if we stick to 8K, the JBOD was offering 700,000 reads or 300,000 writes.

I bet this test rig cost a fraction of what the equivalent performing SAN would!

Technorati Tags: Windows Server 2012 R2,Storage Spaces,Storage,Hardware

Microsoft News – 28 September 2015

Wow, the year is flying by fast. There’s a bunch of stuff to read here. Microsoft has stepped up the amount of information being released on WS2016 Hyper-V (and related) features. EMS is growing in terms of features and functionality. And Azure IaaS continues to release lots of new features.

Hyper-V

The Mysterious Case of Infrequent Network Connectivity Issues on 2 Hyper-V VMs Out of 40 Guests: Some PowerShell troubleshooting.
Storage Migration “Operation not allow for VM because Hyper-V State is yet to be initialized from Virtual Machine Configuration” (0x80070015): A fix or workaround to solve this problem.
Missing Hyper-V VMs on Windows 10 – Build 10547: A bug in this insider build hides non v5.0 VMs.
Hyper-converged with Windows Server 2016: Run your storage (Storage Spaces Direct, not SOFS) and Hyper-V cluster on the same hardware.
Configuring site awareness for your multi-active disaggregated datacenters: Solving one of the biggest issues with using Failover Clustering as a DR solution.
Introduction to Linux and FreeBSD on Hyper-V: The Linux Integration Services (LIS). Note that LIS 4.0 can be installed on some distros that have previous versions of LIS built in.
Running Hyper-V on Nano in Windows Server 2016 TP3: For you large scale customers that can afford to lose a host.
Running Nano from Windows Server 2016 TP3 on Hyper-V: Teeny tiny VMs for running born-in-the-cloud services.
Do you know how fast your disks are? I’ve seen expensive customer deployments with over-specced hosts and sh1te SANs where one ordinary VM could effectively shut down the cluster because the SAN couldn’t keep up.
Virtual Machine Storage Resiliency in Windows Server 2016: A very nice resiliency feature from the Hyper-V storage folks.

Windows Client

Windows 10 servicing and deployment guidance now available: Learn about Windows as a Service.
Introduction to Windows 10 servicing: How enterprises can keep devices current with the latest feature upgrades, make better use of Windows Update, and what the new servicing options mean for support lifecycles.
Windows 10 – Reducing the disk footprint: how to deploy Windows 10 with the new smaller installation footprint – great for machines with smaller storage (tablets and ultrabooks)

Azure

Azure Backup – Announcing general availability of backup for Azure IaaS VMs: This vital service for VMs has gone GA with a number of improvements.
Azure Backup of IaaS VMs Generally Available: A post I wrote for Petri.com.
Remote Desktop Services DR Solution using ASR – Guidance: A best practices guide from Microsoft.
SharePoint DR Solution using ASR – Guidance: A best practices guide from Microsoft.
Microsoft Exchange DR Solution using ASR – Guidance: A best practices guide from Microsoft.
A VM that Azure Site Recovery helps protect goes into a resynchronization state: Each VHD must be 1023 GB or less.
Source Control Integration in Azure Automation: Manage your automations scripts.
Monitor your Azure RemoteApp environment with OpInsight – Part 1: A nice hack.
Monitor your Azure RemoteApp environment with OpInsight – Part 2: Continuing the hack.
Microsoft showcases the Azure Cloud Switch (ACS): Linux? Linux! Linux?! Yes; Linux.
September update – Azure Preview Portal improvements: Ibiza continues to evolve in the very long preview. This is starting to feel like the GMail beta. There are some good improvements here, but the lateral scale of the UI needs to be chopped by 80%, or we should get an 80″ Surface Hub for free with every Azure subscription.
Use Azure as Virtual DR Site for VMware and Physical Servers: A post I wrote for Petri.com.
August updates to Azure RemoteApp: Some updates.
How does Azure RemoteApp save user data and settings? I really do not like this solution – I prefer to use a virtual file server with domain membership on the VNet because I have some measure of control.
ADFS Publishing via Azure Traffic Manager : Ah yeah, geo-redundant ADFS.

System Center

Near Real-time Performance Data Collection in OMS: Could this be the start of the replacement of SCOM with a cloud service?
Deploy Software Defined Networks using Virtual Machine Manager : A document based on TP3 of System Center 2016.
2nd Edition of Microsoft System Center: Building a Virtualized Network Solution now available and free to download: Featuring Irish MVP, Damian Flynn.
How to Deploy Host Guardian Service using Service Templates in VMM Tech Preview 3: The new WS2016 Hyper-V security feature.

Office 365

The New Office is Here: In case you missed it, Office 2016 launched this week. O365 customers will get is based on which servicing branch they are in. You might not see it for months if you are in the Current Branch for Business (Pro Plus plans default to this).
What’s new in Office 2016: Why should you upgrade, and what should you look for if you have upgrade rights?
Admins—get ready for Office 2016, rollout begins September 22: I’d tweeted several months ago that I’d heard September was the month, so I am not surprised at all.
Announcing the Office 365 Service Trust Portal: STP is a service feature in Office 365 designed to provide deeper information on how Microsoft manages security, compliance and privacy.
Office 365 Import Service—migration to SharePoint Online and OneDrive for Business just became easier: An out-of-band transfer using hard disks, similar to the Azure import service.
Introducing Office 365 Planner: A “simple and highly visual way to organize teamwork”.
Step-By-Step – Setting up AD FS and Enabling Single Sign-On to Office 365: Set up single sign-on.
How to use your Office 365 subscription with Azure RemoteApp: There’s a new “with O365” template that you can use now. Use your ProPlus installation rights in Azure’s RDS.

EMS

Coming soon – New features for managing Windows 10 and iOS devices: New features between September 25 and October 5.
Day Zero Support for iOS 9 with Intune: Manage your iPhones and iPads.
Azure Active Directory B2B collaboration: Inter-company federation via Azure AD.
Azure AD Premium Dashboard is in preview: Microsoft has turned on the Azure Active Directory Premium Dashboard – available in North America only at this point.
Azure AD B2C and B2B are now in Public Preview:
Learn all about the Azure AD B2B Collaboration Preview: Learn more about the Biz-2-Biz preview.

Security

Tracking a Bluetooth Skimmer Gang in Mexico: A whole new way of skimming your card at the ATM, thanks to the ATM engineers.
Tracking Bluetooth Skimmers in Mexico, Part II: More on this story.
Who’s Behind Bluetooth Skimming in Mexico? A very professional criminal operation.

Miscellaneous

Microsoft heads back to court on September 9th: A post I wrote about the Irish datacenter and US warrant appeal.

Technorati Tags: Microsoft,Hyper-V,Virtualisation,Failover Clustering,Storage,Storage Spaces,Linux,Windows 10,Azure,Cloud,Cloud Computing,Hybrid Cloud,Backup,Remote Desktop Services,DR,Active Directory,System Center,VMM,Security,Office,Office 365,Licensing,EMS,Intune

Software-Defined Storage Calculator and Design Considerations Guide

Microsoft has launched an Excel-based sizing tool to help you plan Storage Spaces (Scale-Out File Server) and guidance on how to design your Storage Spaces deployments.

Here’s the sizing for a very big SOFS that will require 4 x SOFS server nodes and 4 x 60 disk JBODs:

The considerations guide will walk you through using the sizing tool.

Some updates are required – some newer disk sizes aren’t included – but this is a great starting point for a design process.

Technorati Tags: Windows Server 2012 R2,Storage Spaces,Storage,Hyper-V,Virtualisation

Microsoft News 02-June-2015

The big news of the last 24 hours is that Windows 10 will be released on July 29th. I posted before The Verge, etc, that I will be away and not reporting on the release on that date.

Hyper-V

How Shared VHDX Works on Server 2012 R2: V2.0 of SVHDX files is in WS2016.
MVP Carsten Rachfahl interviews Claus Joergenson about Storage Spaces Direct: Video interview recorded at Ignite by the Hyper-V MVP.

Windows Server

Microsoft to hike by 13 percent its user client-access license prices as of August 1: Buy now if you can. CALs are by far the most expensive part of a server purchase for mid-large companies.

Windows Client

And the Release Date for Windows 10 Is… July 29th.
Windows 10 System Requirements: No surprises here.
Will your PC run Windows 10? Use the official compatibility checker to find out
Which Version You Get For Free: You can upgrade from Windows 7 SP1 or Windows 8.1 Update 2. Enterprise editions won’t get the free upgrade offer (requires SA or purchase).
Azure AD Join on Windows 10 devices: The benefits, process, and management of devices which are joined to Azure AD.

Azure

New Azure Infrastructure Services Implementation Guidelines: Start with Azure Infrastructure Services Implementation Guidelines if you are investigating the deployment of your first production IT workload in Azure.
Managing Azure Storage from the Command-Line: Want to delete a previously used storage account? Save half a day and learn a couple of cmdlets. It’s almost like they designed the GUI to force you into it 😉
Recent Enhancements to Azure Site Recovery: The ASR team has been busy, including adding support for Generation 2 VMs, failback to new sites (there now), vSphere and physical servers.

Miscellaneous

13 Things System Administrators Hate About IT Vendors: You’re not alone. As an admin I loved making sales people squirm.
A very interesting infographic from Microsoft: Quietly released as “SMB Mentor Project” this PDF is appears to be aimed at SMB resellers, and I love the tone of it – it’s how I’ve taken to talking down to those who shrug their shoulders at XP and W2003 usage by their customers.

Technorati Tags: Microsoft,Hyper-V,Virtualisation,Storage,Storage Spaces,Windows Server 2016,Licensing,Windows 10,Azure,Active Directory,DR

Microsoft News – 25-May-2015

It’s taken me nearly all day to fast-read through this lot. Here’s a dump of info from Build, Ignite, and since Ignite. Have a nice weekend!

Hyper-V

What’s New in Windows Server 2016 Hyper-V: A post I wrote for Petri.com
Linux Integration Services 4.0 Announcement: As part of this release Microsoft has expanded the access to Hyper-V features and performance to the latest Red Hat Enterprise Linux, CentOS, and Oracle Linux versions.
Linux Integration Services Version 4.0 for Hyper-V: New version of LIS supported Linux distros.
Windows Server 2016 Failover Cluster Troubleshooting Enhancements – Active Dump: This enhancement has significant advantages when you are troubleshooting and getting memory.dmp files from servers running Hyper-V
What’s New in Windows Server Hyper-V: The official statement from Microsoft.
PowerShell Direct – Running PowerShell inside a virtual machine from the Hyper-V host: I wish I had this feature for my WS2012 R2 demos at Ignite.

Windows Server

How to Install Windows Server 2016 Nano in a VM: How to install Nano, the refactored minimal install option for Windows Server designed for cloud apps and micro services.
How Many CSVs Should a Scale-Out File Server Have? A post I wrote for Petri.com
Local Administrator Password Solution (LAPS): The “Local Administrator Password Solution” (LAPS) provides management of local account passwords of domain joined computers. Passwords are stored in Active Directory (AD) and protected by ACL, so only eligible users can read it or request its reset.
Security Thoughts: Microsoft Local Administrator Password Solution (LAPS, KB3062591): A step by step guide.
Storage Spaces Direct: Get the story straight from the PM of this new storage architecture, S2D.
Next-generation storage for the software-defined datacenter: Hear from the boss of the PM of S2D.
Windows Server 2016 Failover Cluster Troubleshooting Enhancements – Cluster Log: This log is valuable for Microsoft’s support as well as those out there who have expertise at troubleshooting failover clusters.
What’s new in SMB 3.1.1 in the Windows Server 2016 Technical Preview 2: In this blog post, you’ll see what changed with the new version of SMB that comes with the Windows 10 Insider Preview released in late April 2015 and the Windows Server 2016 Technical Preview 2 released in early May 2015.
How to display ipconfig on Nano Server every time it boots: At least you’d have some indication that the thing has powered up!
Data Deduplication in Windows Server Technical Preview 2: See what’s currently in the works for WS2016.
What’s new in Windows Server 2016 Technical Preview 2: A subset of features.

Windows Client

Windows 10 and Azure Active Directory – Embracing the Cloud: Since Windows 10’s capabilities to leverage AAD are now starting to appear in the Windows 10 preview builds, this is a great time to explore them in more detail
Azure AD on Windows 10 Personal Devices: How to use Windows 10 with both a personal and a work account at the same time.
Windows 10 Security: Microsoft Passport and Virtual Secure Mode – the latter is based on Hyper-V Shielded VM technology.

System Center

Announcing the availability of System Center 2012 R2 Configuration Manager SP1 and System Center 2012 Configuration Manager SP2: Now generally available and can be downloaded on the Microsoft Evaluation Center. These service packs deliver full compatibility with existing features for Windows 10 deployment, upgrade, and management. Also included in these service packs are new hybrid features for customers using System Center Configuration Manager integrated with Microsoft Intune to manage devices.
Microsoft System Center Operations Manager Management Pack for Windows Server Storage Spaces 2012 R2: Finally!
Windows Azure Pack Websites V2 Update Rollup 6: This release introduces a new MMC application that can be used for install, upgrade and configuration.
Announcing the Microsoft Azure Stack: The new cloud solution from Microsoft, replacing Widows Azure Pack.
Azure Stack – What’s new and what’s changed: A nice description of Azure Stack.
Update Rollup 6 for System Center 2012 R2 Virtual Machine Manager is now available: And allegedly has bugs – I’m stunned!

Azure

Disaster Recovery to Azure enhanced, and we’re listening: Support for Generation 2 VMs, vSphere, physical servers, and more changes coming.
ASR Now Supports NetApp Private Storage for Microsoft Azure: Replicate on-premises NetApp SAN to an NetApp Private Storage appliance in Azure, and orchestrate it using Azure Site Recovery.
New features and innovation: Azure Resource Manager allows deployment of IaaS from JSON templates
Microsoft Azure Marketplace – new features and enhancements: Making the store bigger and more interactive
Introducing App Service Environment: A new scalable and secure environment for Web Apps, spanning the worlds of IaaS and PaaS.
IaaS Just Got Easier. Again: I don’t know about “easier” but JSON templates should be quicker – once you learn this language and syntax.
Build 2015 Azure Storage Announcements! Premium Storage is GA. Azure Files has support.
Azure shines bright at Ignite! A host of announcements from Ignite, including Azure DNS.
Application-Aware Availability Solutions with Azure Site Recovery: Azure Site Recovery solutions have been tested and are now supported for SharePoint, Dynamics AX, Exchange 2013 (single server – AF), Remote Desktop Services, SQL Server (non-replica/cluster members – AF), IIS applications and System Center family like Operations Manager.
Azure Automation – New Graphical and Textual Authoring Features: Revealing Automation through the Ibiza Preview Portal.
Azure Site Recovery at Ignite 2015: The breakout sessions that featured ASR.
Avoid Running Out of Azure Open Credits: A post by me on Petri.com
5 Things That Would Improve Microsoft Azure: A post by me on Petri.com
System Center Orchestrator Migration Toolkit: A collection of tools for migrating integration packs, standard activities, and runbooks from System Center 2012 – Orchestrator to Azure Automation and Service Management Automation.
Azure Cloud App Discovery GA and our new Privileged Identity Management service: How Azure AD can help you and your organization detect and manage risk related to access to cloud resources.
RDP to a Linux Server in Azure: Very handy guide on how to get onto GUI of a Linux VM.
April updates to Azure RemoteApp: Plenty of updates here, some which made the service much easier to use.

Office 365

Modern productivity–Office news at Ignite: What the Office group had to launch at the conference.
Manage change and stay informed in Office 365: Microsoft is adding a new “Select people” option to First Release.

Intune

Announcing support for Windows 10 management with Microsoft Intune: Microsoft announced that Intune now supports the management of Windows 10. All existing Intune features for managing Windows 8.1 and Windows Phone 8.1 will work for Windows 10.
Announcing the Mobile Device Management Design Considerations Guide: If you’re an IT Architect or IT Professional and you need to design a mobile device management (MDM) solution for your organization, there are many questions that you have to answer prior to recommending the best solution for the problem that you are trying to solve. Microsoft has many new options available to manage mobile devices that can match your business and technical requirements.
Mobile Application Distribution Capabilities in Microsoft Intune: Microsoft Intune allows you to upload and deploy mobile applications to iOS, Android, Windows, and Windows Phone devices. In this post, Microsoft will show you how to publish iOS apps, select the users who can download them, and also show you how people in your organization can download these apps on their iOS devices.
Microsoft Intune App Wrapping Tool for Android: Use the Microsoft Intune App Wrapping Tool for Android to modify the behavior of your existing line-of-business (LOB) Android apps. You will then be able to manage certain app features using Intune without requiring code changes to the original application.

Licensing

Disaster Recovery Rights and License Mobility: Fail-over server rights do not apply in the case of software moved to shared third party servers under License Mobility through Software Assurance.

Miscellaneous

Microsoft Ignite 2015 Keynote Highlights for IT Pros: A post I wrote for Petri.com
Getting Started with Microsoft Operations Management Suite: I don’t know if this falls under Azure or not!
Populating machines in Microsoft Operations Management Suite: The next step after getting started.
Protecting your datacenter and cloud from emerging threats: This level 100 post covers a number of areas.

Technorati Tags: Windows Server 2012 R2,Windows Server 2016,Hyper-V,Virtualisation,Storage,Linux,Storage Spaces,PowerShell,Scripting,Deployment,Security,Networking,Windows 10,Active Directory,Azure,Hybrid Cloud,Cloud,Cloud Computing,System Center,Configuration Manager,VMM,Azure Stack,DR,Office 365,Intune,Licensing

Setting Up WS2016 Storage Spaces Direct SOFS

In this post I will show you how to set up a Scale-Out File Server using Windows Server 2016 Storage Spaces Direct (S2D). Note that:

I’m assuming you have done all your networking. Each of my 4 nodes has 4 NICs: 2 for a management NIC team called Management and 2 un-teamed 10 GbE NICs. The two un-teamed NICs will be used for cluster traffic and SMB 3.0 traffic (inter-cluster and from Hyper-V hosts). The un-teamed networks do not have to be routed, and do not need the ability to talk to DCs; they do need to be able to talk to the Hyper-V hosts’ equivalent 2 * storage/clustering rNICs.
You have read my notes from Ignite 2015
This post is based on WS2016 TPv2

Also note that:

I’m building this using 4 x Hyper-V Generation 2 VMs. In each VM SCSI 0 has just the OS disk and SCSI 1 has 4 x 200 GB data disks.
I cannot virtualize RDMA. Ideally the S2D SOFS is using rNICs.

Deploy Nodes

Deploy at least 4 identical storage servers with WS2016. My lab consists of machines that have 4 DAS SAS disks. You can tier storage using SSD or NVMe, and your scalable/slow tier can be SAS or SATA HDD. There can be a max of tiers only: SSD/NVMe and SAS/SATA HDD.

Configure the IP addressing of the hosts. Place the two storage/cluster network into two different VLANs/subnets.

My nodes are Demo-S2D1, Demo-S2D2, Demo-S2D3, and Demo-S2D4.

Install Roles & Features

You will need:

File Services
Failover Clustering
Failover Clustering Manager if you plan to manage the machines locally.

Here’s the PowerShell to do this:

Add-WindowsFeature –Name File-Services, Failover-Clustering –IncludeManagementTools

You can use -ComputerName <computer-name> to speed up deployment by doing this remotely.

Validate the Cluster

It is good practice to do this … so do it. Here’s the PoSH code to validate a new S2D cluster:

Create your new cluster

You can use the GUI, but it’s a lot quicker to use PowerShell. You are implementing Storage Spaces so DO NOT ADD ELGIBLE DISKS. My cluster will be called Demo-S2DC1 and have an IP of 172.16.1.70.

New-Cluster -Name Demo-S2DC1 -Node Demo-S2D1, Demo-S2D2, Demo-S2D3, Demo-S2D4 -NoStorage -StaticAddress 172.16.1.70

There will be a warning that you can ignore:

There were issues while creating the clustered role that may prevent it from starting. For more information view the report file below.

What about Quorum?

You will probably use the default of dynamic quorum. You can either use a cloud witness (a storage account in Azure) or a file share witness, but realistically, Dynamic Quorum with 4 nodes and multiple data copies across nodes (fault domains) should do the trick.

Enable Client Communications

The two cluster networks in my design will also be used for storage communications with the Hyper-V hosts. Therefore I need to configure these IPs for Client communications:

Doing this will also enable each server in the S2D SOFS to register it’s A record of with the cluster/storage NIC IP addresses, and not just the management NIC.

Enable Storage Spaces Direct

This is not on by default. You enable it using PowerShell:

(Get-Cluster).DASModeEnabled=1

Browsing Around FCM

Open up FCM and connect to the cluster. You’ll notice lots of stuff in there now. Note the new Enclosures node, and how each server is listed as an enclosure. You can browse the Storage Spaces eligible disks in each server/enclosure.

Creating Virtual Disks and CSVs

I then create a pool called Pool1 on the cluster Demo-S2DC1 using PowerShell – this is because there are more options available to me than in the UI:

New-StoragePool -StorageSubSystemName Demo-S2DC1.demo.internal -FriendlyName Pool1 -WriteCacheSizeDefault 0 -FaultDomainAwarenessDefault StorageScaleUnit -ProvisioningTypeDefault Fixed -ResiliencySettingNameDefault Mirror -PhysicalDisk (Get-StorageSubSystem -Name Demo-S2DC1.demo.internal | Get-PhysicalDisk)

Get-StoragePool Pool1 | Get-PhysicalDisk |? MediaType -eq SSD | Set-PhysicalDisk -Usage Journal

The you create the CSVs that will be used to store file shares in the SOFS. Rules of thumb:

1 share per CSV
At least 1 CSV per node in the SOFS to optimize flow of data: SMB redirection and redirected IO for mirrored/clustered storage spaces

Using this PoSH you will lash out your CSVs in no time:

$CSVNumber = “4”
$CSVName = “CSV”
$CSV = “$CSVName$CSVNumber”

New-Volume -StoragePoolFriendlyName Pool1 -FriendlyName $CSV -PhysicalDiskRedundancy 2 -FileSystem CSVFS_REFS –Size 200GB
Set-FileIntegrity “C:\ClusterStorage\Volume$CSVNumber” –Enable $false

The last line disables ReFS integrity streams to support the storage of Hyper-V VMs on the volumes. You’ll see from the screenshot what my 4 node S2D SOFS looks like, and that I like to rename things:

Note how each CSV is load balanced. SMB redirection will redirect Hyper-V hosts to the owner of a CSV when the host is accessing files for a VM that is stored on that CSV. This is done for each VM connection by the host using SMB 3.0, and ensures optimal flow of data with minimized/no redirected IO.

There are some warnings from Microsoft about these volumes:

They are likely to become inaccessible on later Technical Preview releases.
Resizing of these volumes is not supported.
Oops! This is a technical preview and this should be pure lab work that your willing to lose.

Create a Scale-Out File Server

The purpose of this post is to create a SOFS from the S2D cluster, with the sole purpose of the cluster being to store Hyper-V VMs that are accessed by Hyper-V hosts via SMB 3.0. If you are building a hyperconverged cluster (not supported by the current TPv2 preview release) then you stop here and proceed no further.

Each of the S2D cluster nodes and the cluster account object should be in an OU just for the S2D cluster. Edit the advanced security of the OU and grand the cluster account object create computer object and delete compute object rights. If you don’t do this then the SOFS role will not start after this next step.

Next, I am going to create an SOFS role on the S2D cluster, and call it Demo-S2DSOFS1.

New-StorageFileServer -StorageSubSystemName Demo-S2DC1.demo.internal -FriendlyName Demo-S2DSOFS1 -HostName Demo-S2DSOFS1Create a -Protocols SMB

Create and Permission Shares

Create 1 share per CSV. If you need more shares then create more CSVs. Each share needs the following permissions:
- Each Hyper-V host
- Each Hyper-V cluster
- The Hyper-V administrators
You can use the following PoSH to create and permission your shares. I name the share folder and share name after the CSV that it is stored on, so simply change the $ShareName variable to create lots of shares, and change the permissions as appropriate.

$ShareName = “CSV1”
$SharePath = “$RootPath\$ShareName\$ShareName”

md $SharePath
New-SmbShare -Name $ShareName -Path $SharePath -FullAccess Demo-Host1$, Demo-Host2$, Demo-HVC1$, “Demo\Hyper-V Admins”
Set-SmbPathAcl -ShareName $ShareName

Create Hyper-V VMs

On your hosts/clusters create VMs that store all of their files on the path of the SOFS, e.g. \\Demo-S2DSOFS1\CSV1\VM01, \\Demo-S2DSOFS1\CSV1\VM02, etc.

Remember that this is a Preview Release

This post was written not long after the release of TPv2:
- Expect bugs – I am experiencing at least one bad one by the looks of it
- Don’t expect support for a rolling upgrade of this cluster
- Bad things probably will happen
- Things are subject to change over the next year
Technorati Tags: Windows Server 2016,Hyper-V,Storage Spaces,Storage,Virtualisation,Failover Clustering,PowerShell

Ignite 2015 – Storage Spaces Direct (S2D)

This session, presented by Claus Joergensen, Michael Gray, and Hector Linares can be found on Channel 9:

Current WS2012 R2 Scale-Out File Server

This design is known as converged (not hyper-converged). There are two tiers:

Compute tier: Hyper-V hosts that are connected to storage by SMB 3.0 networking. Virtual machine files are stored on the SOFS (storage tier) via file shares.
Storage tier: A transparent failover cluster that is SAS-attached to shared JBODs. The JBODs are configured with Storage Spaces. The Storage Spaces virtual disks are configured as CSVs, and the file shares that are used by the compute tier are kept on these CSVs.

The storage tier or SOFS has two layers:

The transparent failover cluster nodes
The SAS-attached shared JBODs that each SOFS node is (preferably) direct-connected to

System Center is an optional management layer.

Introducing Storage Spaces Direct (S2D)

Note: you might hear/see/read the term SSDi (there’s an example in one of the demos in the video). This was an old abbreviation. The correct abbreviation for Storage Spaces Direct is S2D.

The focus of this talk is the storage tier. S2D collapses this tier so that there is no need for a SAS layer. Note, though, that the old SOFS design continues and has scenarios where it is best. S2D is not a replacement – it is another design option.

S2D can be used to store VM files on it. It is made of servers (4 or more) that have internal or DAS disks. There are no shared JBODs. Data is mirrored across each node in the S2D cluster, therefore the virtual disks/CSVs are mirrored across each node in the S2D cluster.

S2D introduces support for new disks (with SAS disks still being supported):

Low cost flash with SATA SSDs
Better flash performance with NVMe SSDs

Other features:

Simple deployment – no eternal enclosures or SAS
Simpler hardware requirements – servers + network, and no SAS/MPIO, and no persistent reservations and all that mess
Easy to expand – just add more nodes, and get storage rebalancing
More scalability – at the cost of more CPUs and Windows licensing

S2D Deployment Choice

You have two options for deploying S2D. Windows Server 2016 will introduce a hyper-converged design – yes I know; Microsoft talked down hyper-convergence in the past. Say bye-bye to Nutanix. You can have:

Hyper-converged: Where there are 4+ nodes with DAS disks, and this is both the compute and storage tier. There is no other tier, no SAS, nothing, just these 4+ servers in one cluster, each sharing the storage and compute functions with data mirrored across each node. Simple to deploy and MSFT thinks this is a sweet spot for SME deployments.
Converged (aka Private Cloud Storage): The S2D SOFS is a separate tier to the compute tier. There are a set of Hyper-V hosts that connect to the S2D SOFS via SMB 3.0. There is separate scaling between the compute and storage tiers, making it more suitable for larger deployments.

Hyper-convergence is being tested now and will be offered in a future release of WS2016.

Choosing Between Shared JBODs and DAS

As I said, shared JBOD SOFS continues as a deployment option. In other words, an investment in WS2012 R2 SOFS is still good and support is continued. Node that shared JBODs offer support for dual parity virtual disks (for archive data only – never virtual machines).

S2D adds support for the cheapest of disks and the fastest of disks.

Under The Covers

This is a conceptual, not an architectural, diagram.

A software storage bus replaces the SAS shared infrastructure using software over an Ethernet channel. This channel spans the entire S2D cluster using SMB 3.0 and SMB Direct – RDMA offers low latency and low CPU impact.

On top of this bus that spans the cluster, you can create a Storage Spaces pool, from which you create resilient virtual disks. The virtual disk doesn’t know that it’s running on DAS instead of shared SAS JBOD thanks to the abstraction of the bus.

File systems are put on top of the virtual disk, and this is where we get the active/active CSVs. The file system of choice for S2D is ReFS. This is the first time that ReFS is the primary file system choice.

Depending on your design, you either run the SOFS role on the S2D cluster (converged) or you run Hyper-V virtual machines on the S2D cluster (hyper-converged).

System Center is an optional management layer.

Data Placement

Data is stored in the form of extents. Each extent is 1 GB in size so a 100 GB virtual disk is made up of 100 extents. Below is an S2D cluster of 5 nodes. Note that extents are stored evenly across the S2D cluster. We get resiliency by spreading data across each node’s DAS disks. With 3-way mirroring, each extent is stored on 3 nodes. If one node goes down, we still have 2 copies, from which data can be restored onto a different 3rd node.

Note: 2-way mirroring would keep extents on 2 nodes instead of 3.

Extent placement is rebalanced automatically:

When a node fails
The S2D cluster is expanded

How we get scale-out and resiliency:

Scale-Out: Spreading extents across nodes for increased capacity.
Resiliency: Storing duplicate extents across different nodes for fault tolerance.

This is why we need good networking for S2D: RDMA. Forget your 1 Gbps networks for S2D.

Scalability

Scaling to large pools: Currently we can have 80 disks in a pool. in TPv2 we can go to 240 disks but this could be much higher.
The interconnect is SMB 3.0 over RDMA networking for low latency and CPU utilization
Simple expansion: you just add a node, expand the pool, and the extents are rebalanced for capacity … extents move from the most filled nodes to the most available nodes. This is a transparent background task that is lower priority than normal IOs.

You can also remove a system: rebalance it, and shrink extents down to fewer nodes.

Scale for TPv2:

Minimum of 4 servers
Maximum of 12 servers
Maximum of 240 disks in a single pool

Availability

S2D is fault tolerance to disk enclosure failure and server failure. It is resilient to 2 servers failing and cluster partitioning. The result should be uninterrupted data access.

Each S2D server is treated as the fault domain by default. There is fault domain data placement, repair and rebalancing – means that there is no data loss by losing a server. Data is always placed and rebalanced to recognize the fault domains, i.e. extents are never stored in just a single fault domain.

If there is a disk failure, there is automatic repair to the remaining disks. The data is automatically rebalanced when the disk is replaced – not a feature of shared JBOD SOFS.

If there is a temporary server outage then there is a less disruptive automatic data resync when it comes back online in the S2D cluster.

When there is a permanent server failure, the repair is controlled by the admin – the less disruptive temporary outage is more likely so you don’t want rebalancing happening then. In the event of a real permanent server loss, you can perform a repair manually. Ideally though, the original machine will come back online after a h/w or s/w repair and it can be resynced automatically.

ReFS – Data Integrity

Note that S2D uses ReFS (pronounced as Ree-F-S) as the file system of choice because of scale, integrity and resiliency:

Metadata checksums protect all file system metadata
User data checksums protect file data
Checksum verification occurs on every read of checksum-protected data and during periodic background scrubbing
Healing of detected corruption occurs as soon as it is detected. Healthy version is retrieved from a duplicate extent in Storage Spaces, if available; ReFS uses the healthy version to get Storage Spaces to repair the corruption.

No need for chkdsk. There is no disruptive offline scanning in ReFS:

The above “repair on failed checksum during read” process.
Online repair: kind of like CHKDSK but online
Backups of critical metadata are kept automatically on the same volume. If the above repair process fails then these backups are used. So you get the protection of extent duplication or parity from Storage Spaces and you get critical metadata backups on the volume.

ReFS – Speed and Efficiency

Efficient VM checkpoints and backup:

VHD/X checkpoints (used in file-based backup) are cleaned up without physical data copies. The merge is a metadata operation. This reduces disk IO and increases speed. (this is clever stuff that should vastly improve the disk performance of backups).
Reduces the impact of checkpoint-cleanup on foreground workloads. Note that this will have a positive impact on other things too, such as Hyper-V Replica.

Accelerated Fixed VHD/X Creation:

Fixed files zero out with just a metadata operation. This is similar to how ODX works on some SANs
Much faster fixed file creation
Quicker deployment of new VMs/disks

Yay! I wonder how many hours of my life I could take back with this feature?

Dynamic VHDX Expansion

The impact of the incremental extension/zeroing out of dynamic VHD/X expansion is eliminated too with a similar metadata operation.
Reduces the impact too on foreground workloads

Demo 1:

2 identical VMs, one on NTFS and one on ReFS. Both have 8 GB checkpoints. Deletes the checkpoint from the ReFS VM – The merge takes about 1-2 seconds with barely any metrics increase in PerfMon (incredible improvement). Does the same on the NTFS VM and … PerfMon shows way more activity on the disk and the process will take about 3 minutes.

Demo 2:

Next he creates 15 GB fixed VHDX files on two shares: one on NTFS and one on ReFS. ReFS file is created in less than a second while the previous NTFS merge demo is still going on. The NTFS file will take …. quite a while.

Demo 3:

Disk Manager open on one S2D node: 20 SATA disks + 4 Samsung NVMe disks. The S2D cluster has 5 nodes. There is a total of 20 NVMe devices in the single pool – a nice tidy aggregation of PCIe capacity. The 5 node is new so no rebalancing has been done.

Lots of VMs running from a different tier of compute. Each VM is running DiskSpd to stress the storage. But distributed Storage QoS is limiting the VMs to 100 IOPs each.

Optimize-StoragePool –FriendlyName SSDi is run to bring the 5th node into the cluster (called SSDi) online by rebalancing. Extents are remapped to the 5th node. The system goes full bore to maximize IOPS – but note that “user” operations take precedence and the rebalancing IOPS are lower priority.

Storage Management in the Private Cloud

Management is provided by SCOM and SCVMM. This content is focused on S2D, but the management tools also work with other storage options:

SOFS with shared JBOD
SOFS with SAN
SAN

Roles:

VMM: bare-metal provisioning, configuration, and LUN/share provisioning
SCOM: Monitoring and alerting
Azure Site Recovery (ASR) and Storage Replica: Workload failover

Note: You can also use Hyper- Replica with/without ASR.

Demo:

He starts the process of bare-metal provisioning a SOFS cluster from VMM – consistent with the Hyper-V host deployment process. This wizard offers support for DAS or shared JBOD/SAN; this affects S2D deployment and prevents unwanted deployment of MPIO. You can configure existing servers or deploy a physical computer profile to do a bare-metal deployment via BMCs in the targeted physical servers. After this is complete, you can create/manage pools in VMM.

File server nodes can be added from existing machines or bare-metal deployment. The disks of the new server can be added to the clustered Storage Spaces pool. Pools can be tiered (classified). Once a pool is created, you can create a file share – this provisions the virtual disk, configures CSV, and sets up the file system for you – lots of automation under the covers. The wizard in VMM 2016 includes resiliency and tiering.

Monitoring

Right now, SCOM must do all the work – gathering a data from a wide variety of locations and determining health rollups. There’s a lot of management pack work there that is very hardware dependent and limits extensibility.

Microsoft reimagined monitoring by pushing the logic back into the storage system. The storage system determines health of the storage system. Three objects are reported to monitoring (PowerShell, SCOM or 3rd party, consumable through SMAPI):

The storage system: including node or disk fails
Volumes
File shares

Alerts will be remediated automatically where possible. The system automatically detects the change of health state from error to healthy. Updates to external monitoring takes seconds. Alerts from the system include:

Urgency
The recommended remediation action

Demo:

One of the cluster nodes is shut down. SCOM reports that a node is missing – there isn’t additional noise about enclosures, disks, etc. The subsystem abstracts that by reporting the higher error – that the server is down. The severity is warning because the pool is still online via the rest of the S2D cluster. The priority is high because this server must be brought back online. The server is restarted, and the alert remediates automatically.

Hardware Platforms

Storage Spaces/JBODs has proven that you cannot use just any hardware. In my experience, DataON stuff (JBOD, CiB, HGST SSD and Seagate HDD) is reliable. On the other hand, SSDs by SanDisk are shite, and I’ve had many reports of issues with Intel and Quanta Storage Spaces systems.

There will be prescriptive configurations though partnerships, with defined platforms, components, and configuration. This is a work in progress. You can experiment with Generation 2 VMs.

S2D Development Partners

I really hope that we don’t see OEMs creating “bundles” like they did for pre-W2008 clustering that cost more than the sum of the otherwise-unsupported individual components. Heck, who am I kidding – of course they will do that!!! That would be the kiss of death for S2D.

FAQ

The Importance of RDMA

Demo Video:

They have two systems connected to a 4 node S2D cluster, with a sum total of 1.2 million 4K IOPS with below 1 millisecond latency, thanks to (affordable) SATA SSDs and Mellanox ConnectX-3 RDMA networking (2 x 40 Gbps ports per client). They remove RDMA from each client system. IOPS is halved and latency increases to around 2 milliseconds. RDMA is what enables low latency and low CPU access to the potential of the SSD capacity of the storage tier.

Hint: the savings in physical storage by using S2D probably paid for the networking and more.

Questions from the Audience

DPM does not yet support backing up VMs that are stored on ReFS.
You do not do SMB 3.0 loopback for hyper-convergence. SMB 3.0 is not used … Hyper-V just stores the VMs on the local CSVs of the S2D cluster.
There is still SMB redirecting in the converged scenario, A CSV is owned by a node, with CSV ownership balancing. When host connects to a share, it is redirected to the owner of the CSV, therefore traffic should be balanced to the separate storage tier.
In hyper-convergence, the VM might be on node A and the CSV owner might be on another node, with extents all over the place. This is why RDMA is required to connect the S2D nodes.
Which disk with the required extents do they read from? They read from the disk with the shorted queue length.
Yes, SSD tiering is possible, including write-back cache, but it sounds like more information is yet to be released.
They intend to support all-flash systems/virtual disks

Technorati Tags: Event Notes,Windows Server 2016,Storage,Storage Spaces,Failover Clustering,System Center,VMM

Windows Server Technical Preview – Distributed Storage QoS

In a modern data centre, there is more and more resource centralization happening. Take a Microsoft cloud deployment for example, such as what Microsoft does with CPS or what you can do with Windows Server (and maybe System Center). A chunk of a rack can contain over a petabyte of RAW storage in the form of a Scale-Out File Server (SOFS) and the rest of the rack is either hosts or TOR networking. With this type of storage consolidation, we have a challenge: how do we ensure that each guest service gets the storage IOPS that it requires?

From a service providers perspective:

How do we provide storage performance SLAs?
How do we price-band storage performance (pay more to get more IOPS)?

Up to now with Hyper-V you required a SAN (such as Tintrí) to do some magic on the backend. WS2012 R2 Hyper-V added a crude storage QoS method (maximum rule only) that was performed on at the host and not at the storage. So:

There was no minimum or SLA-type rule, only a cap.
QoS rules were not distributed so there was no accounting on host X what Hosts A-W were doing to the shared storage system.

Windows Server vNext is adding Distributed Storage QoS that is the function of a partnership between Hyper-V hosts and a SOFS. Yes: you need a SOFS – but remember that a SOFS can be 2-8 clustered Window Servers that are sharing a SAN via SMB 3.0 (no Storage Spaces in that design).

Note: the hosts use a new protocol called MS-SQOS (based on SMB 3.0 transport) to partner with the SOFS.

Distributed Storage QoS is actually driven from the SOFS. There are multiple benefits from this:

Centralized monitoring (enabled by default on the SOFS)
Centralized policy management
Unified view of all storage requirements of all hosts/clusters connecting to this SOFS

Policy (PowerShell – System Center vNext will add management and monitoring support for Storage QoS) is created on the SOFS, based on your monitoring or service plans. An IO Scheduler runs on each SOFS node, and the policy manager data is distributed. The Policy Manager (a HA cluster resource on the SOFS cluster) pushes (MS-SQOS) policy up the Hyper-V hosts where Rate Limiters restrict the IOPS of virtual machines or virtual hard disks.

There are two kinds of QoS policy that you can create:

Single-Instance: The resources of the rule are distributed or shared between VMs. Maybe a good one for a cluster/service or a tenant, e.g. a tenant gets 500 IOPS that must be shared by all of their VMs
Multi-Instance: All VMs/disks get the same rule, e.g. each targeted VM gets a maximum of 500 IOPS. Good for creating VM performance tiers, e.g. bronze, silver, gold with each tier offering different levels of performance for an individual VM

You can create child policies. Maybe you set a maximum for a tenant. Then you create a sub-policy that is assigned to a VM within the limits of the parent policy.

Note that some of this feature comes from the Predictable Data Centers effort by Microsoft Research in Cambridge, UK.

Hyper-V storage PM, Patrick Lang, presented the topic of Distributed Storage QoS at TechEd Europe 2014.

Technorati Tags: Hyper-V,Windows Server 2015,Virtualisation,Storage,Storage Spaces,Failover Clustering,Networking

New Features in Windows Server 2016 (WS2016) Hyper-V

I’m going to do my best (no guarantees – I only have one body and pair of ears/eyes and NDA stuff is hard to track!) to update this page with a listing of each new feature in Windows Server 2016 (WS2016) Hyper-V and Hyper-V Server 2016 after they are discussed publicly by Microsoft. The links will lead to more detailed descriptions of each feature.

Note, that the features of WS2012 can be found here and the features of WS2012 R2 can be found here.

This list was last updated on 25/May/2015 (during Technical Preview 2).

Active memory dump

Windows Server 2016 introduces a dump type of “Active memory dump”, which filters out most memory pages allocated to VMs making the memory.dmp file much smaller and easier to save/copy.

Azure Stack

A replacement for Windows Azure Pack (WAPack), bringing the code of the “Ibiza” “preview portal” of Azure to on-premises for private cloud or hosted public cloud. Uses providers to interact with Windows Server 2016. Does not require System Center, but you will want management for some things (monitoring, Hyper-V Network Virtualization, etc).

Azure Storage

A post-RTM update (flight) will add support for blobs, tables, and storage accounts, allowing you to deploy Azure storage on-premises or in hosted solutions.

Backup Change Tracking

Microsoft will include change tracking so third-party vendors do not need to update/install dodgy kernel level file system filters for change tracking of VM files.

Binary VM Configuration Files

Microsoft is moving away from text-based files to increase scalability and performance.

Cluster Cloud Witness

You can use Azure storage as a witness for quorum for a multi-site cluster. Stores just an incremental sequence number in an Azure Storage Account, secured by an access key.

Cluster Compute Resiliency

Prevents the cluster from failing a host too quickly after a transient error. A host will go into isolation, allowing services to continue to run without disruptive failover.

Cluster Functional Level

A rolling upgrade requires mixed-mode clusters, i.e. WS2012 R2 and Windows Server vNext hosts in the same cluster. The cluster will stay and WS2012 R2 functional level until you finish the rolling upgrade and then manually increase the cluster functional level (one-way).

Cluster Quarantine

If a cluster node is flapping (going into & out of isolation too often) then the cluster will quarantine a node, and drain it of resources (Live Migration – see MoveTypeThreshold and DefaultMoveType).

Cluster Rolling Upgrade

You do not need to create a new cluster or do a cluster migration to get from WS2012 R2 to Windows Server vNext. The new process allows hosts in a cluster to be rebuilt IN THE EXISTING cluster with Windows Server vNext.

Containers

Deploy born-in-the-cloud stateless applications using Windows Server Containers or Hyper-V Containers.

Converged RDMA

Remote Direct Memory Access (RDMA) NICs (rNICs) can be converged to share both tenant and host storage/clustering traffic roles.

Delivery of Integration Components

This will be done via Windows Update

Differential Export

Export just the changes between 2 known points in time. Used for incremental file-based backup.

Distributed Storage QoS

Enable per-virtual hard disk QoS for VMs stored on a Scale-Out File Server, possibly also available for SANs.

File-Based Backup

Hyper-V is decoupling from volume backup for scalability and reliability reasons

Host Resource Protection

An automated process for restricting resource availability to VMs that display unwanted “patterns of access”.

Hot-Add & Hot-Remove of vNICs

You can hot-add and hot-remove virtual NICs to/from a running virtual machine.

Hyper-convergence

This is made possible with Storage Spaces Direct and is aimed initially at smaller deployments.

Hyper-V Cluster Management

A new administration model that allows tools to abstract the cluster as a single host. Enables much easier VM management, visible initially with PowerShell (e.g. Get-VM, etc).

Hyper-V Replica & Hot Add of Disks

You can add disks to a virtual machine that is already being replicated. Later you can add the disks to the replica set using Set-VMReplication.

Hyper-V Manager Alternative Credentials

With CredSSP-enabled PCs and hosts, you can connect to a host with alternative credentials.

Hyper-V Manager Down-Level Support

You can manage Windows Server vNext, WS2012 R2 and WS2012 Hyper-V from a single console

Hyper-V Manager WinRM

WinRM is used to connect to hosts.

MS-SQOS

This is a new protocol for Microsoft Storage QoS. It uses SMB 3.0 as a transport, and it describes the conversation between Hyper-V compute nodes and the SOFS storage nodes. IOPS, latency, initiator names, imitator node information is sent from the compute nodes to the storage nodes. The storage nodes, send back the enforcement commands to limit flows, etc.

Nested Virtualization

Yes, you read that right! Required for Hyper-V containers in a hosted environment, e.g. Azure. Side-effect is that WS2016 Hyper-V can run in WS2016 via virtualization of VT-X.

Network Controller

A new fabric management feature built-into Windows Server, offering many new features that we see in Azure. Examples are a distributed firewall and software load balancer.

Online Resize of Memory

Change memory of running virtual machines that don’t have Dynamic Memory enabled.

Power Management

Hyper-V has expanded support for power management, including Connected Standby

PowerShell Direct

Target PowerShell at VMs via the hypervisor (VMbus) without requiring network access. You still need local admin credentials for the guest OS.

Pre-Authentication Integrity

When talking from one machine to the next via SMB 3.1.1. This is a security feature that uses checks on the sender & recipient side to ensure that there is no man-in-the-middle.

Production Checkpoints

Using VSS in the guest OS to create a consistent snapshots that workload services should be able to support. Applying a checkpoint is like performing a VM restore from backup.

Nano Server

A new installation option that allows you to deploy headless Windows Servers with tiny install footprint and no UI of any kind. Intended for storage and virtualization scenarios at first. There will be a web version of admin tools that you can deploy centrally.

RDMA to the Host

Remote Direct Memory Access will be supported to the management OS virtual NICs via converged networking.

ReFS Accelerated VHDX Operations

Operations are accelerated by converting them into metadata operations: fixed VHDX creation, dynamic VHDX extension, merge of checkpoints (better file-based backup).

RemoteFX

OpenFL 4.4 and OpenCL 1.1 API are supported.

Replica Support for Hot-Add of VHDX

When you hot-add a VHDX to a running VM that is being replicated by Hyper-V Replica, the VHDX is available to be added to the replica set (MSFT doesn’t assume that you want to replicate the new disk).

Replica support for Cross-Version Hosts

Your hosts can be of different versions.

Runtime Memory Resize

You can increase or decrease the memory assigned to Windows Server vNext guests.

Secure Boot for Linux

Enable protection of the boot loader in Generation 2 VMs

Shared VHDX Improvements

You will be able to do host-based snapshots of Shared VHDX (so you get host-level backups) and guest clusters. You will be able to hot-resize a Shared VHDX.

Shared VHDX will have its own hardware category in the UI. Note that there is a new file format for Shared VHDX. There will be a tool to upgrade existing files.

Shielded Virtual Machines

A new security model that hardens Hyper-V and protects virtual machines against unwanted tampering at the fabric level.

SMB 3.1.1

This is a new version of the data transport protocol. The focus has been on security. There is support for mixed mode clusters so there is backwards compatibility. SMB 3.02 is now called SMB 3.0.2.

SMB Negotiated Encryption

Moving from AES CCM to AES GCM (Galois Counter Mode) for efficiency and performance. It will leverage new modern CPUs that have instructions for AES encryption to offload the heavy lifting.

SMB Forced Encryption

In older versions of SMB, SMB encryption was opt-in on the client side. This is no longer the case in the next version of Windows Server.

Storage Accounts

A later release of WS2016 will bring support for hosting Azure-style Storage accounts, meaning that you can deploy Azure-style storage on-premises or in a hosted cloud.

Storage Replica

Built-in, hardware agnostic, synchronous and asynchronous replication of Windows Storage, performed at the file system level (volume-based). Enables campus or multi-site clusters.

Requires GPT. Source and destination need to be the same size. Need low latency. Finish the solution with the Cluster Cloud Witness.

Storage Spaces Direct (S2D)

A “low cost” solution for VM storage. A cluster of nodes using internal (DAS) disks (SAS or SATA, SSD, HDD, or NVMe) to create a consistent storage spaces pools that stretch across the servers. Compute is normally on a different cluster (converged) but it can be on one tier (hyper-converged)

Storage Transient Failures

Avoid VM bugchecks when storage has a transient issue. The VM freezes while the host retries to get storage back online.

Stretch Clusters

The preferred term for when Failover Clustering spans two sites.

System Center 2016

Those of you who can afford the per-host SMLs will be able to get System Center 2016 to manage your shiny new Hyper-V hosts and fabric.

System Requirements

The system requirements for a server host have been increased. You now must have support for Second-Level Address Translation (SLAT), known as Intel EPT or AMD RVI or NPT. Previously SLAT (Intel Nehalem and later) was recommended but not required on servers and required on Client Hyper-V. It shouldn’t be an issue for most hosts because SLAT has been around for quite some time.

Virtual Machine Groups

Group virtual machines for operations such as orchestrated checkpoints (even with shared VHDX) or group checkpoint export.

Virtual Machine ID Management

Control whether a VM has same or new ID as before when you import it.

Virtual Network Adapter Identification

Not vCDN! You can create/name a vNIC in the settings of a VM and see the name in the guest OS.

Virtual Secure Mode (VSM)

A feature of Windows 10 Enterprise that protects LSASS (secret keys) from pass-the-hash attacks by storing the process in a stripped down Hyper-V virtual machine.

Virtual TPM (vTPM)

A feature of shielded virtual machines that enables secure boot, disk encrypting within the virtual machine, and VSC.

VM Storage Resiliency

A VM will pause when the physical storage of that VM goes offline. Allows the storage to come back (maybe Live Migration) without crashing the VM.

VM Upgrade Process

VM versions are upgraded manually, allowing VMs to be migrated back down to WS2012 R2 hosts with support from Microsoft.

VXLAN Support

The new Network Controller will support VXLAN as well as the incumbent NVGRE for network virtualization.

Windows Containers

This is Docker in Windows Server, enabling services to run in containers on a shared set of libaries on an OS, giving you portability, per-OS density, and fast deployment.

TEE14 – Software Defined Storage in Windows Server vNext

Speaker: Siddhartha Roy

Software-Defined Storage gives you choice. It’s a breadth offering and unified platform for MSFT workloads and public cloud scale. Economical storage for private/public cloud customers.

About 15-20% of the room has used Storage Spaces/SOFS.

What is SDS? Cloud scale storage and cost economics on standard, volume hardware. Based on what Azure does.

Where are MSFT in the SDS Journey Today?

In WS2012 we got Storage Spaces as a cluster supported storage system. No tiering. We could build a SOFS using cluster supported storage, and present that to Hyper-V hosts via SMB 3.0.

Storage Spaces: Storage based on economical JBOD h/w
SOFS: Transparent failover, continuously available application storage platform.
SMB 3.0 fabric: high speed, and low latency can be added with RDMA NICs.

What’s New in Preview Release

Greater efficiency
More uptime
Lower costs
Reliability at scale
Faster time to value: get customers to adopt the tech

Storage QoS

Take control of the service and offer customers different bands of service.

Enabled by default on the SOFS. 2 metrics used: latency and IOPS. You can define policies around IOPS by using min and max. Can be flexible: on VHD level, VM level, or tenant/service level.

It is managed by System Center and PoSH. You have an aggregated end-end view from host to storage.

Patrick Lang comes on to do a demo. There is a file server cluster with 3 nodes. The SOFS role is running on this cluster. There is a regular SMB 3.0 file share. A host has 5 VMs running on it, stored on the share. One OLTP VM is consuming 8-10K IOPS using IOMETER. Now he uses PoSH to query the SOFS metrics. He creates a new policy with min 100 and max 200 for a bunch of the VMs. The OLTP workload gets a policy with min of 3000 and max of 5000. Now we see its IOPS drop down from 8-10K. He fires up VMs on another host – not clustered – the only commonality is the SOFS. These new VMs can take IOPS. A rogue one takes 2500 IOPS. All of the other VMs still get at least their min IOPS.

Note: when you look at queried data, you are seeing an average for the last 5 minutes. See Patrick Lang’s session for more details.

Rolling Upgrades – Faster Time to Value

Cluster upgrades were a pain. They get much easier in vNext. Take a node offline. Rebuild it in the existing cluster. Add it back in, and the cluster stays in mixed mode for a short time. Complete the upgrades within the cluster, and then disable mixed mode to get new functionality. The “big red switch” is a PoSH cmdlet to increase the cluster functional level.

Cloud Witness

A third site witness for multi-site cluster, using a service in Azure.

Compute Resiliency

Stops the cluster from being over aggressive with transient glitches.

Related to this is quarantine of flapping nodes. If a node is in and out of isolation too much, it is “removed” from the cluster. The default quarantine is 2 hours – give the admin a chance to diagnose the issue. VMs are drained from a quarantined node.

Storage Replica

A hardware agnostic synchronous replication system. You can stretch a cluster with low latency network. You get all the bits in the box to replicate storage. It uses SMB 3.0 as a transport. Can use metro-RDMA to offload and get low latency. Can add SMB encryption. Block-level synchronous requires <5MS latency. There is also an asynchronous connection for higher latency links.

The differences between synch and asynch:

Ned Pyle, a storage PM, comes on to demo Storage Replica. He’ll do cluster-cluster replication here, but you can also do server-server replication.

There is a single file server role on a cluster. There are 4 nodes in the cluster. There is assymetric clustered storage. IE half the storage on 2 nodes and the other half on the other 2 nodes. He’s using iSCSI storage in this demo. It just needs to be cluster supported storage. He right-clicks on a volume and selects Replication > Enable Replication … a wizard pops up. He picked a source disk. Clustering doesn’t do volumes … it does disks. If you do server-server repliction then you can replicate a volume. Picks a source replication log disk. You need to use a GPT disk with a file system. Picks a destination disk to replicate to, and a destination log disk. You can pre-seed the first copy of data (transport a disk, restore from backup, etc). And that’s it.

Now he wants to show a failover. Right now, the UI is buggy and doesn’t show a completed copy. Check the event logs. He copies files to the volume in the source site. Then moves the volume to the DR site. Now the replicated D: drive appears (it was offline) and all the files are there in the DR site ready to be used.

After the Preview?

Storage Spaces Shared Nothing – Low Cost

This is a no-storage-tier converged storage cluster. You create storage spaces using internal storage in each of your nodes. To add capacity you add nodes.

You get rid of the SAS layer and you can use SATA drives. The cost of SSD plummets with this system.

You can grow pools to hundreds of disks. A scenario is for primary IaaS workloads and for storage for backup/replication targets.

There is a prescriptive hardware configuration. This is not for any server from any shop. Two reasons:

Lots of components involved. There’s a lot of room for performance issues and failure. This will be delivered by MSFT hardware partners.
They do not converge the Hyper-V and storage clusters in the diagram (above). They don’t recommend convergence because the rates of scale in compute and storage are very different. Only converge in very small workloads. I have already blogged this on Petri with regards to converged storage – I don’t like the concept – going to lead to a lot of costly waste.

VM Storage Resiliency

A more graceful way of handling a storage path outage for VMs. Don’t crash the VM because of a temporary issue.

CPS – But no … he’s using this as a design example that we can implement using h/w from other sources (soft focus on the image).

Not talked about but in Q&A: They are doing a lot of testing on dedupe. First use case will be on backup targets. And secondary: VDI.

Data consistency is done by a Storage Bus Layer in the shared notching Storage Spaces system. It slips into Storage Spaces and it’s used to replicate data across the SATA fabric and expands its functionality. MSFT thinking about supporting 12 nodes, but architecturally, this feature has no limit in the number of nodes.

Technorati Tags: Event Notes,Windows Server 2015,Storage,Storage Spaces,Failover Clustering,Networking

Hardware

Storage Spaces

Test Results

Hyper-V

Windows Client

Azure

System Center

Office 365

EMS

Security

Miscellaneous

Hyper-V

Windows Server

Windows Client

Azure

Miscellaneous

Hyper-V

Windows Server

Windows Client

System Center

Azure

Office 365

Intune

Licensing

Miscellaneous

Deploy Nodes

Install Roles & Features

Validate the Cluster

Create your new cluster

What about Quorum?

Enable Client Communications

Enable Storage Spaces Direct

Browsing Around FCM

Creating Virtual Disks and CSVs

Create a Scale-Out File Server

Create and Permission Shares

Create Hyper-V VMs

Remember that this is a Preview Release

Current WS2012 R2 Scale-Out File Server

Introducing Storage Spaces Direct (S2D)

S2D Deployment Choice

Choosing Between Shared JBODs and DAS

Under The Covers

Data Placement

Scalability

Availability

ReFS – Data Integrity

ReFS – Speed and Efficiency

Storage Management in the Private Cloud

Monitoring

Hardware Platforms

S2D Development Partners

FAQ

The Importance of RDMA

Questions from the Audience