Speaker: Brian Matthew
The start was some metrics/achievements stuff. Summary: lots of IOPS.
Hardware
It’s simple and cost effective. Goes from basic to OLTP workloads.
Capabilities Overview
Storage Pools support ReFS
2 * JBODs. We create a single storage pool to aggregate all 48 disks (2 * 24 in this example). We create 1 * 2-way mirror space and 1 * parity space (a rough PowerShell sketch follows the list below).
- Flexible resilient storage spaces.
- Native data striping maximizes performance
- Enclosure awareness with certified hardware.
- Data Integrity Scanner (aka “scrubber”) with NTFS and ReFS
- Continuous Availability with Windows Server failover clustering – SOFS
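A minimal PowerShell sketch of that pool/space layout, assuming my own friendly names and sizes (none of these values are from the session):

# Find all poolable disks presented by the two JBODs
$disks = Get-PhysicalDisk -CanPool $true
# One pool aggregating all 48 disks (assumes a single Storage Spaces subsystem)
$subsys = Get-StorageSubSystem
New-StoragePool -FriendlyName "Pool1" -StorageSubSystemFriendlyName $subsys.FriendlyName -PhysicalDisks $disks
# One 2-way mirror space and one parity space carved from the same pool
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Mirror1" -ResiliencySettingName Mirror -NumberOfDataCopies 2 -Size 10TB
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Parity1" -ResiliencySettingName Parity -Size 20TB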
Data is spread around the disks in the storage pool. This parallelizes the rebuild process.
8 * 3 TB disk test bed. They test the failure of a disk. It can rebuild in 50 minutes, with > 800 MB/s rebuild throughput. The line is that a hot spare is no longer necessary in WS2012 R2. Hmm. Must look into that.
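Not from the demo, but a quick way to watch that parallelized repair from PowerShell (the virtual disk name is hypothetical; the repair normally kicks off automatically when a disk is retired):

# Check health after a disk failure, then watch the repair job run across the remaining disks in the pool
Get-VirtualDisk | Select-Object FriendlyName, OperationalStatus, HealthStatus
Repair-VirtualDisk -FriendlyName "Mirror1"
Get-StorageJob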
Scale-Out Example
Note: CSV scales out linearly
Match workload characteristics to drives
- Capacity optimized drives have lower performance. Higher TB/$
- High performance drives have lower capacity/host. Higher IOPS/$
Can we seamlessly merge these?
Tiered Storage Spaces
A single virtual disk can use the best of both types of disk. High capacity for colder slices of data. High speed for hotter slices of data.
The most compelling ratio appears to be 4 to 12 SSDs in a 60 slot device, with the rest of the disks being HDDs.
In the background, the file system actively measures the activity of file slices. Transparently moves hot slices to the SSD tier, and cold slices to the HDD tier.
Tiering (analysis and movement) is done daily. The schedule is configurable (change the time, or run it more often than daily). The slices are 1 MB in size, so both activity tracking and tiering operate on 1 MB slices.
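As I understand it, the daily run is just a scheduled task, so reconfiguring it would look something like this (the task name and path are my understanding of the shipped task; verify on your build):

# View the built-in tiering optimization task
Get-ScheduledTask -TaskPath "\Microsoft\Windows\Storage Tiers Management\" -TaskName "Storage Tiers Optimization"
# Example: move the daily run to 01:30
Set-ScheduledTask -TaskPath "\Microsoft\Windows\Storage Tiers Management\" -TaskName "Storage Tiers Optimization" -Trigger (New-ScheduledTaskTrigger -Daily -At 1:30am)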
Administrators can pin entire files to specified tiers. Example, move a VDI parent VHDX to the SSD tier.
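That pinning would look something like this (the file path and tier name are made up for illustration):

# Pin a VDI parent disk to the SSD tier, then run the optimization pass to actually move its slices
Set-FileStorageTier -FilePath "E:\VDI\GoldImage.vhdx" -DesiredStorageTierFriendlyName "SSD_Tier"
Optimize-Volume -DriveLetter E -TierOptimize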
Write-Back Cache
Per virtual disk, persistent write cache. It smooths out write bursts to a virtual disk. It uses the SSD capacity of the pool for increased IOPS capacity. It is configurable using PowerShell. Great for Hyper-V, which needs write-through instead of a battery-powered write cache.
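The WBC is set per virtual disk at creation time. A hedged example (the names and the 1 GB size are my own, not guidance from the session):

New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "HyperV1" -ResiliencySettingName Mirror -NumberOfDataCopies 2 -UseMaximumSize -WriteCacheSize 1GB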
PowerShell Demo
Get-PhysicalDisk to list the possible disks to use (check the “CanPool” attribute).
$disks = Get-PhysicalDisk
New-StoragePool …. $disks
Get-StoragePool to see the pool and its disks. Look at the FriendlyName and MediaType attributes.
$SSD_Tier = New-StorageTier … list the SSDs
$HDD_Tier = New-StorageTier … list the HDDs
$vd1 = New-VirtualDisk …. -StorageTiers @($ssd_tier, $hdd_tier) -StorageTierSizes @(150GB, 1.7TB) ….
Now we have a drive with automated scheduled storage tiering.
He pins some files using Set-FileStorageTier.
Optimize-Volume -DriveLetter E -TierOptimize ….. this will force the maintenance task to run and move slices.
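Putting the demo together end to end, roughly. The parameters he skipped over (the “….” bits) are my guesses at sensible values, not his exact commands:

# 1. Find poolable disks and create the pool (assumes a single Storage Spaces subsystem)
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "TieredPool" -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName -PhysicalDisks $disks
# 2. Define the two tiers by media type
$SSD_Tier = New-StorageTier -StoragePoolFriendlyName "TieredPool" -FriendlyName "SSD_Tier" -MediaType SSD
$HDD_Tier = New-StorageTier -StoragePoolFriendlyName "TieredPool" -FriendlyName "HDD_Tier" -MediaType HDD
# 3. Create a tiered mirror space: 150 GB of SSD tier and 1.7 TB of HDD tier
$vd1 = New-VirtualDisk -StoragePoolFriendlyName "TieredPool" -FriendlyName "TieredSpace" -ResiliencySettingName Mirror -NumberOfDataCopies 2 -StorageTiers @($SSD_Tier, $HDD_Tier) -StorageTierSizes @(150GB, 1.7TB)
# 4. Initialize, partition, and format it as E:
$vd1 | Get-Disk | Initialize-Disk -PassThru | New-Partition -DriveLetter E -UseMaximumSize | Format-Volume -FileSystem NTFS -Confirm:$false
# 5. Force the tiering maintenance task to run now
Optimize-Volume -DriveLetter E -TierOptimize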
Demo: Write-Back Cache
He increases the write workload to the disk. There’s a quick spike and then the SSD takes over. He increases it again and again, and the write-back cache absorbs the spikes.
Question: How many tiers are supported in WS2012 R2? 2. But the architecture will allow MSFT to increase this in later releases if required.
Right now, there are certified clustered Storage Spaces solutions from:
- DataOn
- RAID Incorporated
- Fujitsu
Takeaways
- WS2012 R2 is a key component in the cloud: cost efficient
- Scalable data access: capacity and performance
- Continuously available
- Manageable from Server Manager, PoSH, and SCVMM (including SOFS bare metal deployment from a template).
Q&A
No docs on sizing the Write-Back Cache. They don’t want the WBC to be too large. Up to 10 GB is being recommended right now. You can reconfigure the size of the WBC after the fact, so monitor it and change it as required.
On 15K disks: expensive and small. It makes sense to consider SSD + 7.2K disks in a storage pool rather than SSD + 15K disks in a storage pool.
He can’t say it, but tier 1 manufacturers are scared *hitless of Storage Spaces. I also hear one of them is telling porky pies to people on the Expo floor re the optimization phase of Storage Spaces, e.g. saying it is manual.
Is there support for hot spares? Yes, in WS2012 and R2. But now MSFT is saying you should use spare capacity in the pool with parallelized repair across all disks in the pool, rather than having a single repair point.
DeFrag is still important for contiguous data access.
If I have a file on the SSD tier and the tier is full, writes will continue OK on the lower tier. The ReFS integrity stream mechanism can find the best placement for a block. This is integrated with tiered storage spaces.
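For reference, ReFS integrity streams can be checked and toggled per file from PowerShell; a small example against a hypothetical ReFS volume and file:

# Check and enable integrity streams on a file
Get-FileIntegrity -FileName "R:\VMs\Data.vhdx"
Set-FileIntegrity -FileName "R:\VMs\Data.vhdx" -Enable $true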
On adding physical disks to the storage pool: old data is not moved, so there is instant availability. New writes are sent to the new disks.
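Adding disks to an existing pool is a one-liner (the pool name is my assumption):

# New capacity is available immediately; existing data is not redistributed
Add-PhysicalDisk -StoragePoolFriendlyName "Pool1" -PhysicalDisks (Get-PhysicalDisk -CanPool $true)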
A feature called the dirty region table protects the storage space against corruption caused by power loss.
Should hard drive caches be turned off? For performance, turn them on. For resilience, turn them off. Note that a cluster will bypass the disk cache with write-through.
There is some level of failure prediction. There are PoSH modules for detecting issues, e.g. higher than normal block failure rates, or disks that are slower than similar neighbours.
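I assume he means things like the per-disk storage reliability counters, e.g.:

# Per-disk error and latency counters - handy for spotting disks with unusual error rates or that are slower than their neighbours
Get-PhysicalDisk | Get-StorageReliabilityCounter | Select-Object ReadErrorsTotal, WriteErrorsTotal, ReadLatencyMax, WriteLatencyMax, Temperature, Wear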
Ah, the usual question: can the disks in a storage space span data centers? The members of a storage pool must be connected to all nodes in a SOFS via SAS, which makes that impossible. Instead, have two different host/storage blocks in two sites, and use Hyper-V Replica to replicate the VMs.
Virtual Disk Deployment Recommendations
When to use Mirror, Parity, or Simple virtual disks in a storage space?
A storage space will automatically repair itself when a drive fails – and then it becomes resilient again. That’s quick thanks to parallelized repair.
Personal Comment
Love hearing a person talk who clearly knows their stuff and is very clear in their presentation.
—
Holy crap, I have over a mile to walk to get to the next storage session! I have to get out before the Q&A ends.
Thanks for this Aidan, some exciting stuff in there!
This post is to the point. I have enjoyed and learned a lot from your posts, from clustering to storage spaces. I am in a bit of a pickle right now and wondering if you can shed some light on it. We are testing a 2-way mirror. Our current setup is 3 JBODs; the 3rd JBOD has 2 drives that were manually selected for another volume. The rest of the drives across the remaining 2 JBODs were made part of the 2-way mirror (the drive configuration is the same on both JBODs). To test resiliency, we turn off one of the JBOD units and the virtual disk shows up as degraded, which is fine. So we bring the unit back online and do a repair of the virtual disk, but that has no effect. What are we doing wrong, and what is the best way to test the resiliency of the 2-way mirror with our setup? Your help/guidance is much appreciated.