The Effects Of WS2012 R2 Storage Spaces Write-Back Cache

In this post I want to show you the amazing effect that Write-Back Cache can have on the write performance of Windows Server 2012 R2 Storage Spaces.  But before I do, let’s fill in some gaps.

Background on Storage Spaces Write-Back Cache

Hyper-V, like many other applications and services, does something called write-through.  In other words, it bypasses the write caches of your physical storage.  This is to avoid corruption.  Keep this in mind while I move on.

In WS2012 R2, Storage Spaces introduces tiered storage.  This allows us to mix one tier of HDD (giving us bulk capacity) with one tier of SSD (giving us performance).  Normally a heat map process runs at 1am (a scheduled task, and therefore customisable) and moves 1 MB slices of files to the hot SSD tier or to the cold HDD tier, based on demand.  You can also pin entire files (maybe a VDI golden image) to the hot tier.
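
If you want to force the issue, you can pin a file to a tier and kick off the optimisation job yourself from PowerShell.  This is only a sketch: the CSV path and the tier friendly name (SSDTier) are assumptions for illustration, and the task name/path is what I have seen on WS2012 R2, so check Task Scheduler on your own system:

  # Pin a file (e.g. a VDI golden image) to the SSD tier of its tiered virtual disk
  Set-FileStorageTier -FilePath "C:\ClusterStorage\CSV1\GoldImage.vhdx" -DesiredStorageTierFriendlyName "SSDTier"

  # Run tier optimisation now instead of waiting for the nightly scheduled task
  Start-ScheduledTask -TaskPath "\Microsoft\Windows\Storage Tiers Management\" -TaskName "Storage Tiers Optimization"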

In addition, WS2012 R2 gives us something called Write-Back Cache (WBC).  Think about this … SSD gives us really fast write speeds.  Write caches are there to improve write performance.  Yet some applications use write-through to avoid storage caches, because they need the acknowledgement to mean that the write really went to disk.

What if abnormal increases in write behaviour led to the virtual disk (a LUN in Storage Spaces) using its allocated SSD tier to absorb that spike, and then demoting the data to the HDD tier later on if the slices are measured as cold?

That’s exactly what WBC, a feature of Storage Spaces with tiered storage, does.  A Storage Spaces tiered virtual disk will use the SSD tier to accommodate extra write activity.  The SSD tier increases the available write capacity until the spike decreases and things go back to normal.  We get the effect of a write cache, but write-through still happens because the write really is committed to disk rather than sitting in the RAM of a controller.

Putting Storage Spaces Write-Back Cache To The Test

What does this look like?  I set up a Scale-Out File Server that uses a DataOn DNS-1640D JBOD.  The 2 SOFS cluster nodes are each attached to the JBOD via dual port LSI 6 Gbps SAS adapters.  In the JBOD there is a tier of 2 * STEC SSDs (4-8 SSDs is a recommended starting point for a production SSD tier) and a tier of 8 * Seagate 10K HDDs.  I created 2 * 2-way mirrored virtual disks in the clustered Storage Space:

  • CSV1: 50 GB SSD tier + 150 GB HDD tier with 5 GB write cache size (WBC enabled)
  • CSV2: 200 GB HDD tier with no write cache (no WBC)

Note: I have 2 SSDs (sub-optimal starting point but it’s a lab and SSDs are expensive) so CSV1 has 1 column.  CSV2 has 4 columns.
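
If you want to recreate something similar, the two virtual disks can be built with PowerShell along these lines.  This is only a sketch: the pool name (Pool1) and the tier friendly names are assumptions, and the sizes match the lab configuration above:

  # Assumed pool name and tier definitions for this sketch
  $pool = "Pool1"
  $ssdTier = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName "SSDTier" -MediaType SSD
  $hddTier = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName "HDDTier" -MediaType HDD

  # CSV1: tiered 2-way mirror (50 GB SSD + 150 GB HDD) with a 5 GB Write-Back Cache
  New-VirtualDisk -StoragePoolFriendlyName $pool -FriendlyName "CSV1" -ResiliencySettingName Mirror -StorageTiers $ssdTier, $hddTier -StorageTierSizes 50GB, 150GB -WriteCacheSize 5GB

  # CSV2: HDD-only 2-way mirror (200 GB, 4 columns) with the Write-Back Cache disabled
  New-VirtualDisk -StoragePoolFriendlyName $pool -FriendlyName "CSV2" -ResiliencySettingName Mirror -StorageTiers $hddTier -StorageTierSizes 200GB -NumberOfColumns 4 -WriteCacheSize 0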

Each virtual disk was converted into a CSV: CSV1 and CSV2.  A share was created on each CSV, shared as \\Demo-SOFS1\CSV1 and \\Demo-SOFS1\CSV2.  Yeah, I like naming consistency 🙂

Then I logged into a Hyper-V host where I have installed SQLIO.  I configured a couple of param files, one to use the WBC-enabled share and the other to use the WBC-disabled share:

  • Param1.TXT: \\demo-sofs1\CSV1\testfile.dat 32 0x0 1024
  • Param2.TXT: \\demo-sofs1\CSV2\testfile.dat 32 0x0 1024
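
For what it’s worth, each line in a SQLIO param file is the target file path, the number of threads, a CPU affinity mask, and the test file size in MB.  The two files could be generated like this (same paths as above, and the SQLIO install folder is assumed):

  # Each param file line: <path> <threads> <affinity mask> <file size in MB>
  Set-Content -Path "C:\Program Files (x86)\SQLIO\param1.txt" -Value "\\demo-sofs1\CSV1\testfile.dat 32 0x0 1024"
  Set-Content -Path "C:\Program Files (x86)\SQLIO\param2.txt" -Value "\\demo-sofs1\CSV2\testfile.dat 32 0x0 1024"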

I pre-expanded the test files that would be created in each share by running:

  • "C:Program Files (x86)SQLIOsqlio.exe" -kW -s5 -fsequential -o4 –b64 -F"C:Program Files (x86)SQLIOparam1.txt"
  • "C:Program Files (x86)SQLIOsqlio.exe" -kW -s5 -fsequential -o4 -b64 -F"C:Program Files (x86)SQLIOparam2.txt"

And then I ran a script that ran SQLIO with the following flags to write random 64 KB blocks (similar to VHDX) for 30 seconds:

  • "C:Program Files (x86)SQLIOsqlio.exe" -BS -kW -frandom -t1 -o1 -s30 -b64 -F"C:Program Files (x86)SQLIOparam1.txt"
  • "C:Program Files (x86)SQLIOsqlio.exe" -BS -kW -frandom -t1 -o1 -s30 -b64 -F"C:Program Files (x86)SQLIOparam2.txt"

That gave me my results:

[Image: SQLIO results for both shares]

To summarise the results:

The WBC-enabled share ran at:

  • 2258.60 IOs/second
  • 141.16 Megabytes/second

The WBC-disabled share ran at:

  • 197.46 IOs/second
  • 12.34 Megabytes/second

Storage Spaces Write-Back Cache enabled the share on CSV1 to run 11.44 times faster than the non-enhanced share!!!  Everyone’s mileage will vary depending on the number of SSDs versus HDDs, the assigned cache size per virtual disk, the speed of the SSDs and HDDs, the number of columns per virtual disk, and your network.  But one thing is for sure: with just a few SSDs, I can efficiently cater for brief spikes in write operations by the services that I am storing on my storage pool.

Credit: I got help on SQLIO from this blog post on MS SQL Tips by Andy Novick (MVP, SQL Server).

40 thoughts on “The Effects Of WS2012 R2 Storage Spaces Write-Back Cache”

  1. This appears to be applicable only to larger installations.
    Can it be used for sites with only one or two Hyper-V host servers (and no NAS)?
    I.e., can a single stand-alone Hyper-V server use an SSD drive for write-back cache?

    1. WBC is a Storage Spaces feature. You must deploy Storage Spaces on supported h/w and deploy a sufficient number of disks. So: no.

  2. Hi Aidan

    Thanks for providing all the good information about Storage Spaces! I have learned a lot on your page!

    I have a question regarding your speed measurements… can that really be true? While increasing the speed by a factor of 10 is nice, to me it still looks like very poor performance.

    12.34 MB/s on a system with 8 x 10k SAS disks? I almost cannot believe this.
    The hardware should be able to write much faster, or not?

    Would be great to have some clarification from your side.

    Thanks

  3. I agree with Severin. Why is performance so bad to begin with? It almost seems like this technology is a halfway fix for a problem it created itself.

  4. Remember that you are dealing with RANDOM IO.

    Most HDD benchmarks focus on SEQUENTIAL read/write tests. Random read/write tests KILL throughput.

  5. Question: your tiered storage was 25% SSD. What if you allocated something closer to 10% SSD with WBC? How would performance compare? It seems to me there are so many components that improve performance that you could scale the SSD even better than a 1:4 ratio?

    1. Write-Back Cache is 1 GB by default if you have a tiered storage pool. MSFT recommends not changing that – this is a temporary write buffer (with committed writes to disk).

      1. Sorry, my question wasn’t clearly written. I’m not asking about changing the WBC. I’m wondering if performance would change much if the storage mix was changed to a smaller percentage of SSD, while keeping the same WBC that you already have? Thanks!!

        1. Ah – Having a smaller portion of SSD will impact the performance of a tiered storage virtual disk, obviously, because you would have more data living on the cold tier. But that depends … maybe you have a small “working set” of data so only a small amount of data needs to live on SSD. As for WBC performance – your mileage will vary … in the real world there will be contention for IOPS. SSD has a LOT of IOPS … it’s all a big “it depends” on your h/w, the data, the nature of the data usage, and your workloads … just like it would in a SAN.

  6. I’m interested to see how SSD tiering improves the poor performance of parity spaces. Is there a reason you tested a 2-way mirror instead of a parity space?

    1. Yes; parity spaces are only approved for archive workloads. Use 2-way or 3-way mirrors to store VMs.

  7. Hey, thanks for the blog!
    Sorry if I misunderstood this, but is it possible to enable WBC with only one SSD? For example, if you have, say, 4 HDDs and 1 SSD and you put them together in a parity pool.

    Will it work or do you have to use 2 SSDs with 4 HDDs?

    I heard different stories about this.

    1. No idea. But Parity is intended for archive workloads only. If you want to run VMs then you run 2-way or 3-way mirror.

  8. If Write-Back Cache is enabled and one of the cluster nodes writes data to the cache and then suffers a hardware failure, will the surviving cluster node be able to retrieve the data from the write-back cache?

  9. Can you please let me know the performance impact on VMs running on Hyper-V, stored on SMB 3.0 on a Storage Space? Specifically, I am concerned about this scenario: let’s say a client Hyper-V VHDX file is 2 TB in size, but I only have, say, 240 GB of SSD in my tiered storage pool. Does it still provide any benefit, or is it not able to work with the monolithic (and larger than the cache pool) Hyper-V VHDX file?

    Thanks very much.

  10. I’m curious if you’ve been able to find information on how frequently the write-back cache is emptied?
    The only gauge I’ve been able to find comes from a Fujitsu review (http://globalsp.ts.fujitsu.com/dmsp/Publications/public/wp-windows-storage-spaces-r2-performance-ww-en.pdf) which states “As of a specific filling degree (usually between 25% and 50%) Storage Spaces starts to empty the cache by moving the data to the hard disks”

    A 1GB WBC that hasn’t paged out 50% of its data only leaves 500MB of available cache.
    I haven’t been able to find a way to tune this filling/page-out rate, unfortunately.
    It would be nice to have some control over this to page out earlier or later!

  11. I’m setting up a 3-JBOD system with just 12 x 10k drives initially. I was going to configure these as a 6-column 2-way mirror (4 drives in each JBOD), but what would be the minimum number of SSDs I would need to enable an SSD write-back cache?

    I don’t think I’m that bothered about full SSD tiered storage (mainly due to the cost!) – but do you think the write cache would help smooth out / improve write performance?

    Also, as the cache defaults to 1 GB, it seems a waste to use large SSD drives. Are there any smaller / cheaper SAS SSDs you could recommend?

    Thanks for all your info and articles btw, they’ve all come in VERY useful! 🙂

  12. Isn’t this an apples to oranges comparison? If you wanted to test the benefits of WBC only, shouldn’t you have both CSVs the same but one with WBC disabled?
    Or is that setup impossible?

    CSV1: 50 GB SSD tier + 150 GB HDD tier with 5 GB write cache size (WBC enabled)
    CSV2: 50 GB SSD tier + 150 GB HDD tier with 0 GB write cache size (No WBC)

  13. Is it necessary to create a tiered space first to enable Write-Back Cache? Is this cache available only for a particular space/pool? It would be nice to have a separate pool consisting of SSD disks only, acting as the “central” WBC for all the other pools. Is there a possibility for this scenario? THX

    1. You need SSD and HDD in your pool to create WBC. Your virtual disk does not need to be tiered.
      There is no benefit to the scenario you’ve discussed. Consider that WBC is recommended to be 1 GB only per virtual disk.
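      If you want to confirm what a virtual disk actually got, you can check its WriteCacheSize property from PowerShell; a quick sketch:

        # Show the write-back cache size each virtual disk ended up with
        Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, WriteCacheSize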

  14. Windows 8.1 does support Write-Back Cache. For example, create a storage pool called “HybridPool” with 4 x 1 TB drives and 2 x 120 GB SSDs. Then run the following command:

    New-VirtualDisk -StoragePoolFriendlyName "HybridPool" -FriendlyName "Parity Protected" -ResiliencySettingName Parity -NumberOfColumns 4 -WriteCacheSize 100GB -UseMaximumSize

    The new parity space has great performance and ~3 TB of fault-tolerant space.

  15. So… 1 GB per virtual disk… what does Storage Spaces do with Unidesk, where each VM is easily composed of a dozen or more VHDXs? Does that mean 1 GB of WBC per virtual disk? I feel like this combo could do amazing things for MS VDI: scalable, performant, affordable… amazing VDI.

  16. Hi Aidan,
    we use Windows Storage Spaces connected to multiple Linux and Windows servers and it works fine. We use it mainly over iSCSI.
    But I don’t understand the maximum WBC size that I can set on a vdisk.
    For now we have one pool with 2 vdisks: one Parity with a 90 GB WBC for storing backup data, and another in tiered Mirror with an 11 GB WBC.

    My question is: what is the maximum WBC size that I can set?
    Is there a maximum WBC size per pool?

    Thanks

    Mario

      1. Hi, great article!
        I know it’s not recommended, but I’m using a Dell MD1200 with “SATA” SSDs (Crucial M4 960 GB). When I create a single volume directly from the physical disk, I get typical SATA 3 performance for this class of SSD (around 170 MB/s in R/W 4K tests at a queue length of 32).
        But when I set up a storage pool with just this one SSD, and then create a vDisk with the default WBC values, my performance drops to 170 MB/s for reads and 2 (!!!) MB/s for writes.
        I am now testing with the WBC forced to 50 GB and I see greatly improved performance (around 160 MB/s in R/W), but since that’s not recommended by MSFT I am looking for explanations/reasons why my performance is so bad with the 1 GB default WBC…?
        AND, what happens if I set the WBC to… 1 TB? I have 6 SSDs of 960 GB and 6 HDDs of 4 TB, in a 3-column 2-way mirror.
        Thanks for your feedback 🙂

        1. I have a similar situation. I’ve assembled a tiered Storage Space with 8 x Seagate Pulsar 100 GB SSDs + 16 x Hitachi 2 TB 7.2k HDDs (all SAS), and created 4 CSV volumes, each with 4 columns and a 5 GB WBC. Read performance is good but random write performance is terribly low (0.5 to 2 MB/s), as reported by CrystalDiskMark. For now I am trying to investigate this, because such performance is nonsense even in comparison with a 4 x 7.2k HDD RAID without WBC.

          1. A) Use DiskSpd to test.
            B) The performance of a storage space is dictated by (1) resiliency [don’t use parity for VMs], (2) the hardware [there are only 2-3 reliable JBODs, and so far, only HGST appear to make reliable SSDs], and (3) your design.
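            For example, a DiskSpd run roughly equivalent to the SQLIO test in the post (random 64 KB writes, 1 thread, 1 outstanding IO, 30 seconds) could look something like the line below; the target path is only a placeholder:

              # -d30 = 30 seconds, -w100 = 100% writes, -r = random, -b64K = 64 KB blocks, -c1G = create a 1 GB test file
              diskspd.exe -d30 -w100 -r -b64K -o1 -t1 -c1G \\demo-sofs1\CSV1\testfile.dat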

  17. Hi Aidan,
    thanks for this article, and sorry to bring it up again! In 2011 you wrote an article about not using multiple CSVs or LUNs, as there was no performance benefit in a Hyper-V scenario. Does this article change your mind about having multiple CSVs based on corresponding vdisks (LUNs)? Would it now be best to utilise the 1 GB of WBC per vdisk and have multiple vdisks, adding those as CSVs for storing individual VHD files?
    Hope that makes sense, appreciate your input.
