“What’s the rule of thumb on the number of VMs you should put in a CSV?” That’s a question I am asked on a regular basis. We need to dig into this.
When you have a cluster of virtualisation hosts using shared storage systems, you need some sort of orchestration to say which host should access what folders and files. That’s particularly important during Live Migration and failover. Without orchestration you’d have chaos, locks, failed VMs, and corruption.
One virtualisation cluster file system out there does its orchestration in the file system itself. That, in theory, places limits on how far that file system can scale out.
Microsoft took a different approach. Instead, each Cluster Shared Volume (CSV) has an orchestrator known as the CSV Coordinator, which is created automatically and made fault tolerant. The CSV Coordinator is a highly available function that runs on one of the clustered hosts. By not relying on the file system itself, Microsoft believes it has a more scalable and better-performing option.
How scalable? A few years ago, EMC (I believe it was EMC, the owner of VMware, but my memory could be failing me) stood on a stage at a Microsoft conference and proclaimed that they couldn’t find a scalability limit for CSV performance on their storage platform. In other words, you could have a monstrous CSV and place lots and lots of 64 TB VHDX files on there (GPT volumes grow up to 16 EB).
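To put those figures in perspective, here’s a quick back-of-the-envelope calculation using the theoretical limits mentioned above (16 EB for a GPT volume, 64 TB for a VHDX file) — a sketch only, since real-world NTFS and SAN limits will bite long before this:

```python
# Back-of-the-envelope capacity maths for a single CSV, using the
# theoretical ceilings from the article: a GPT volume can grow to
# 16 EB, and a VHDX file tops out at 64 TB.

TB = 2 ** 40  # tebibyte in bytes
EB = 2 ** 60  # exbibyte in bytes

max_volume_bytes = 16 * EB  # theoretical GPT volume ceiling
max_vhdx_bytes = 64 * TB    # VHDX format limit

# How many maximum-size VHDX files would fit, ignoring all overhead?
vhdx_per_volume = max_volume_bytes // max_vhdx_bytes
print(vhdx_per_volume)  # 262144
```

That’s over a quarter of a million maximum-size virtual disks on one volume — in theory. The point isn’t that you’d ever do this; it’s that the file system isn’t the bottleneck.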
OK; back to the question at hand: how many VMs should I place on a CSV? I have to give you the consultant’s answer: that depends. The fact is that there is no right answer. This isn’t VMware, where there are prescribed limits and you should create lots and lots of “little” VMFS volumes.
First, I’d say you should read my paper on CSV and backup. Keep in mind that the paper was written for Windows Server 2008 R2; Windows Server 2012 no longer switches a CSV into redirected I/O mode when backing up VMs from it. In that document I talk about a process I put together for CSV design and VM placement.
Next, have a look at Fast Track, Microsoft’s cloud architecture. In there they have a CSV design where OS, page file, sequential, and non-sequential workloads are split into VHDs on different CSVs. To me, this complicates things greatly. I prefer simplicity. Plus, I can’t imagine the complexity of the deployment automation for this design.
An alternative is to look at a rule of thumb that many are using: they have 1 CSV for every host in their cluster (or active site in a multi-site cluster). Beware here: you don’t want to run out of SCSI-3 reservations (every SAN has an unadvertised limit) because you’ve added too many CSVs on your SAN (read the above paper to learn more).
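The rule of thumb and its caveat can be sketched as a toy calculation. The reservation limit below is entirely made up for illustration — every SAN has its own (often unadvertised) ceiling, so check with your storage vendor:

```python
# A toy sketch of the "1 CSV per host" rule of thumb, with a sanity
# check against a SAN's limit on how many CSV LUNs it can support.
# The limit value passed in is hypothetical; consult your vendor.

def plan_csvs(hosts: int, san_csv_limit: int) -> int:
    """Return the CSV count under the 1-CSV-per-host rule of thumb,
    raising if that would exceed the SAN's (unadvertised) limit."""
    csvs = hosts  # rule of thumb: one CSV per host in the cluster
    if csvs > san_csv_limit:
        raise ValueError(
            f"{csvs} CSVs would exceed the SAN's limit of {san_csv_limit}"
        )
    return csvs

print(plan_csvs(8, 32))  # 8-node cluster -> 8 CSVs, within limit
```

The real design work is in finding out what that limit actually is for your SAN before you carve up the storage, not in the arithmetic.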
My advice: keep it simple. Don’t overthink things. Remember, Hyper-V is not VMware and VMware is not Hyper-V. They might both be enterprise virtualisation platforms, but we do things differently on each platform because they work differently.