{"id":12844,"date":"2012-06-18T10:12:17","date_gmt":"2012-06-18T09:12:17","guid":{"rendered":"https:\/\/aidanfinn.com\/?p=12844"},"modified":"2012-06-18T10:12:17","modified_gmt":"2012-06-18T09:12:17","slug":"cluster-shared-volumes-reborn-in-ws2012-deep-dive","status":"publish","type":"post","link":"https:\/\/aidanfinn.com\/?p=12844","title":{"rendered":"Cluster Shared Volumes Reborn in WS2012: Deep Dive"},"content":{"rendered":"<p>Notes from TechEd North America 2012 session <a href=\"http:\/\/channel9.msdn.com\/Events\/TechEd\/NorthAmerica\/2012\/WSV430\" target=\"_blank\">WSV430<\/a>:<\/p>\n<p><a href=\"http:\/\/channel9.msdn.com\/Events\/TechEd\/NorthAmerica\/2012\/WSV430\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; margin-left: auto; border-top: 0px; margin-right: auto; border-right: 0px; padding-top: 0px\" title=\"image\" border=\"0\" alt=\"image\" src=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image24.png\" width=\"404\" height=\"184\" \/><\/a><\/p>\n<p><strong><span style=\"text-decoration: underline\">New in Windows Server 2012<\/span><\/strong><\/p>\n<ul>\n<li>File services is supported on CSV for application workloads.&#160; Can leverage SMB 3.0 and be used for transparent failover Scale-Out File Server (SOFS) <\/li>\n<li>Improved backup\/restore <\/li>\n<li>Improved performance with block level I\/O redirection <\/li>\n<li>Direct I\/O during backup <\/li>\n<li>CSV can be built on top of Storage Spaces <\/li>\n<\/ul>\n<p><strong><span style=\"text-decoration: underline\">New Architecture<\/span><\/strong><\/p>\n<ul>\n<li>Antivirus and backup filter drivers are now compatible with CSV.&#160; Many are already compatible. <\/li>\n<li>There is a new distributed application consistent backup infrastructure. 
<\/li>\n<li>ODX and spot fixing are supported <\/li>\n<li>BitLocker is supported on CSV <\/li>\n<li>AD is no longer a dependency (!?) for improved performance and resiliency. <\/li>\n<\/ul>\n<p><strong><span style=\"text-decoration: underline\">Metadata Operations<\/span><\/strong><\/p>\n<p>Lightweight and rapid.&#160; Relatively infrequent with VM workloads.&#160; Require redirected I\/O.&#160; Includes:<\/p>\n<ul>\n<li>VM creation\/deletion <\/li>\n<li>VM power on\/off <\/li>\n<li>VM mobility (live migration or storage live migration) <\/li>\n<li>Snapshot creation <\/li>\n<li>Extending a dynamic VHD <\/li>\n<li>Renaming a VHD <\/li>\n<\/ul>\n<p>Parallel metadata operations are non-disruptive.<\/p>\n<p><strong><span style=\"text-decoration: underline\">Flow of I\/O<\/span><\/strong><\/p>\n<ul>\n<li>For non-metadata IO: Data is sent to the CSV Proxy File System.&#160; It then routes to the disk via CSV VolumeMgr via direct IO. <\/li>\n<li>For metadata redirected IO (see above): We get SMB redirected IO on non-orchestrator (not the CSV coordinator\/owner for the CSV in question) nodes.&#160; Data is routed via SMB redirected IO by the CSV Proxy File System to the orchestrator via the cluster communications network so the orchestrator can handle the activity. 
<\/li>\n<\/ul>\n<p><a href=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image20.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px\" title=\"image\" border=\"0\" alt=\"image\" src=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image_thumb19.png\" width=\"354\" height=\"249\" \/><\/a><\/p>\n<p><strong><span style=\"text-decoration: underline\">Interesting Note<\/span><\/strong><\/p>\n<p>You can actually rename C:\\ClusterStorage\\Volume1 to something like C:\\ClusterStorage\\CSV1.&#160; That\u2019s supported by CSV.&#160; I wonder if things like System Center support this?<\/p>\n<p><strong><span style=\"text-decoration: underline\">Mount Points<\/span><\/strong><\/p>\n<ul>\n<li>Used custom reparse points in W2008 R2.&#160; That meant backup needed to understand these. <\/li>\n<li>Switched to standard Mount Points in WS2012. <\/li>\n<\/ul>\n<p>Improved interoperability with:<\/p>\n<ul>\n<li>Performance counters <\/li>\n<li>OpsMgr (never had free space monitoring before) <\/li>\n<li>Free space monitoring (speak of the devil!) <\/li>\n<li>Backup software can understand mount points. 
<\/li>\n<\/ul>\n<p><strong><span style=\"text-decoration: underline\">CSV Proxy File System<\/span><\/strong><\/p>\n<p>Appears as CSVFS instead of NTFS in disk management.&#160; NTFS under the hood.&#160; Enables applications and admins to be CSV aware.<\/p>\n<p><strong><span style=\"text-decoration: underline\">Setup<\/span><\/strong><\/p>\n<p>No opt-in any more.&#160; CSV enabled by default.&#160; Appears in normal storage node in FCM.&#160; Just right click on available storage to convert to CSV.<\/p>\n<p><strong><span style=\"text-decoration: underline\">Resiliency<\/span><\/strong><\/p>\n<p>CSV enables fault tolerant file handles.&#160; Storage path fault tolerance, e.g. HBA failure.&#160; When a VM opens a VHD, it gets a virtual file handle that is provided by CSVFS (metadata operation).&#160; The real file handle is opened under the covers by CSV.&#160; If the HBA that the host is using to connect the VM to the VHD fails, then the real file handle needs to be recreated.&#160; This new handle is mapped to the existing virtual file handle, and therefore the application (the VM) is unaware of the outage.&#160; We get transparent storage path fault tolerance.&#160; The fault tolerant SAN connectivity (remember that direct connection via HBA has failed and should have failed the VM\u2019s VHD connection) is re-routed by Redirected IO via the Orchestrator (CSV coordinator) which \u201cproxies\u201d the storage IO to the SAN.<\/p>\n<p><a href=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image21.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px\" title=\"image\" border=\"0\" alt=\"image\" src=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image_thumb20.png\" width=\"404\" height=\"232\" 
\/><\/a><\/p>\n<p>If the Coordinator node fails, IO is queued <em>briefly<\/em> and the orchestration role fails over to another node.&#160; No downtime in this <em>brief<\/em> window.<\/p>\n<p>If the private cluster network fails, the next available network is used \u2026 remember you should have at least 2 private networks in a CSV cluster \u2026 the second private network would be used in this case.<\/p>\n<p><strong><span style=\"text-decoration: underline\">Spot-Fix<\/span><\/strong><\/p>\n<ul>\n<li>Scanning is separated from disk repair.&#160; Scanning is done online. <\/li>\n<li>Spot-fixing takes the volume offline only to repair.&#160; The offline time is based on the number of errors to fix rather than the size of the volume \u2026 could be 3 seconds. <\/li>\n<li>This offline does not cause the CSV to go \u201coffline\u201d for applications (VMs) using that CSV being repaired.&#160; CSV proxy file system virtual file handles appear to be maintained. <\/li>\n<\/ul>\n<p>This should allow for much bigger CSVs without chkdsk concerns.<\/p>\n<p><strong><span style=\"text-decoration: underline\">CSV Block Cache<\/span><\/strong><\/p>\n<p>This is a distributed write-through cache.&#160; Unbuffered IO is targeted.&#160; That IO is excluded by the Windows Cache Manager (which handles buffered IO only).&#160; The CSV block cache is consistent across the cluster.<\/p>\n<p>This has very high value for the pooled VDI VM scenario.&#160; Read-only (differencing) parent VHD or read-write differencing VHDs.<\/p>\n<p>You configure the memory for the block cache on a cluster level.&#160; 512 MB per host appears to be the sweet spot.&#160; Then you enable CSV block cache on a per CSV basis \u2026 focus on the read-performance-important CSVs.<\/p>\n<p><strong><span style=\"text-decoration: underline\">Less Redirected IO<\/span><\/strong><\/p>\n<ul>\n<li>New algorithm for detecting type of redirected IO required <\/li>\n<li>Uses OpLocks as a distributed locking mechanism to determine if IO can go via direct path 
<\/li>\n<\/ul>\n<p>Comparing speeds:<\/p>\n<ul>\n<li>Direct IO: Block level IO performance parity <\/li>\n<li>Redirected IO: Remote file system (SMB 3.0)&#160; performance parity \u2026 can leverage multichannel and RDMA <\/li>\n<\/ul>\n<p><strong><span style=\"text-decoration: underline\">Block Level Redirection<\/span><\/strong><\/p>\n<p>This is new in WS2012 and provides a much faster redirected IO during storage path failure and redirection.&#160; It is still using SMB.&#160; Block level redirection goes directly to the storage subsystem and provides 2x disk performance.&#160; It bypasses the CSV subsystem on the coordinator node \u2013 SMB redirected IO (metadata) must go through this.<\/p>\n<p><a href=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image22.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px\" title=\"image\" border=\"0\" alt=\"image\" src=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image_thumb21.png\" width=\"404\" height=\"229\" \/><\/a><\/p>\n<p>You can speed up redirected IO using SMB 3.0 features such as Multichannel (many NICs and RSS on single NICs) and RDMA.&#160; With all the things turned on, you should get 98% of the performance of direct IO via SMB 3.0 redirected IO \u2013 I guess he\u2019s talking about Block Level Redirected IO.<\/p>\n<p><strong><span style=\"text-decoration: underline\">VM Density per CSV<\/span><\/strong><\/p>\n<ul>\n<li>Orchestration is done on a cluster node (parallelized) which is more scalable than file system orchestration. <\/li>\n<li>Therefore there are no limits placed on this by CSV, unlike in VMFS. <\/li>\n<li>How many IOPS can your storage handle, versus how many IOPS do your VMs need? 
<\/li>\n<li>Direct IO during backup also simplifies CSV design. <\/li>\n<\/ul>\n<p>If your array can handle it, you <em>could<\/em> (and probably won\u2019t) have 4,000 VMs on a 64 node cluster with a single CSV.<\/p>\n<p><strong><span style=\"text-decoration: underline\">CSV Backup and Restore Enhancements<\/span><\/strong><\/p>\n<ul>\n<li>Distributed snapshots: VSS based application consistency.&#160; Created across the cluster.&#160; Backup applications query the CSV to do an application consistent backup. <\/li>\n<li>Parallel backups can be done across a cluster: Can have one or more concurrent backups on a CSV.&#160; Can have one or more concurrent CSV backups on a single node. <\/li>\n<li>CSV ownership does not change.&#160; There is no longer a need for redirected IO during backup. <\/li>\n<li>Direct IO mode for software snapshots of the CSV \u2013 when there is no hardware VSS provider. <\/li>\n<li>Backup no longer needs to be CSV aware. <\/li>\n<\/ul>\n<p>Summary: We get a single application consistent backup snapshot of multiple VMs across many hosts using a single VSS snapshot of the CSV.&#160; The VSS provider is called on the \u201cbackup node\u201d &#8230; any node in the cluster.&#160; This is where the snapshot is created.&#160; Will result in less data being transmitted, fewer snapshots, quicker backups.<\/p>\n<p><strong><span style=\"text-decoration: underline\">How a CSV Backup Works in WS2012<\/span><\/strong><\/p>\n<ol>\n<li>Backup application talks to the VSS Service on the backup node <\/li>\n<li>The Hyper-V writer identifies the local VMs on the backup node <\/li>\n<li>Backup node CSV writer contacts the Hyper-V writer on the other hosts in the cluster to gather metadata of files being used by VMs on that CSV <\/li>\n<li>CSV Provider on backup node contacts Hyper-V Writer to quiesce the VMs <\/li>\n<li>Hyper-V Writer on the backup node also quiesces its own VMs <\/li>\n<li>VSS snapshot of the entire CSV is created <\/li>\n<li>The backup tool 
can then back up the CSV via the VSS snapshot <\/li>\n<\/ol>\n<p><a href=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image23.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; border-right-width: 0px; padding-left: 0px; padding-right: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto; padding-top: 0px\" title=\"image\" border=\"0\" alt=\"image\" src=\"https:\/\/aidanfinn.com\/wp-content\/uploads\/2012\/06\/image_thumb22.png\" width=\"404\" height=\"267\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Notes from TechEd North America 2012 session WSV430: New in Windows Server 2012 File services is supported on CSV for application workloads.&#160; Can leverage SMB 3.0 and be used for transparent failover Scale-Out File Server (SOFS) Improved backup\/restore Improved performance with block level I\/O redirection Direct I\/O during backup CSV can be built on top &hellip; <a href=\"https:\/\/aidanfinn.com\/?p=12844\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Cluster Shared Volumes Reborn in WS2012: Deep 
Dive&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[14],"tags":[63,181,99,195,118],"class_list":["post-12844","post","type-post","status-publish","format-standard","hentry","category-eventnotes","tag-failover-clustering","tag-hyper-v","tag-storage","tag-virtualisation","tag-windows-server-2012"],"aioseo_notices":[],"jetpack_featured_media_url":"","amp_enabled":true,"_links":{"self":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts\/12844","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=12844"}],"version-history":[{"count":0,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts\/12844\/revisions"}],"wp:attachment":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=12844"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=12844"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=12844"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}