Here I am, working on a Sunday (when I wrote this post). It’s not so bad; it’s raining outside, so that rules out going for a walk or doing some photography. I jumped onto Twitter and saw someone moaning that they had to work on a Sunday to patch their Hyper-V cluster. To me that’s a WTF! moment.
Windows Server 2012 Failover Clustering gives us Cluster Aware Updating (CAU). Using this you can patch a Hyper-V cluster without getting manually involved in “maintenance modes” and Live Migration. The process will:
- Download updates from Microsoft, WSUS, etc., or a file share, to the hosts (and this is expandable to 3rd party updates, such as those from OEMs).
- Put host 1 into maintenance mode – that drains it of virtual machines using Live Migration and … Quick Migration (for VMs marked as LOW priority, by default, which I DO NOT agree with). You can make it 100% Live Migration so no services suffer an outage during the moves. The more bandwidth your Live Migration network has, the faster this will be – using 1 Gbps networking for 512 GB RAM hosts is stupid!
- Patch and reboot host 1
- Wait for host 1 to come back online
- Bring host 1 out of maintenance mode
- Repeat steps 2-5 for each host
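To put some numbers on that Live Migration bandwidth point, here is a back-of-the-envelope sketch. It is not a measurement. It assumes the host's full 512 GB of RAM is in use by VMs, that the link runs at its full rated speed, and that no memory pages need to be re-copied while the VMs keep running, so treat the result as a best-case floor. The function name `drain_minutes` and the `efficiency` parameter are my own, for illustration only.

```python
def drain_minutes(ram_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Best-case minutes to copy ram_gb of VM memory over a link_gbps network.

    Assumptions (mine, not measured): decimal units, a single link,
    and no re-sending of memory pages that change during the move.
    `efficiency` is a hypothetical fudge factor (1.0 = full line rate).
    """
    gigabits = ram_gb * 8                       # 1 byte = 8 bits
    seconds = gigabits / (link_gbps * efficiency)
    return seconds / 60


# Compare a few Live Migration network speeds for a 512 GB host.
for link in (1, 10, 40):
    print(f"{link:>2} Gbps: {drain_minutes(512, link):6.1f} minutes")
```

Even in this best case, draining a full 512 GB host over 1 Gbps takes over an hour per host, versus under 7 minutes at 10 Gbps. Multiply that by the number of hosts in the cluster and you can see where a patching run goes to die.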
CAU orchestrates the entire process. All you’ve got to do is make it happen:
- You can manually invoke CAU from a Failover Cluster Manager console not running on a cluster member
- You can set up a special CAU role on the cluster with a patching schedule – it’s a clustered role so it will move just like the VMs
And the process is customizable, e.g. don’t proceed/continue if Y hosts are offline.
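If it helps to see the logic, here is a toy simulation of the rolling process and the "stop if Y hosts are offline" rule. To be clear, this is not CAU and it does not talk to a real cluster; it is just a minimal Python sketch of the idea. The `Host` class, the `rolling_update` function, the `stop_if_offline` parameter, and the `patch` callback are all names I made up for illustration. In this toy version the VMs also stay wherever they land after a drain.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)
    online: bool = True
    patched: bool = False


def rolling_update(hosts, stop_if_offline=1, patch=lambda host: True):
    """Patch hosts one at a time and return a log of what happened.

    stop_if_offline: stop the run once this many hosts are offline.
    patch: hypothetical callback; returns True if the host came back
           online after patching and rebooting.
    """
    log = []
    for host in hosts:
        offline = [h for h in hosts if not h.online]
        if len(offline) >= stop_if_offline:
            log.append(f"stop: {len(offline)} host(s) offline")
            break
        if not host.online:
            log.append(f"skip {host.name}: already offline")
            continue
        targets = [h for h in hosts if h is not host and h.online]
        if host.vms and not targets:
            log.append(f"stop: nowhere to move VMs from {host.name}")
            break
        # "Maintenance mode": move each VM to the least loaded online host.
        while host.vms:
            target = min(targets, key=lambda h: len(h.vms))
            target.vms.append(host.vms.pop())
        log.append(f"drained {host.name}")
        # Patch and reboot, then see whether the host came back.
        host.online = patch(host)
        host.patched = host.online
        if host.online:
            log.append(f"patched {host.name}")
        else:
            log.append(f"{host.name} did not come back")
    return log


cluster = [
    Host("host1", ["vm1", "vm2"]),
    Host("host2", ["vm3"]),
    Host("host3", ["vm4"]),
]
for line in rolling_update(cluster):
    print(line)
```

The point of the threshold is that the run halts by itself when a host fails to return, instead of happily draining the next one and leaving you with even less capacity.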
So … let me ask you a question. If your VMs are moving around using Live Migration, and their services never go offline … why do you need a maintenance window? Why exactly do you want to be a sad bastard like me and work on a Sunday?
Me, I think I’d do my host patching on a Wednesday morning, at around 11am, in a typical business. Why? A few reasons:
- Live Migration keeps services online so the business should not notice.
- I’m “in” the office already. If something does go wrong, I am not getting a call at 3am or at the weekend. I’m sober, awake (as much as I will be, anyway), and able to respond immediately.
- Any support services will have their primary staff available. If I do need to call someone for hardware or software support, they are online, and I’m not dealing with the red-eye team at 3am on a Sunday morning.
- I can monitor for exceptions quite happily.
- The business doesn’t need to pay me overtime or give me time-in-lieu.
- Peak business in IT is at either end of the week (“password reset Monday” and “I didn’t want to bother you” Friday afternoons) so Wednesday seems like a nice balance.
So yeah, I do think that CAU should kill the Hyper-V cluster patching window.
Edit 1:
The same person was on Twitter many hours later, complaining that patching Hyper-V took them “11 hours”. Really!?!?! Hmm, I think if that was me I’d be asking what I was doing wrong. Just sayin’ is all …
You can learn more about Windows Server 2012 Hyper-V from the book, Windows Server 2012 Hyper-V Installation And Configuration Guide: