Powered Down Virtual Machines on a Hyper-V Cluster

From time to time, I’ll be asked to power down virtual machines in our production environment.  I also run a test virtual machine on the cluster to test things like Live Migration after doing upgrade work.  Normally, I’d like to keep it powered down, just to save 512MB of RAM and the occasional CPU cycle.  But it seems to me that Microsoft does not want us to keep powered-down virtual machines on a cluster.

My first clue was in VMM.  VMM tries to protect the cluster reserve in a Hyper-V cluster.  In other words, VMM will change the status of the cluster object to a warning if you overcommit its resources.  For example, if you have 58GB of RAM available for VMs across your N+1, 3-node cluster, then it’ll complain when you assign more than 58GB of RAM to VMs.  One would assume that VMM would only count the running VMs.  However, I can confirm that it includes the RAM assigned to powered-down VMs as well.  I can understand this conservative approach … it’s the sort of thing a banker would do if they didn’t want to bankrupt their bank’s loan book ;-)  You have to allow for the scenario where the VM will be powered up.  Who’s to say that there isn’t a tester or developer at the other end of a Self-Service Portal, consuming their quota points, and eager to power up those VMs at any moment?
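If you want to see how VMM counts it, you can total up the RAM assigned to every VM versus just the running ones.  This is only a rough sketch: it assumes the VMM 2008 R2 PowerShell snap-in, that “vmmserver01” is a placeholder for your VMM server, and that the VM object exposes its assigned RAM in MB through its Memory property, so adjust for your version.

# Sketch: compare the RAM assigned to all VMs vs. only running VMs known to VMM.
# Assumes the VMM 2008 R2 snap-in; "vmmserver01" is a placeholder.
Add-PSSnapin Microsoft.SystemCenter.VirtualMachineManager
Get-VMMServer -ComputerName "vmmserver01" | Out-Null

$allVMs = Get-VM
$runningVMs = $allVMs | Where-Object { $_.Status -eq "Running" }

# The Memory property is the assigned RAM in MB
"All VMs (MB):     " + ($allVMs | Measure-Object -Property Memory -Sum).Sum
"Running VMs (MB): " + ($runningVMs | Measure-Object -Property Memory -Sum).Sum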

The next clue is in OpsMgr.  I’ve imported the Microsoft Windows Cluster management pack.  A highly available virtual machine is a resource from clustering’s point of view.  Surely you deployed it on a cluster (as a highly available virtual machine) for a reason?  Shouldn’t it be running?  That’s how the management pack sees it.  An object is created in OpsMgr for every monitored cluster resource, e.g. a virtual machine, and its status will go to critical if the resource is stopped, i.e. the virtual machine is powered down.  You’ll get an alert and notifications will go out.  If you are running SLA reporting then you’ll get a nice red mark all over your SLA.  Whoops!

So what should you do with those powered-down VMs?  If a VM is going to be down for a long time then you should move it to the VMM library.  There you have cheaper storage, and hopefully lots of it.  Importantly, the VMM cluster reserve will be OK.  OpsMgr will stop complaining about the failed cluster resource after a little while.
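Storing the VM can be done from the VMM console or from PowerShell.  Here’s a rough sketch, again assuming the VMM 2008 R2 snap-in and that the VM is already shut down; the VM name, library server and share path are placeholders for your own.

# Sketch: store a long-term powered-down VM in the VMM library.
# The server, VM and share names below are placeholders.
Add-PSSnapin Microsoft.SystemCenter.VirtualMachineManager
Get-VMMServer -ComputerName "vmmserver01" | Out-Null

$vm = Get-VM -Name "TestVM01"
$libServer = Get-LibraryServer | Where-Object { $_.Name -eq "libraryserver01.demo.local" }

# Store-VM removes the VM from the cluster and parks its files in the library;
# Deploy-VM brings it back to a host later.
Store-VM -VM $vm -LibraryServer $libServer -SharePath "\\libraryserver01\MSSCVMMLibrary"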

What if this power down is a short-term thing?  You should obviously add resources to the cluster to resolve the VMM cluster reserve warning, because otherwise you won’t have an N+1 (or greater) cluster with enough resources to handle a failed host (or hosts).  You can use the Health Explorer in OpsMgr to put the critical resource (the powered-down VM) in the cluster into maintenance mode, thus eliminating alerts.  You should do that before powering down the VM.
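Maintenance mode can also be scripted.  The sketch below assumes the OpsMgr 2007 R2 Command Shell (so the management group connection already exists); the resource display name and the four-hour window are placeholders, and filtering every monitoring object by display name is slow but saves me guessing the exact class used by the cluster management pack.

# Sketch: put a powered-down VM's cluster resource into maintenance mode.
# Run from the OpsMgr 2007 R2 Command Shell; "TestVM01" is a placeholder.
$resource = Get-MonitoringObject | Where-Object { $_.DisplayName -eq "TestVM01" }

New-MaintenanceWindow -MonitoringObject $resource `
    -StartTime (Get-Date) `
    -EndTime (Get-Date).AddHours(4) `
    -Comment "Planned power down of test VM"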

Long term, if lots of VMs will be powered down and up, you might want to create a dedicated, lower-priority cluster for this.  You can customize the monitoring so it doesn’t care about cluster resources being up or down.  You can probably safely ignore warnings about the VMM cluster reserve being exceeded too.

2 thoughts on “Powered Down Virtual Machines on a Hyper-V Cluster”

  1. Good post, thanks for the insight on this.

    What if you have the resource in the cluster set to “don’t power up this resource, upon failure”? (I don’t recall the exact phrase, but you get my drift.) Then you would deliberately have told the cluster that you don’t want the resource to be HA, but the resource is still in the cluster.

    Does OpsMgr take this “flag” into consideration, or does it still mark the resource as failed when you have shut down the guest nice and clean?

    1. I’ll be honest – I don’t know. I would want the resource to power up upon failure -> it’s kind of the reason for having the cluster :-). Good question though; maybe there is something, somewhere, that adjusts this behaviour … but I doubt it.
