Microsoft has just posted a new KB article for a clustered Hyper-V host scenario:
Assume you have a four-node Hyper-V cluster with more than 200 virtual machines and 10 physical network adapters installed on each cluster node, and each virtual machine is configured with two virtual network adapters. If you start 50 virtual machines on a single node at the same time, or fail over 50 virtual machines to another node, the virtual machine configuration resources fail to come online after sitting in a pending state.
When a virtual machine configuration resource comes online, multiple WMI queries are sent to query the network properties. The number of queries is determined by the number of virtual machines in the cluster and the number of physical network adapters on the cluster node. In the scenario described in the Symptoms section, it takes more than 10 minutes for all virtual machine configuration resources to come online. However, the default resource deadlock timeout is 5 minutes, so the resources fail to come online due to the timeout.
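To see how the numbers above blow past the 5-minute deadlock timeout, here is a back-of-envelope sketch. The per-query cost (`SECONDS_PER_QUERY`) is an assumption for illustration only; the KB article does not publish actual WMI query timings.

```python
# Rough estimate of the WMI query load when 50 VMs come online at once.
# SECONDS_PER_QUERY is an assumed figure, not from the KB article.

VMS_IN_CLUSTER = 200            # VMs in the cluster (KB scenario)
NICS_PER_NODE = 10              # physical network adapters per node (KB scenario)
VMS_STARTED_AT_ONCE = 50        # VMs started or failed over together (KB scenario)
SECONDS_PER_QUERY = 0.01        # assumed average cost of one WMI query
DEADLOCK_TIMEOUT_SECONDS = 300  # default resource deadlock timeout (5 minutes)

# Per the KB, the query count for one configuration resource scales with
# both the VM count and the NIC count.
queries_per_resource = VMS_IN_CLUSTER * NICS_PER_NODE

total_queries = VMS_STARTED_AT_ONCE * queries_per_resource
total_seconds = total_queries * SECONDS_PER_QUERY

print(f"{total_queries} queries, ~{total_seconds / 60:.1f} minutes")
print(total_seconds > DEADLOCK_TIMEOUT_SECONDS)  # the 5-minute timeout is exceeded
```

With these (assumed) numbers, 50 resources generate 100,000 queries and take well over the 5-minute default, which matches the ">10 minutes" figure in the KB.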
The solution is:
Change the DeadlockTimeout and PendingTimeout values on the virtual machine configuration resources. The exact values depend on the cluster environment.
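The KB does not spell out the exact commands, but on Windows Server 2008 R2 these resource properties can be set with cluster.exe. A minimal sketch, where the resource name and the 20-minute value are placeholders and both properties are in milliseconds:

```shell
rem Raise the timeouts on one virtual machine configuration resource.
rem "Virtual Machine Configuration VM01" is an example resource name;
rem 1200000 ms = 20 minutes. Tune both values to your environment.
cluster res "Virtual Machine Configuration VM01" /prop DeadlockTimeout=1200000
cluster res "Virtual Machine Configuration VM01" /prop PendingTimeout=1200000
```

In a cluster with 200+ VMs you would script this across all configuration resources rather than setting them one by one.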
“The number of queries is determined by the number of virtual machines in the cluster and the number of physical network adapters on the cluster node”
Does that mean that adding more physical network adapters would actually make the problem worse and generate even more WMI queries?
Are WMI and a large number of NICs simply a bad combination?
Look at this one: http://blogs.technet.com/b/kevinholman/archive/2011/12/12/opsmgr-network-utilization-scripts-in-baseos-mp-version-6-0-6958-0-may-cause-high-cpu-utilization.aspx
We have 8 NICs per W2008R2SP1 Hyper-V host monitored by OpsMgr, and yes, we do see the lengthy spikes shown in the TechNet blog post. They don't cause us CPU performance issues yet, so the rule is still enabled.
For W2003, though, we disabled this part of the MP entirely; several of the symptoms described did actually occur.