KB2986895 – VMs Lose Network Connection on WS2012 or WS2012 R2 Hyper-V When Using Broadcom 1GbE NICs

If you’re affected by this issue then you should read this post. Microsoft posted a KB article for when virtual machines lose network connectivity when you use Broadcom NetXtreme 1-gigabit network adapters on Windows Server 2012 Hyper-V or Windows Server 2012 R2 Hyper-V.

Symptoms

When you have Hyper-V running on Microsoft Windows Server 2012 or Windows Server 2012 R2 together with Broadcom NetXtreme 1-gigabit network adapters (but not NetXtreme II network adapters), you may notice one or more of the following symptoms:

  • Virtual machines may randomly lose network connectivity. The network adapter seems to be working in the virtual machine. However, you cannot ping or access network resources from the virtual machine. Restarting the virtual machine does not resolve the issue.
  • You cannot ping or connect to a virtual machine from a remote computer.

These symptoms may occur on some or all virtual machines on the server that is running Hyper-V. Restarting the server immediately resolves network connectivity to all the virtual machines.

Cause

This is a known issue with Broadcom NetXtreme 1-gigabit network adapters that use the b57nd60a.sys driver when VMQ is enabled on the network adapter. (By default, VMQ is enabled.)

The latest versions of the driver are 16.2 and 16.4, depending on which OEM version that you are using or whether you are using the Broadcom driver version. Broadcom designates these driver versions as 57xx-based chipsets. They include 5714, 5715, 5717, 5718, 5719, 5720, 5721, 5722, 5723, and 5780.

These adapters are also sold under different model numbers by some server OEMs. HP sells them under model numbers NC1xx, NC3xx, and NC7xx.
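
To check whether a host is in scope, you can list the physical adapters that are bound to the driver file named above. This is a minimal sketch, assuming the DriverFileName and DriverVersion properties that Get-NetAdapter exposes in the NetAdapter module; adapter names will differ on your hosts.

```powershell
# List physical adapters that use the b57nd60a.sys driver, with their driver version.
Get-NetAdapter -Physical |
    Where-Object { $_.DriverFileName -eq 'b57nd60a.sys' } |
    Format-Table Name, InterfaceDescription, DriverFileName, DriverVersion -AutoSize
```

If nothing is returned, the host probably has no adapters using this driver, although it is worth confirming in Device Manager.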

Workaround

Broadcom is aware of this issue and will release a driver update to resolve the issue. In the meantime, you can work around the issue by disabling VMQ on each affected Broadcom network adapter by using the Set-NetAdapterVmq Windows PowerShell command. For example, if you have a dual-port network adapter, and if the ports are named NIC 1 and NIC 2 in Windows, you would disable VMQ on each adapter by using the following commands:

Set-NetAdapterVmq -Name "NIC 1" -Enabled $False
Set-NetAdapterVmq -Name "NIC 2" -Enabled $False

You can confirm that VMQ is disabled on the correct network adapters by using the Get-NetAdapterVmq Windows PowerShell command.
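
As a rough sketch of that check, assuming the same example port names (NIC 1 and NIC 2) used above, something like the following should show the VMQ state per adapter:

```powershell
# Show the VMQ state of the two example ports; the Enabled column should read False
# once VMQ has been disabled on them.
Get-NetAdapterVmq -Name "NIC 1", "NIC 2" |
    Format-Table Name, InterfaceDescription, Enabled -AutoSize
```

Running Get-NetAdapterVmq without -Name lists every VMQ-capable adapter on the host, which is handy if you are not sure of the port names.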

Note By default, VMQ is disabled on the Hyper-V virtual switch for virtual machines that are using 1-gigabit network adapters. VMQ is enabled on a Hyper-V virtual switch only when the system is using 10-gigabit or faster network adapters. This means that by disabling VMQ on the Broadcom network adapter, you are not losing network performance or any other benefits because this is the default. However, you have to work around the driver issue.

Get-NetAdapterVmqQueue shows the virtual machine queues (VMQs) that are allocated on network adapters. You will not see any virtual machine queues that are allocated to 1-gigabit network adapters by default.
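
For illustration, a minimal call looks like this (run on the Hyper-V host; the port name in the second line is the example name from earlier, not something you will necessarily have):

```powershell
# List any virtual machine queues currently allocated on the host's adapters.
Get-NetAdapterVmqQueue

# Or limit the query to a single adapter.
Get-NetAdapterVmqQueue -Name "NIC 1"
```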

Sigh. I hope Broadcom are quicker about releasing a fix than Emulex (customers are waiting 10 or 11 months now?).

61 thoughts on “KB2986895 – VMs Lose Network Connection on WS2012 or WS2012 R2 Hyper-V When Using Broadcom 1GbE NICs”

  1. Hi Aidan, I have a 3 node Hyper-V cluster up on WS2012 with only Intel 1 gigabit adapters. Here’s Get-NetAdapterVmq output from one of the nodes – http://1drv.ms/1kiEMuG and VMQ is enabled. Fortunately I’m not having any of the issues described above with Broadcom adapters, but I’m curious why that is 🙂

    Dusan

  2. Broadcom Network adapters have historically been hit or miss. At my work, ESXi boxes have had issues with them, we have had Windows 2008 R2 SQL/Exchange servers that have issues and all those issues would clear up when replaced with Intel/Cisco cards.

    We finally got to point where we just order Intel cards in all new Dell boxes and call it a day.

  3. FYI We’re experiencing similar symptoms but with 10Gb Broadcom adapters and some of our VMs. It seems that older OSes, i.e. 2008 R2 and prior, are affected. Restarting the guest OS doesn’t resolve anything, but switching the network adapter to DHCP and then setting its static IP back again resolves the issue for us for some odd reason. Looks like something wonky is happening in the integration services/drivers for the vNIC.

    We haven’t tried disabling VMQ yet on the physical adapters. Maybe something to explore.

  4. That would explain a lot… I had that issue a few times with the NICs integrated in my Dell R420 :/ I’ve had them in a team in a WS2012R2 cluster node and I would lose all network connectivity to all VMs and the node. Though I did not need to reboot the server – in my case I had the NICs in a team as active-backup and I just needed to change the active NIC to restore connectivity (logged in with iDRAC 🙂 ). I see it’s a good thing we switched to Intel NICs.

    btw. I had another NIC in server for cluster only network and I wondered why that was working…

  5. We are currently experiencing this issue and I was going to disable VMQ this weekend.

    Do you know if the command needs to be run on the NIC team or just the NIC’s that are members of the team?

    Any advice appreciated.

      1. Hi Aidan

        Thanks for taking the time to reply. I successfully disabled VMQ on the NICs.

        So far no loss of connectivity so fingers crossed!

        Again thanks for replying 🙂

    1. The advice from MSFT is to disable VMQ on NICs less than 10 Gbps. In fact, I believe some of them were surprised that VMQ is enabled on some 1 GbE NICs by default.

  6. I am running 16.6.0.4 and still have the problem when running VMQ. I am trying to run VDI under XenDesktop and half or more of the VMs keep dropping off the delivery group (“Unregistered”) after successfully registering initially. I will try disabling VMQ on all pNICs like you said. 16.6.0.4 is still the latest available driver from Broadcom, correct?

    1. The guidance from MSFT is to disable VMQ on 1GbE NICs. In their opinion, it shouldn’t be on in the first place.

      1. I see. Do you have a reference for that? I was looking at best practices here http://blogs.technet.com/b/askpfeplat/archive/2013/03/10/windows-server-2012-hyper-v-best-practices-in-easy-checklist-form.aspx

        and it simply says “VMQ should be enabled on VMQ-capable physical network adapters bound to an external virtual switch.”

        What I am seeing now that VMQ is disabled is that CPU usage is thru the roof, and my VDI density is 50% of what it was with 1GB Intel NICs with VMQ enabled.

  7. Updated 2 hosts (Drivers – Firmware etc) a few weeks ago and have been seeing this random issue since. Thankfully it was only affecting one port on the NIC that was on the vNIC, and it wasn’t a critical port. Access could be restored by flipping the Port on the switch.

    Upon using Get-NetAdapterVmq – I could see that the problematic port was the only one with VMQ enabled. I’ve now disabled VMQ for that and will report back the results…….

  8. Hi,
    I have NIC 1 and NIC 2 in a team on a Hyper-V Core OS. Do I have to use the team name in the command, or just disable VMQ on the 2 NICs one by one?

  9. I am experiencing the same behavior in one Windows 2003 R2 VM on a Hyper-V 2012 R2 failover cluster. The VM NIC loses connectivity. VMQ is disabled on the physical NIC. Please advise.

  10. Hi

    Please suggest: we have deployed Windows Server 2012 R2 terminal services on a Hyper-V 2012 R2 failover cluster. We are facing a slow login issue on the user side; logins get stuck on applying group policy and some hang. VMQ is disabled on the physical NIC.
    Your quick response is highly appreciated

  11. Hey Aidan,

    Yea I’m having this on the HP Proliant DL360p GEN 8 server. It’s the exact driver file name as highlighted on the article above. It’s killing me, I thought it was something I had screwed up on the config but it has to be this issue!!… I hope!

    Does anyone know if there are any other drivers available yet that I can install?

    I will disable VMQ in the morning when I get in…. fingers crossed this will sort it

    Thanks for the article …. hopefully this will save my life ; )

    Kevin

      1. I can report that I’m running 16.8.0.4 and still experience this issue. I’ve disabled VMQ today hoping this resolves the issue here.

  12. Hi Aidan,

    Same situation here, disabled VMQ on the physical NIC, however I’m curious as to what effect that has on throughput. There should still be some offload with RSS remaining enabled?

    With 1GB NICs what is the effective throughput we should be seeing between VMs on separate hosts?

    David

  13. Thanks for the tip. We’ve got three Dell PowerEdge R520s with Broadcom NetXtreme 5720 dual port adapters running Windows Server 2012 R2 and Hyper-V. We’ve been experiencing intermittent network connectivity issues and occasionally the host and one or more guests would lose connectivity to the network, even though everything seemed to be okay.

    After reading your article and Microsoft’s KB2986895 article, I decided to give your fix a go, using the following PowerShell commands:

    Set-NetAdapterVmq -Name "NIC1" -Enabled $False
    Set-NetAdapterVmq -Name "NIC2" -Enabled $False
    Set-NetAdapterVmq -Name "NIC Team" -Enabled $False

    I haven’t had a problem since – touch wood – and it’s been almost a week.

    Our NICs are all running the latest firmware and drivers as of now (25/05/2015)… unfortunately, Dell/Broadcom haven’t fixed the issue yet, but since I don’t notice any downside to switching VMQ off and our servers are now running without networking issues, I’m no longer bothered.

    1. Good. There is no downside to disabling VMQ on 1 Gbps NICs – Microsoft has told the OEMs to do this and they were ignored.

  14. Hi
    We’ve had the same issue with our 7-node HV cluster on 2012 R2, all with 1 Gbit Broadcoms. First I disabled TCP Offload on all NICs and that kind of fixed the issue until now (2 months later). As it turns out, the only interfaces with VMQ enabled were two of the 3 designated for VM traffic.

    For all those that have more than 1 host and would like to do it remotely, here’s how – replace HVHOST1 etc. with your HV host names.

    $HVHosts = ("HVHOST1","HVHOST2")

    Invoke-Command $HVHosts -ScriptBlock {
        Get-NetAdapterVmq | % {
            if ($_.Enabled -eq 'True') {Set-NetAdapterVmq $_.Name -Enabled $false}
        }
    }

    Thanks Aidan for the information. Will report back in a while to see if that works long term :)

    BTW, here’s my set of “disablers” for an HV host on 1 Gbit Broadcoms. I have 4 built-in NICs (named NIC1-4) and two PCI slots with 4 NICs each (named PCI1_1 to PCI1_4 and PCI2_1 to PCI2_4). They differ in feature set, so there are different commands for each group. I’m also disabling these on the Microsoft teaming interfaces.

    $NICs = Get-NetAdapter -Name NIC*
    foreach ($NIC in $NICs) {
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *LsoV2IPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *LsoV2IPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *TCPConnectionOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *TCPConnectionOffloadIPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *TCPUDPChecksumOffloadIPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $NIC.name -RegistryKeyword *TCPUDPChecksumOffloadIPv4 -RegistryValue 0
    }

    $PCIs = Get-NetAdapter -Name PCI*
    foreach ($PCI in $PCIs) {
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *LsoV1IPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *LsoV2IPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *LsoV2IPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *UDPChecksumOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *UDPChecksumOffloadIPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *TCPChecksumOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *TCPChecksumOffloadIPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $PCI.name -RegistryKeyword *IPChecksumOffloadIPv4 -RegistryValue 0
    }

    $Teams = Get-NetAdapter -InterfaceDescription "*Multiplex*"
    foreach ($Team in $Teams) {
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *LsoV2IPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *LsoV2IPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *IPChecksumOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *UDPChecksumOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *UDPChecksumOffloadIPv6 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *TCPChecksumOffloadIPv4 -RegistryValue 0
    Set-NetAdapterAdvancedProperty $Team.name -RegistryKeyword *TCPChecksumOffloadIPv6 -RegistryValue 0
    }

    Not using the IPv6-to-IPv4 transition technologies, so disabling those as well.

    Set-NetTeredoConfiguration -Type Disabled
    Set-NetIsatapConfiguration -State Disabled
    Set-Net6to4Configuration -State Disabled

    Disabling VMQ on all enabled interfaces:

    Get-NetAdapterVmq | % { if ($_.Enabled -eq 'True') {Set-NetAdapterVmq $_.Name -Enabled $false} }

  15. I’ve got two Dell PowerEdge R530 servers with Broadcom NetXtreme 5720 dual port adapters running Windows Server 2012 R2 and Hyper-V. I have updated to the latest firmware (7.10) and driver (the “19.0 update”, which shows driver version “17.0.0.3” in the Windows device manager after being installed) published on Dell’s support web site (Release date 07 Apr 2015). If I don’t turn off VMQ by using the ‘Set-NetAdapterVmq -Name "NIC1" -Enabled $False’ PowerShell command, the adapter goes deaf until reset if put under load (such as copying large files across the network), exactly as described in KB2986895.

    So it appears I can confirm that the latest Broadcom drivers don’t fix this issue. Fortunately, the disabling of VMQ described in KB2986895 still resolves the issue (at least for my configuration).

  16. Hmm, disabling VMQ with PowerShell doesn’t seem to have any effect. I should try disabling TCP offload and the other settings as scripted above. Not to mention that with 10gb Qlogic extremes the driver side will still have VMQ enabled. Disabling this doesn’t help either. Just constant loss of connectivity on the host side.

  17. Is the command set-netadaptervmq -name “…” -enabled $false permanent, or must I repeat it after each server restart?

    mfg

    Uwe Weih

  18. Just another update for people still having this problem – I’m running a five-node cluster on HP DL380 Gen9 servers with these NICs, and we’ve updated through three different versions of the driver that claim to fix this problem (16.8, 17.0, 17.2), but to no avail. I’m now disabling VMQs on the physical NICs, so hopefully I can finally draw a line under it.

  19. Just FYI- I opened a support incident with Microsoft yesterday regarding this very issue. I’m running a cluster of 8 HP DL360p Gen8 and 9’s (2012 R2 Core), using the HP Emulex 10gb adapters, newest drivers and firmware, all Winders updates are installed as well as MS recommended HV-specific patches. MS had me disable VMQ this morning, so far so good. I just wanted to let you know this isn’t a Broadcom exclusive issue.

    1. The Emulex issue is well known – see Hyper-v.nu for more. Have you deployed the latest drivers and firmware for your NICs from HP? This was supposed to have been fixed earlier this year by them. There was also an update via Windows Update to fix the last little bits in the MSFT networking stack.

      1. Aidan, thanks for the reply. As long as I got you, I’d like to thank you for all the help your site has given me over the years.

        Yep, my NIC driver/firmware is current, and I have all Windows updates installed & current.

      2. You wouldn’t happen to have the exact URL on Hyper-v.nu, would you? I dug around and couldn’t find anything emulex specific.

  20. Hi,

    One of my Windows Server 2012 R2 VMs is also having the same issue. From the VM, some IPs are accessible and others are not. The host server has an Intel I350 Gigabit adapter.

    Tried to disable VMQ on this VM but that also didn’t resolve this issue.

    Dhiraj

    1. Disable VMQ on the physical NICs. Also ensure that drivers & firmware are up to date for the physical NICs and that the host is patched.

  21. I had this issue a year ago. I updated to the latest Broadcom drivers around 5 months ago and I was hopeful this would resolve things. I’m running the HP DL360 Gen 8 server. Last night… boom! VMs just disappear from the network. This is killing me. I will try disabling VMQ on the physical NICs (4 of them). Do you need to disable VMQ on the VMs also? I thought someone mentioned that? Do I need to reboot? Please help.

  22. We run a lot of ProLiant servers with these Broadcom NICs. Most of the servers are core installations (without GUI) and we never had any connectivity loss. However, we installed a ProLiant ML350 gen9 with GUI and the NICs lose from time to time the IP connectivity.
    We disabled VMQ and will see what happens.
    Thank you for the advice.

  23. I am seeing a similar issue on an HP ProLiant DL380 Gen9,
    but with no Broadcom NIC (that I can see).
    2 types of NICs:
    HP IGB Nics 4port 33li (HP NIC)
    and
    HP Ethernet 10Gb 2 port 561FLR (Intel NIC)

    I have converged networks set up with PowerShell and VLANs in place.
    The NICs are connected to a Cisco switch (switch configured to use a LAG); I still have the team in switch independent mode with dynamic load balancing.

    When the connectivity issue occurs, the 2 Windows clusters fail to communicate with each other at all until they are rebooted.

    Not sure if it’s a network hardware/config issue or a Windows issue.

  24. sorry, i can actually see that even though its a hp NIC, it is actually using the b57 driver
    (b57nd60a.sys 17.2)

  25. Hello guys,

    we are experiencing a somewhat similar behavior on our HP ProLiant ML350 Gen9 using Broadcom NICs. We updated firmware and driver (now v20.6.0.4) just two weeks ago to make sure everything is fine. We are running Hyper-V 2012 R2.

    Strange behavior is: Our Windows domain controller (also DNS+DHCP-Server) loses internet connection every day randomly. Fix: Restart the VM. Fun thing: The connection to the internal network works just perfect. As soon as the DC is off the internet, all the machines connected to the domain are offline, too.

    Any ideas on that?

    Regards
    Daniel

  26. Broadcom and Microsoft say it was fixed in driver version 16.8. Well, I’m having this exact same issue with HPE ProLiant Gen9 and Gen10 servers, running either Windows Server 2012 R2 or 2016, most of them running version 17.4 of the Broadcom driver or the equivalent version 214 of the HPE driver. I’m going to try disabling VMQ on the pNIC’s and vNIC’s.

  27. Hi, I am using an HP DL380 Gen9 with a Broadcom NIC. I upgraded from Windows Server 2012 R2 to Windows Server 2019, and after the upgrade Hyper-V cannot ping the external network, even with VMQ disabled on the network card. Kindly support me.
