Microsoft has urged OEMs not to enable VMQ on the standard 1 GbE NICs that are commonly found in Hyper-V hosts. Despite this request, and despite the fact that the feature adds nothing at these speeds, VMQ is frequently left enabled, and it causes performance and uptime issues. In this article, you'll learn why disabling VMQ should be a standard part of your deployment and configuration management.
On social media, in meetings, at community events, and even after speaking at Microsoft Ignite, I get asked a question that starts something like, “My Hyper-V hosts have a problem when <insert something to do with networking> …,” and I interrupt them.
I ask if they are using Emulex 10 Gbps converged NICs, the sort you find in IBM, HP, and Hitachi blade servers, or 1 Gbps Ethernet NICs in their hosts. Emulex appears to have finally sorted out the awful handling of VMQ in its firmware and drivers, and the OEMs eventually dribbled out the fixes.
But most of the time, the answer is that they have 1 GbE networking from Broadcom or Intel. I usually know straight away what the fix is. I ask them if they have disabled VMQ on the physical NICs that are used for the virtual switch. “VM-what?” is sometimes the response, and other times the response is “I don’t know.”
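If you're not sure of the answer yourself, the VMQ state of every NIC is quick to check. Here is a minimal sketch, assuming an elevated PowerShell prompt on a Windows Server 2012 or later host with the in-box NetAdapter cmdlets:

```powershell
# List link speed for every physical NIC in the host
Get-NetAdapter -Physical | Format-Table Name, InterfaceDescription, LinkSpeed

# Show whether VMQ is currently enabled on each VMQ-capable NIC
Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled
```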
If you’re using 1 GbE NICs, then you probably shouldn’t know or care about VMQ, because this hardware offload offers nothing to you. VMQ spreads the processing of inbound virtual switch traffic across multiple logical processors instead of funneling it through a single core, which only pays off when that single core can't keep up. You’re not pushing enough traffic into your hosts for that to happen; this offload exists to make full use of 10 Gbps or faster networking.
Microsoft, aware that VMQ offers nothing at these speeds, has asked OEMs not to enable the feature by default on 1 GbE NICs. However, the OEMs have not only ignored Microsoft; they ignore VMQ-aware administrators, too. I've been told many times that administrators who know the guidance and disable VMQ on their 1 GbE NICs often find it re-enabled after upgrading the driver from their server manufacturer.
What makes it worse is that the drivers and firmware for these NICs usually handle VMQ very poorly. This leads to the performance and uptime issues that people keep bringing to me. The cause? VMQ was enabled on their 1 GbE NICs.
As a short-term solution, only enable VMQ on NICs where it is really required. Of course, the OEMs turn it on by default, so you'll need to disable it in every deployment where you are not making use of it. VMQ offers nothing on 1 GbE connections other than bugs, so turn it off every time on every physical 1 GbE NIC in your Hyper-V hosts.
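Here is a minimal sketch of that cleanup, assuming an elevated PowerShell session on the Hyper-V host and treating any physical NIC reporting 1 Gbps or less as a candidate:

```powershell
# Find every physical NIC running at 1 Gbps or slower (Speed is reported in bits per second)
$oneGbeNics = Get-NetAdapter -Physical | Where-Object { $_.Speed -le 1000000000 }

# Turn off VMQ on each of them; skip NICs that don't expose the setting at all
foreach ($nic in $oneGbeNics) {
    Disable-NetAdapterVmq -Name $nic.Name -ErrorAction SilentlyContinue
}

# Verify the result
Get-NetAdapterVmq | Format-Table Name, Enabled
```

Note that changing a NIC's VMQ setting typically resets the adapter briefly, so plan the change for a window when a short network blip on the host is acceptable.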
What about OEMs turning VMQ back on after you perform a driver update? The only solution here is either to waste time re-disabling it by hand or to implement some kind of desired state configuration (DSC) management that automatically returns VMQ to a disabled state on 1 GbE physical NICs.
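As a rough illustration of the DSC approach, the built-in Script resource can test for 1 GbE NICs that have drifted back to a VMQ-enabled state and correct them. The configuration and resource names below are placeholders, and the 1 Gbps speed threshold is the same assumption as in the earlier sketch:

```powershell
Configuration Disable1GbeVmq {
    # Minimal sketch using the built-in Script resource; compile and apply with Start-DscConfiguration
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node 'localhost' {
        Script VmqOff1GbE {
            # Report the current VMQ state of all VMQ-capable NICs
            GetScript  = { @{ Result = (Get-NetAdapterVmq | Out-String) } }

            # Compliant only if no physical 1 GbE (or slower) NIC still has VMQ enabled
            TestScript = {
                $drifted = Get-NetAdapter -Physical |
                    Where-Object { $_.Speed -le 1000000000 -and
                        (Get-NetAdapterVmq -Name $_.Name -ErrorAction SilentlyContinue).Enabled }
                return -not $drifted
            }

            # Re-disable VMQ on any 1 GbE NIC where a driver update turned it back on
            SetScript  = {
                Get-NetAdapter -Physical |
                    Where-Object { $_.Speed -le 1000000000 } |
                    ForEach-Object { Disable-NetAdapterVmq -Name $_.Name -ErrorAction SilentlyContinue }
            }
        }
    }
}
```

Compiling this produces a MOF that the Local Configuration Manager re-applies on its regular consistency check, so a driver update that quietly re-enables VMQ gets reverted without anyone having to notice first.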
For a long-term solution, I want Microsoft to do three things: