In this post, I will share the results of some stress tests that I ran on the Azure virtual machines with and without the Accelerated Networking feature enabled.
I deployed 4x Azure Resource Manager (ARM) virtual machines in a single virtual network, all connected to the same single subnet. Each virtual machine had a single NIC with a public IP address. All were deployed with Standard tier managed disks. A single network security group was assigned to the virtual network.
The four virtual machines were:
The purpose of having two sets of machines was:
To make life easier for the tests, the Windows Firewall was disabled in the guest OS of each virtual machine.
The last configuration step was to ensure that all the machines were spread across different hosts; this was achieved by creating the machines in a single availability set. This ensured that the packets had to cross a physical network instead of being routed end-to-end inside of a single host:
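For anyone wanting to reproduce a comparable setup, here is a rough Azure CLI sketch. The resource group, image, and public IP names are my own placeholders, not the exact commands used in my lab; the key flags are `--availability-set` (to spread the VMs across hosts) and `--accelerated-networking`:

```shell
# Placeholder resource group and location
RG=petri2-rg
az group create --name $RG --location northeurope

# One availability set so the four VMs land on different hosts
az vm availability-set create --resource-group $RG --name petri2-avset

# A VM with Accelerated Networking enabled at NIC creation time
az vm create \
  --resource-group $RG \
  --name vm-petri2-an1 \
  --image Win2016Datacenter \
  --size Standard_DS4_v2 \
  --availability-set petri2-avset \
  --accelerated-networking true

# A VM without Accelerated Networking, otherwise identical
az vm create \
  --resource-group $RG \
  --name vm-petri2-sw1 \
  --image Win2016Datacenter \
  --size Standard_DS4_v2 \
  --availability-set petri2-avset \
  --accelerated-networking false
```

Repeat for the second machine in each pair; Accelerated Networking must be set when the NIC is created or while the VM is deallocated.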
To run the tests, I used the free tool from Microsoft, NTTTCP.EXE. This tool will stress the network connection by sending data to a receiver. Bandwidth and throughput are measured; unfortunately, as you’ll learn later, latency and jitter are not.
The x64 executable for NTTTCP was copied onto the sender and receiver and was executed on both.
The sender used:
.\NTttcp.exe -s -m 16,*,<IP address of receiver> -l 2M -a 16 -t 60
The receiver used:
.\NTttcp.exe -r -m 16,*,<IP address of receiver> -rb 2M -a 16 -t 60
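For reference, here is my reading of the NTttcp flags used above (a sketch based on the tool's help output, with the placeholder IP left as-is):

```shell
# Sender: -s sends data; -m maps 16 threads across all CPUs (*) toward
# the receiver's IP; -l 2M uses 2 MB send buffers; -a 16 posts 16
# overlapped (async) buffers; -t 60 runs the test for 60 seconds.
.\NTttcp.exe -s -m 16,*,<IP address of receiver> -l 2M -a 16 -t 60

# Receiver: -r receives; -rb 2M sets a 2 MB receive buffer; the
# remaining flags mirror the sender so both ends match.
.\NTttcp.exe -r -m 16,*,<IP address of receiver> -rb 2M -a 16 -t 60
```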
My first attempt at running this test was done by deploying the above virtual machines with the DS4_v2 size. This machine (8 cores) should be capable of reaching 6000Mbps, or roughly 5.8Gbps.
In my first test run, I ran the test between the machines with Accelerated Networking enabled, vm-petri2-an1 and vm-petri2-an2. While the test was running, I opened Task Manager to view the bandwidth. As hoped/expected, a bandwidth of 5.7Gbps to 5.8Gbps was being achieved. I was very happy! The test results are shared below:
On to the machines without Accelerated Networking. Identical tests were run between vm-petri2-sw1 and vm-petri2-sw2. And the bandwidth achieved was roughly 5.8Gbps … wha-chu talkin’ ‘bout, Willis? Here are the results from that test. The results are roughly the same:
I was extremely surprised by this. I expected to see much lower bandwidth from the virtual machines without Accelerated Networking, and my first (wrong) reaction was to think, “what is the point of Accelerated Networking?” I reached out to the MVP (Microsoft Most Valuable Professional) and Microsoft community and was reminded that Accelerated Networking (like Hyper-V SR-IOV) is about more than just bandwidth.
In short, the DS4_v2 won’t show bandwidth improvements, but what my test didn’t and couldn’t show was the CPU utilization improvements at the host level and reductions in latency and jitter that would have been achieved.
Unfortunately, Windows Server doesn’t have a tool for measuring latency. Linux has qperf (which Microsoft has used in demonstrations) but I’m Linux-disabled! One might say, “try Ping” but the ICMP protocol is not optimized by Accelerated Networking. Latency testing using Ping would be useless.
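For anyone more Linux-able than me, here is a rough sketch of how a qperf latency and bandwidth test between two Linux VMs might look (assuming qperf is installed from the distribution's repositories, and that the NSG and guest firewall allow its default listener port):

```shell
# On the receiver VM: start the qperf server, which listens and waits
# for test requests from clients
qperf

# On the sender VM: measure TCP latency and TCP bandwidth against the
# receiver's private IP (placeholder address)
qperf 10.0.0.5 tcp_lat tcp_bw
```

Unlike NTttcp, qperf reports latency directly, which is exactly the metric this test rig could not capture.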
So without a means to measure latency/jitter, I decided to upgrade my lab from DS4_v2 to the $2.43/hour or $1,778.76/month DS15_v2 (North Europe, RRP pricing).
I shut down the four virtual machines, and resized them to DS15_v2, capable of up to 25,000Mbps or approximately 24.4Gbps. Then I re-ran the tests, starting with the machines that had Accelerated Networking enabled.
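The resize itself can be scripted; roughly, with Azure CLI and my placeholder resource group name, it looks like this (each VM must be deallocated before it can be resized to a different series):

```shell
# Deallocate, resize to DS15_v2, and restart each of the four VMs
for VM in vm-petri2-an1 vm-petri2-an2 vm-petri2-sw1 vm-petri2-sw2; do
  az vm deallocate --resource-group petri2-rg --name $VM
  az vm resize --resource-group petri2-rg --name $VM --size Standard_DS15_v2
  az vm start --resource-group petri2-rg --name $VM
done
```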
As the test ran, the bandwidth achieved fluctuated between 23.9Gbps and 24.4Gbps, normally sitting at around 24.1Gbps; I was happy that the numbers from the official virtual machine sizing documentation were valid. The NTTTCP results were as follows:
I then ran the same test between the two machines without Accelerated Networking. While the test was running, the bandwidth fluctuated between 17.9Gbps and 19.6Gbps but was regularly around 18.1Gbps. The results of the test are shown below:
When we compare those results:
28 percent seems to be a trend, so we can safely assume that enabling Accelerated Networking, at no extra cost, improved the bandwidth/transmission performance of the DS15_v2 virtual machine. What I don’t know is how much latency, jitter, and host CPU were improved.
We can see in the results that CPU utilization went from around 18.5 percent to around 30 percent when pushing this traffic with Accelerated Networking enabled. That should be expected because a lot more interrupts are being handled in the guest OS. If that worried me, in an HPC scenario, then I’d look at one of the “R” enabled virtual machines (H-Series) that has an extra RDMA-capable NIC.
Enabling Accelerated Networking when you create the NIC and virtual machine does have a positive effect. I typically turn it on if the virtual machine size supports it. The results might not be obvious in the machine sizes that I normally deploy or price up for customers, but they are there. Maybe, if I can work up the courage, I’ll have another go at these tests with some Linux virtual machines where I can test latency using qperf.
If you are deploying larger machines, then enabling Accelerated Networking is a must-do. After the recent deployment of the Intel Meltdown security flaw mitigations in Azure, Microsoft even recommended enabling Accelerated Networking (by redeploying the machine using its existing disks). If you experienced a drop in performance after the mitigation, you benefit because the host has to do fewer of the user/kernel mode context switches that the mitigation made more expensive.
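For an existing machine, Accelerated Networking can be switched on at the NIC rather than by recreating the VM; a rough Azure CLI sketch, with placeholder resource group and NIC names (the VM must be deallocated first, and its size must support the feature):

```shell
# Stop (deallocate) the VM so the NIC setting can be changed
az vm deallocate --resource-group petri2-rg --name vm-petri2-sw1

# Enable Accelerated Networking on the VM's NIC
az network nic update \
  --resource-group petri2-rg \
  --name vm-petri2-sw1-nic \
  --accelerated-networking true

# Restart the VM
az vm start --resource-group petri2-rg --name vm-petri2-sw1
```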