Improved Azure VM Networking Performance (Preview)

This post describes the preview of improved networking performance that Microsoft has recently started rolling out for Azure virtual machines.

The Announcement

Microsoft announced at the Ignite 2016 conference that it has started work to greatly improve the networking performance of its entire global fleet of virtual machines. We will see reduced latency and jitter, and maximum bandwidth improvements of 33 percent to 50 percent. This is being done by leveraging hardware offloads that are available in Windows Server Hyper-V.

Note: Azure runs the same Hyper-V hypervisor that we can use in Windows Server. Microsoft’s public cloud is currently running Windows Server 2012 and Windows Server 2012 R2 Hyper-V.
The results of these improvements will benefit us in several ways:

  • More peak bandwidth: Depending on what virtual machine series/size you run, you will gain access to more bandwidth, so you can move data around more quickly.
  • Improved storage performance: Storage account customers will see improved storage IOPS.
  • Reduced latency: Data will get from one machine to another more rapidly.
  • Reduced jitter: This will benefit media streaming services, such as voice and video.
  • Reduced CPU utilization: Your services will have more compute capacity available to them.

One example that Microsoft shared is that SQL Server database In-Memory OLTP transaction performance was improved by 1.5x in testing. They also reached speeds of 25 Gbps when testing with DS15v2 and D15v2 virtual machines.

A demo of Azure networking performance from Microsoft Ignite 2016 [Image Credit: Aidan Finn]
So how did Microsoft make these improvements? They “simply” turned on some features that move processing from the host OS (the management OS, in Hyper-V parlance) onto the hardware of the host.

SR-IOV

By default, a Hyper-V virtual machine receives and sends traffic by routing packets from the vNIC, through the VMbus (via an integration service), to the host’s virtual switch, which runs in software in the management OS, and then to the drivers of the physical NIC, and vice versa.

SR-IOV, a feature available to us since Windows Server 2012 Hyper-V, allows an SR-IOV-capable physical NIC in the host to expose its physical function (PF) alongside many virtual functions (VFs – think of a VF as a lightweight NIC driver running inside a VM). By enabling SR-IOV for a virtual NIC, you circumvent the circuitous route via the virtual switch and allow network packets to travel directly between the VF and the physical NIC, thus reducing latency and jitter, and potentially increasing bandwidth. This is what Microsoft has started to roll out in Azure, and what you can turn on in the settings of a virtual machine/NIC if you sign up for the preview, as sketched below.

Azure virtual machines switching from virtualized networking to SR-IOV [Image Credit: Microsoft]
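
For the VM/NIC setting itself, today’s Azure SDKs expose this capability as a flag on the network interface resource. The sketch below is only an illustration of flipping that flag with the Azure SDK for Python: it assumes the azure-identity and azure-mgmt-network packages, uses placeholder resource names, and does not cover the separate preview sign-up step.

```python
# Illustrative sketch only: assumes the azure-identity and azure-mgmt-network
# packages, and that the capability surfaces as the NIC's
# enable_accelerated_networking flag. Subscription, resource group, and NIC
# names are placeholders; the attached VM generally needs to be stopped
# (deallocated) before the flag can be changed.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "demo-rg"     # placeholder
NIC_NAME = "demo-vm-nic"       # placeholder

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Fetch the existing NIC, enable the flag, and push the update back.
nic = client.network_interfaces.get(RESOURCE_GROUP, NIC_NAME)
nic.enable_accelerated_networking = True
client.network_interfaces.begin_create_or_update(
    RESOURCE_GROUP, NIC_NAME, nic
).result()  # wait for the update to complete

print(f"Accelerated networking requested on {NIC_NAME}")
```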

NVGRE Offload

It’s no secret that:

  • Azure uses software-defined networking (SDN): This is what allows us to rapidly deploy networks on-demand without involving Azure technical staff.
  • Azure SDN is based on NVGRE: Microsoft started to use NVGRE for SDN in Windows Server 2012, and only recently added support for the more widely supported VXLAN in Windows Server 2016.

There is some processing involved on the host to translate physical network traffic into software-defined network traffic, and vice versa. Microsoft added an offload for NVGRE processing in Windows Server 2012 R2 Hyper-V, enabling hosts to pass this processing to compatible NICs, including models by Mellanox that are widely used within Azure. The benefit is that processing capacity is freed up for virtual machines, and once again, networking performance should improve for the host and its virtual machines.
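
To make the encapsulation concrete, here is a small sketch (not Azure code) of the 8-byte NVGRE outer header described in RFC 7637: a GRE header with the Key Present bit set, protocol type 0x6558 (Transparent Ethernet Bridging), and a key field carrying the 24-bit Virtual Subnet ID (VSID) plus an 8-bit FlowID. This wrapper, applied around the tenant’s original Ethernet frame, is what NVGRE task offload lets the NIC process (checksum and large-send work on the inner frame) instead of the host CPU.

```python
# Illustrative only: builds the GRE header used by NVGRE (RFC 7637) to show
# what the hardware offload is encapsulating/decapsulating.
import struct

GRE_FLAGS_KEY_PRESENT = 0x2000   # K bit set; C and S bits clear
ETH_P_TEB = 0x6558               # Transparent Ethernet Bridging

def nvgre_header(vsid: int, flow_id: int = 0) -> bytes:
    """Build the 8-byte NVGRE outer GRE header for a 24-bit Virtual Subnet ID."""
    if not 0 <= vsid < 2 ** 24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2 ** 8:
        raise ValueError("FlowID must fit in 8 bits")
    key = (vsid << 8) | flow_id  # 24-bit VSID followed by 8-bit FlowID
    return struct.pack("!HHI", GRE_FLAGS_KEY_PRESENT, ETH_P_TEB, key)

# Example: tenant virtual subnet 5001, default flow
print(nvgre_header(vsid=5001).hex())  # 2000655800138900
```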

Availability

The new networking performance improvements are being rolled out worldwide by Microsoft, but you can opt into a preview program to test out the enhancements for yourself. Availability is limited at this time. The supported OS images include:

  • Windows Server 2016 Technical Preview 5 (TP5) — which should switch to the GA image in mid-October
  • HPC Pack 2012 R2 Compute Node on Windows Server 2012 R2
  • HPC Pack 2012 R2 with Excel on Windows Server 2012 R2
  • HPC Pack 2012 R2 on Windows Server 2012 R2
  • Windows Server 2012 R2 Datacenter
  • SQL Server 2014 SP1 Enterprise on Windows Server 2012 R2
  • SQL Server 2014 SP1 Standard on Windows Server 2012 R2
  • SQL Server 2014 SP1 Web on Windows Server 2012 R2
  • SQL Server 2016 RTM Developer on Windows Server 2012 R2
  • SQL Server 2016 RTM Enterprise on Windows Server 2012 R2
  • SQL Server 2016 RTM Express on Windows Server 2012 R2
  • SQL Server 2016 RTM Standard on Windows Server 2012 R2
  • SQL Server 2016 RTM Web on Windows Server 2012 R2
  • {BYOL} SQL Server 2016 RTM Enterprise on Windows Server 2012 R2
  • {BYOL} SQL Server 2016 RTM Standard on Windows Server 2012 R2
  • {BYOL} SQL Server 2014 SP1 Enterprise on Windows Server 2012 R2
  • {BYOL} SQL Server 2014 SP1 Standard on Windows Server 2012 R2


Only two regions are supported at this time:

  • West Central US
  • West Europe (Amsterdam)

And only two high-end virtual machine sizes are supported:

  • DS15v2
  • D15v2
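
If you want to script a quick eligibility check before requesting the preview, something like the sketch below will do; it is only an illustration built from the lists in this post, using the standard ARM region and VM size names for the items above rather than any official preview API.

```python
# Illustrative helper only: checks a planned deployment against the preview
# constraints listed in this post. The constants mirror the lists above and
# are not sourced from any Azure API.
SUPPORTED_REGIONS = {"westcentralus", "westeurope"}
SUPPORTED_SIZES = {"Standard_DS15_v2", "Standard_D15_v2"}

def supports_preview(region: str, vm_size: str) -> bool:
    """Return True if the region and VM size are eligible for the preview."""
    return (
        region.replace(" ", "").lower() in SUPPORTED_REGIONS
        and vm_size in SUPPORTED_SIZES
    )

print(supports_preview("West Europe", "Standard_DS15_v2"))   # True
print(supports_preview("North Europe", "Standard_DS15_v2"))  # False
```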