Choosing an Azure Virtual Machine - December 2018
This post explains how to select an Azure virtual machine (VM) series and size, and it has been updated to include the virtual machine series that were available as of December 2018. The post categorizes virtual machines by general role and series, and describes features such as Azure Compute Units (ACUs) and the impact of size on elements of performance.
Order from the Menu
Azure is McDonald’s. It is not a Michelin Star restaurant. You cannot say, “I’d like a machine with 4 cores, 64GB RAM, and a 200GB C: drive.” That simply is not possible in Azure. Instead, there is a pre-set list of series of machines. Within those series, there are pre-set sizes.
Sizing a Virtual Machine
There are two things to consider here. The first is common sense: sizing for the performance requirements. The machine will need as much RAM, CPU (see the ACU section later in this article), and disk space as your operating system and service(s) will consume. That is no different from how you sized on-premises physical or virtual machines in the past. You should also consider disk performance (IOPS and MB/second throughput) and network performance. Typically, the more physical cores that power a virtual machine, the more interrupts it can handle, and therefore the more disk/network throughput the machine can handle.
The other factor, which comes with cloud-scale computing, is that you should deploy an army of ants, not a platoon of giants. Big virtual machines are extremely expensive. A more affordable way to scale is to deploy smaller machines that share a workload. They can also be powered on and off based on demand, with billing starting and stopping accordingly, or you can use the Scale Sets feature.
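To make the ants-versus-giants argument concrete, here is a rough sketch. The demand figures, the "work units", and the capacity of a small machine are all invented for illustration, and the comparison assumes that cost scales roughly linearly with capacity, which real pricing only approximates.

```python
import math

def instances_needed(demand: float, capacity_per_instance: float) -> int:
    """Smallest number of identical small machines that covers the demand."""
    if demand <= 0:
        return 0
    return math.ceil(demand / capacity_per_instance)

# Hypothetical demand for each hour of one day, in arbitrary "work units".
hourly_demand = [10] * 8 + [40] * 8 + [20] * 8

SMALL_CAPACITY = 10  # assumed work units that one small machine can handle

# A single giant must be sized for the peak and runs for all 24 hours.
giant_capacity_hours = max(hourly_demand) * len(hourly_demand)

# The ants are powered on and off each hour to follow the demand.
ant_capacity_hours = sum(
    instances_needed(demand, SMALL_CAPACITY) * SMALL_CAPACITY
    for demand in hourly_demand
)

print("Giant:", giant_capacity_hours, "capacity-hours paid for")
print("Ants: ", ant_capacity_hours, "capacity-hours paid for")
```

With this made-up demand curve, the giant pays for peak capacity all day while the ants only pay for what each hour requires.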
Azure Compute Unit (ACU)
Microsoft created the concept of an Azure Compute Unit (ACU) to help us distinguish between the various processor and VM series options that are available to us in Azure. The low-spec Standard A1_v2 virtual machine has a baseline rating of 100 and all other machines are scored in comparison to that machine. A virtual machine size with a low number offers low compute potential and a machine with a higher number offers more horsepower.
Note that some scores are marked with an asterisk. This represents a VM that is enhanced using Intel Turbo technology to boost performance. The results from the machine can vary depending on the machine size, the task being done, and other workloads also running on the machine.
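Because every score is relative to a baseline of 100, comparing series is simple division. The sketch below uses only the ACU ranges quoted in this post; the function and variable names are my own, and the Turbo variation described above is ignored.

```python
BASELINE_ACU = 100  # the Standard A1_v2 baseline described in this post

def relative_compute(acu: int, baseline: int = BASELINE_ACU) -> float:
    """How many times the baseline compute potential an ACU score represents."""
    return acu / baseline

# ACU ranges as quoted elsewhere in this post (low end, high end).
acu_ranges = {
    "A_v2": (100, 100),
    "D_v3": (160, 190),
    "Fs_v2": (195, 210),
    "H": (290, 300),
}

for series, (low, high) in acu_ranges.items():
    print(
        f"{series}: {relative_compute(low):.2f}x to "
        f"{relative_compute(high):.2f}x the baseline"
    )
```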
Choosing a VM Series
Browse the HPE or Dell sites and have a look at the standard range of rack servers. You will find DL360s, R420s, DL380s, R730s, and so on. Each of these is a series of machines. Within each series, you will find a range of pre-set sizes. Azure is similar: once you select a series, you find the size that suits your workload. Each size has a listed per-hour price, which is charged per minute of running. Let’s look at the different series of Azure VMs. Please remember that not all the series are available in all regions.
There are many virtual machine series and this can be confusing at first. But:
- The name of each series, a letter, means something. Watch out for how I highlight certain letters when I describe each series later in this post.
- The vast majority of the machines that I have seen in the real world are from three lower-level series.
VM Series Versioning
In the server world, Dell replaced the R720 with an R730. We stopped buying R720s and started buying R730s. HPE replaced the DL380 G6 with a DL380 G7. That was replaced with the Gen 8. We stopped buying the older machine and started buying the newer machine.
The same thing happens in Azure. As the underlying Azure fabric improves, Microsoft occasionally releases a new version of a series. For example, the D_v2-Series replaced the D-Series. The Standard A_v2-Series replaced the Standard A-Series.
The older series is still available, but it usually makes sense to adopt the newer series. Late in 2016, Microsoft changed pricing so that the newer series was normally more affordable than the older one.
If you are reading this post, then you are probably deploying new services/machines. You should be using the latest version of a selected series. I will not detail the older/superseded or preview series of machines in this article.
A code is normally used in the name of a VM to denote a special feature in that VM size. Examples of such codes are:
- S: The VM supports Premium SSD disks. It also supports Standard HDD and Standard SSD disks. Note that the S variants are the same price as the non-S variants – it is your choice of disk tier that varies the pricing. Also note that in the more recent series of machines, Microsoft only provides the S variants because people started to standardize on them for easier disk tier switching.
- M: The size in question offers more memory (RAM) than usual.
- R: An additional Remote Direct Memory Access (RDMA) NIC is added to the VM. This offers high bandwidth, low latency, and low CPU impact data transfers.
- I: This indicates that the virtual machine is the only virtual machine (Isolated) on the host, thus having all the motherboard’s capacity.
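The codes above can be read straight out of a size name. The following is a minimal sketch that assumes names follow the pattern of an optional "Standard_" prefix, one or more family letters, a number, lowercase feature letters, and an optional version suffix. Not every Azure size name follows that layout, and the sample names are only illustrations.

```python
import re

# Feature letters described in the list above.
FEATURE_CODES = {
    "s": "supports Premium SSD disks",
    "m": "more memory than usual",
    "r": "additional RDMA NIC",
    "i": "isolated on the host",
}

# Assumed layout: optional "Standard_" prefix, family letter(s), a number,
# lowercase feature letters, and an optional "_v<N>" version suffix.
SIZE_PATTERN = re.compile(
    r"^(?:Standard_)?(?P<family>[A-Z]+)(?P<number>\d+)"
    r"(?P<codes>[a-z]*)(?:_v(?P<version>\d+))?$"
)

def decode_size(name: str) -> dict:
    """Split a size name into family, number, feature codes, and version."""
    match = SIZE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized size name: {name!r}")
    version = match.group("version")
    codes = list(match.group("codes"))
    return {
        "family": match.group("family"),
        "number": int(match.group("number")),
        "codes": codes,
        # Letters not covered in this post are reported as unknown.
        "features": [FEATURE_CODES.get(code, "unknown code") for code in codes],
        "version": int(version) if version else None,
    }

print(decode_size("Standard_D4s_v3"))
print(decode_size("H16mr"))
```

In the newer series the number is usually the vCPU count, but that is a convention rather than a guarantee, so the sketch labels it simply as "number".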
Microsoft has conveniently broken down the many series of virtual machines into a few categories or types. Each category helps you understand the traits/offerings of each series of a virtual machine:
- General purpose: These are the most common workhorse machines.
- Compute optimized: The focus is on processor performance.
- Memory optimized: These machines have larger amounts of RAM.
- Storage optimized: Machines of this type have special storage features.
- GPU: These are machines that have direct access to NVIDIA chipsets in the host.
- High performance compute: HPC workloads allow for large-scale computations across many virtual machines.
General Purpose
The following general purpose virtual machines are available:
- A_v2 (ACU 100): A is the first letter in the alphabet and is the oldest family of machines in Azure. Fine for low-end workloads, this is based on a simulated old Opteron processor that offered lots of cores with lower levels of performance. Note that this machine offers only Standard tier storage and the temp drive is only on HDD.
- D_v3 and Ds_v3 (ACU 160-190): D is for disk and D is for database. This machine, powered by Intel Xeon processors (as are all of the following machines), is excellent for workloads where you need many virtual processors and high levels of disk performance. RAM is typically allocated at a ratio of 1 vCPU to 4 GB RAM. For example, 2 vCPUs and 8 GB RAM, or 4 vCPUs and 16 GB RAM.
- Bs: Normally called the B-Series, this is the “burstable processor series”. Each core has the full potential of an Intel Xeon core, but Microsoft artificially limits it. While the virtual machine remains below this limit, it banks credits that can be burned to reach the full potential of the core when required. The B-Series is perfect for the common workload where CPU utilization is often low. Note that disk caching and network acceleration are not available in this series.
- DC: This new data compartmentalization series is built on new Intel SGX technology that allows sensitive code and data to be isolated from the guest OS of the virtual machine.
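The credit banking behavior of the B-Series is easier to follow with a toy model. This sketch is a simplification and not Microsoft's actual accounting: the baseline percentage, the credit cap, and the per-interval bookkeeping are all assumed values chosen for illustration.

```python
def simulate_burst(demand_pct, baseline_pct=20, max_credits=1000):
    """Toy model of a burstable core.

    demand_pct: CPU percentage the workload wants in each interval.
    Returns the percentage actually delivered per interval and the
    credits left in the bank at the end.
    """
    credits = 0
    delivered = []
    for wanted in demand_pct:
        if wanted <= baseline_pct:
            # Running below the limit banks the unused headroom.
            credits = min(max_credits, credits + (baseline_pct - wanted))
            delivered.append(wanted)
        else:
            # Running above the limit burns banked credits, if any remain.
            burst = min(wanted - baseline_pct, credits)
            credits -= burst
            delivered.append(baseline_pct + burst)
    return delivered, credits

# Four quiet intervals followed by two intervals of heavy demand.
delivered, remaining = simulate_burst([5, 5, 5, 5, 100, 100])
print(delivered)   # the burst lasts only while credits remain
print(remaining)
```

The quiet intervals bank enough credit to exceed the limit for a while, after which the machine falls back to its baseline.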
Compute Optimized
There is only one series in the compute optimized type: the Fs_v2-Series (ACU 195-210), which is often just called the F-Series. As with the Ford pickup truck, this is a good all-rounder that can be good at many tasks. Intel Xeon processors are used. Unlike the D_v3 machines, RAM is usually lower with a more common 1 vCPU to 2 GB RAM ratio. For example, the F2s_v2 has 2 vCPUs and 4 GB RAM.
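The RAM-to-vCPU ratios quoted for the D_v3 and Fs_v2 series suggest a quick first filter. The sketch below uses only those two ratios from this post; the function name and the "smallest ratio that still fits" rule are my own assumptions, not an official sizing method.

```python
# GB of RAM per vCPU, as quoted in this post.
RAM_PER_VCPU_GB = {"D_v3": 4, "Fs_v2": 2}

def closest_series(vcpus: int, ram_gb: float):
    """Pick the series with the smallest RAM ratio that still covers the need.

    Returns None when neither ratio is high enough, which hints that a
    memory optimized series is worth a look instead.
    """
    needed_ratio = ram_gb / vcpus
    candidates = [
        (ratio, series)
        for series, ratio in RAM_PER_VCPU_GB.items()
        if ratio >= needed_ratio
    ]
    if not candidates:
        return None
    return min(candidates)[1]

print(closest_series(2, 4))    # fits the 1:2 ratio
print(closest_series(4, 16))   # needs the 1:4 ratio
print(closest_series(4, 64))   # more RAM per vCPU than either offers
```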
Memory Optimized
These are machines that offer more memory than usual. Some of the series include especially huge machines.
- E_v3 and Es_v3 (ACU 160-190): E is for extra RAM. The E-Series machines are essentially D-Series machines with extra RAM.
- G and Gs (ACU 180-240): Just like Goliath, these machines were once the biggest giants around … in the cloud. With up to 32 vCPUs and 448 GB RAM, these machines ruled. But there’s always a bigger monster.
- M (ACU 160-180): If you want massive monsters then the M-Series is for you. Armed with up to 128 vCPUs and 3.8 TB of RAM, these are BIG machines – note that some new sizes with nearly 12 TB RAM are coming!
There is a sub-category for memory optimized machines called Constrained vCPUs. Sometimes a customer chooses Memory Optimized machines purely because of the large amounts of RAM provided. These machines typically also have large numbers of processors, and that can have an impact on per-processor software licensing, such as SQL Server. The constrained vCPU machines reduce the per-processor costs to either one half or one quarter of the original amount.
For example, an M16-8ms machine is an M16 virtual machine that normally has 16 vCPUs but now has 8 available to it, but with the full amount of RAM and other performance/scalability potential.
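The constrained naming can be unpacked with a short sketch. It assumes the name follows the pattern shown in the example (family, original vCPU count, a hyphen, the active vCPU count, then feature letters). The licensing fraction is plain arithmetic and is not a statement about any vendor's actual licensing terms.

```python
import re

# Assumed layout: family letter(s), original vCPUs, "-", active vCPUs,
# then lowercase feature letters and an optional version suffix.
CONSTRAINED_PATTERN = re.compile(
    r"^(?:Standard_)?(?P<family>[A-Z]+)(?P<base>\d+)-(?P<active>\d+)"
    r"(?P<codes>[a-z]*)(?:_v\d+)?$"
)

def constrained_vcpus(name: str) -> tuple:
    """Return (original vCPUs, active vCPUs) for a constrained size name."""
    match = CONSTRAINED_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a constrained vCPU size name: {name!r}")
    return int(match.group("base")), int(match.group("active"))

def licensed_fraction(name: str) -> float:
    """Share of the original per-processor licensing that still applies."""
    base, active = constrained_vcpus(name)
    return active / base

print(constrained_vcpus("M16-8ms"))   # (16, 8)
print(licensed_fraction("M16-8ms"))   # 0.5
```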
Storage Optimized
There is one series of machines in the storage optimized category: the Ls-Series. These machines are special because they are optimized to use the temp drive flash capacity of the host instead of durable data disks on a storage cluster. The results of using host-local flash storage are lower latency and higher IOPS.
The Ls-Series (ACU 180-240) is based on the same host hardware (Intel Xeon) as the G-Series. However, a new Ls_v2, which has been in preview for quite a while, is based on an EPYC 7551 processor from AMD.
GPU
There are three series of machines in the GPU category, all powered by NVIDIA chipsets. These aren’t the $200 gaming cards you can pick up in a store, but the sorts of cards that cost tens of thousands of dollars and require special licensing from NVIDIA, which Microsoft covers.
Note that ACU isn’t listed for these machines because it’s the GPU that is more important here.
- NV: These NVIDIA Tesla M60 machines are intended for desktop or application virtualization. For example, maybe you want to run Citrix/RDS desktops with a CAD application. Premium storage and caching are not supported.
- NC_v3: The NVIDIA Tesla V100 GPU offers compute horsepower for workloads such as massive calculations or simulations. Premium storage and caching are supported.
- ND: The NVIDIA Tesla P40 is used for deep learning (AI to the rest of us). These machines support premium storage and caching.
Please note that you must deploy the Azure GPU extension into the guest OS to install the necessary drivers.
High Performance Compute
HPC workloads can be deployed with the H-Series machines, some of which offer additional RDMA virtual NICs for low latency and high throughput data transfers with little CPU impact. CPU performance is high, with the ACU ranging from 290 to 300. Note that premium storage and caching are not available.
Boiling it Down
The majority of machines that I have seen are from the Bs-Series, the D-Series, and the A-Series.
It makes sense if you think about it; most workloads under-utilize the CPU so the “burstable” machines that offer much lower costs are perfect. The D-Series is great for database workloads and even RDS/Citrix session hosts. And the A-Series was the old entry point before the B-Series came along.
If I know a workload will demand little CPU, such as a domain controller, I am happy to recommend the Bs-Series. However, without any empirical performance data, I will suggest:
- A_v2: For low-end workloads that don’t require Premium Storage
- Ds_v3: For higher CPU and IOPS demands, and for the lower latency that can be achieved with premium storage.
On rare occasions, those recommendations will change to one of the other series. Once a virtual machine has been running for a while (maybe a month), analyze the performance metrics to see if it can be downsized to reduce costs or upgraded to increase performance.
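The rule of thumb above, plus the later right-sizing review, can be written down as a small sketch. The function names and the 20%/80% CPU thresholds are my own placeholders, not figures from Microsoft or from any real monitoring tool.

```python
from statistics import mean

def initial_series(known_low_cpu: bool, needs_premium_storage: bool) -> str:
    """Starting point when there is no empirical performance data."""
    if known_low_cpu:
        return "Bs"      # e.g. a domain controller
    if needs_premium_storage:
        return "Ds_v3"   # higher CPU and IOPS demands, lower latency
    return "A_v2"        # low-end workloads without Premium Storage

def resize_hint(cpu_percent_samples, low=20.0, high=80.0) -> str:
    """Suggest a direction after reviewing roughly a month of CPU metrics.

    The low/high thresholds are assumed values for illustration only.
    """
    average = mean(cpu_percent_samples)
    if average < low:
        return "downsize"
    if average > high:
        return "upgrade"
    return "keep"

print(initial_series(known_low_cpu=True, needs_premium_storage=False))
print(initial_series(known_low_cpu=False, needs_premium_storage=True))
print(resize_hint([5, 10, 15]))
```

In practice you would look at more than average CPU (memory, IOPS, and network throughput matter too), so treat this as a starting point only.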