Architecting Availability Zones for Azure VMs

In this post, I will discuss some architectural elements that you will use if you wish to deploy services across availability zones (in preview at the time of writing this article) within a single Azure region.
 

Higher Levels of Fault Tolerance

If you deploy a virtual machine solution in a valid availability zone design, then your deployment will qualify for a financially backed 99.99 percent SLA from Microsoft. Microsoft has created availability zones within a single region; the data centers in different availability zones do not share dependencies. For example, the Azure West Europe region is split into three availability zones. Each availability zone is one or more buildings that share redundant power, cooling, and so on. Zone 1 has no shared dependencies with Zone 2 or with Zone 3. This means that if a single zone has a local failure, the other two zones remain online.
What is a valid availability zone design? An example of a valid design is a web farm, made up of several virtual machines hosting the same web content, that is deployed across several availability zones (more in a moment) in a region.
An example of an invalid design is a domain controller in one availability zone and an RDS session host in another availability zone: each workload runs in only one zone, so the failure of that zone takes the workload offline.
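
The rule behind those two examples can be sketched in a few lines of Python. This is only an illustration: the inventory of (workload, zone) pairs, the function name, and the two-zone threshold are my own assumptions, not part of any Azure API or of the wording of Microsoft's SLA.

```python
from collections import defaultdict

def workloads_lacking_zone_redundancy(placements, min_zones=2):
    """Return the workloads whose instances sit in fewer than min_zones zones.

    placements is a hypothetical inventory: an iterable of
    (workload_name, zone_number) pairs, one pair per virtual machine.
    """
    zones_by_workload = defaultdict(set)
    for workload, zone in placements:
        zones_by_workload[workload].add(zone)
    return sorted(
        workload
        for workload, zones in zones_by_workload.items()
        if len(zones) < min_zones
    )

# The web farm from the valid example: same workload in three zones.
web_farm = [("web", 1), ("web", 2), ("web", 3)]
# The invalid example: two different workloads, one zone each.
mixed = [("domain-controller", 1), ("rds-session-host", 2)]

print(workloads_lacking_zone_redundancy(web_farm))  # []
print(workloads_lacking_zone_redundancy(mixed))     # ['domain-controller', 'rds-session-host']
```

An empty list means every workload has instances in at least two zones; anything returned is a workload that a single zone failure would take down.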

The Components of an Availability Zone

Today, the following Azure resource types are aware of and support availability zones:

  • Linux virtual machines
  • Windows virtual machines
  • Virtual machine scale sets (VMSS)
  • Managed disks
  • Standard tier load balancer
  • Standard tier public IP address
  • SQL Database

One can build a deployment from the above components and spread that deployment across availability zones.
Note that when you use availability zones with virtual machines, you do not use availability sets.

An Example

The illustration below depicts an availability zone design. A single virtual network will be created for the application. The virtual network spans the zones, and any traffic traveling between the zones will be charged at the normal VNet Peering rate within a region ($0.01/GB RRP).
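
To get a feel for what that charge means in practice, here is a rough back-of-the-envelope sketch. It assumes the $0.01/GB figure quoted above applies flatly to every gigabyte that crosses a zone boundary; the function name and the 500 GB traffic figure are made up for illustration, and you should check current pricing before relying on the number.

```python
RATE_PER_GB = 0.01  # USD per GB, the RRP quoted in this article (assumed flat)

def cross_zone_cost(gb_transferred, rate_per_gb=RATE_PER_GB):
    """Estimate the charge for traffic that travels between zones."""
    return gb_transferred * rate_per_gb

# Hypothetical month: 500 GB of traffic between zones.
monthly_gb = 500
print(f"${cross_zone_cost(monthly_gb):.2f}")  # $5.00
```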

A simple availability zone design [Image Credit: Aidan Finn]
 
Services will be shared to the public via two elements:

  • Standard tier public IP address
  • Standard tier load balancer

Both of the above are new tiers for old Azure resource types. The old Basic tier public IP address and load balancer (which are free) cannot be used with availability zones; the paid-for Standard tier public IP address and load balancer must be used instead. With these in place, even though your virtual machines will be spread across multiple availability zones, which could be across town from each other depending on the region, they will be presented to the Internet via a single IP address.
You can then deploy virtual machines as normal. For example, you might deploy a small web farm with three virtual machines:

  • VM01 in Zone 1
  • VM02 in Zone 2
  • VM03 in Zone 3
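
If the farm grows beyond three machines, the same placement can be expressed as a simple round-robin. The sketch below is plain Python, not an Azure SDK call; the helper name and the assumption that the region offers zones 1, 2, and 3 are mine.

```python
def assign_zones(vm_names, zones=(1, 2, 3)):
    """Spread VM names across the given zones in round-robin order."""
    return {name: zones[i % len(zones)] for i, name in enumerate(vm_names)}

print(assign_zones(["VM01", "VM02", "VM03"]))
# {'VM01': 1, 'VM02': 2, 'VM03': 3}
```

A fourth machine would wrap around to Zone 1, which keeps the zones as evenly loaded as possible.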

If you are in a larger organization with the reasons and means to take advantage of virtual machine scale sets (VMSS), then you have two possible strategies:

  • Create one VMSS per availability zone
  • Deploy a single zone-redundant VMSS

Disaster Recovery

Availability zones, which are multiple data centers in an Azure region, extend the concept of high availability that is provided by availability sets, which spread virtual machines across a single compute cluster in a single data center. But this is still about high availability and not disaster recovery (DR). If your business or customer requires DR for its Azure services, then you’ll need to replicate the workloads to another Azure region using (probably) a combination of:

An illustration of disaster recovery replication with availability zones [Image Credit: Aidan Finn]
 
Based on some text I have read, Microsoft expects that some customers will replicate services from an availability zone deployment in one region to a single zone deployment in another region. The concept here is that you’ve already taken a beating in the production site, so what are the odds of Murphy’s Law applying in the DR site too? I guess we’ll find out!

Summary

Availability zones might sound complex, but in reality they are pretty simple to deploy. You just need to realize that a virtual network spans your data centers (with the cost of VNet peering between the data centers, which I’m guessing is used under the covers) and that you need to use the Standard Load Balancer and Standard Public IP address.