Microsoft Azure Disaster Recovery Replication Methods
With Microsoft making Azure Site Recovery (ASR), also known as DR-as-a-Service or DR-in-the-cloud, a viable option for small-to-medium enterprises (SMEs) and branch offices as well as large data centers, we now have to start asking questions such as “What is the best way to replicate my machines and services to Azure?” This article asks that question and offers some answers.
Reality: There are multiple Azure DR replication solutions
It would be nice to say “just sit back and let ASR take care of everything” but the cold hard reality of IT is that every application or service has different requirements and support restrictions. There’s a very good chance that a small business might be able to replicate all of their Hyper-V virtual machines to Azure using ASR, but in reality, medium-to-large and complex organizations will require two or more replication methodologies. You might view this as a complexity, but I like to see it as offering more interesting design possibilities.
Microsoft Azure Site Recovery Replication
If you are replicating Hyper-V virtual machines to Azure then the replication method being used is Hyper-V Replica (HVR). Hyper-V Replica is a built-in, storage-agnostic method of asynchronously replicating individual virtual machines, with an SSL option. Whew! That sentence packs a lot of information into one small package.
HVR, and therefore ASR, does not care what your on-premises storage system is; it can be internal disk, DAS, SAN, SMB 3.0 or Storage Spaces. HVR is asynchronous; the benefit is that you can replicate over long distances. HVR allows you to select individual virtual machines to securely replicate to Azure over the Internet. The use of the word “individual” is a deliberate choice of mine. While this allows you to selectively replicate and keep DR costs down, it does mean that you cannot replicate n-tier virtual machines with cross-machine consistency, because each virtual machine is replicated independently and asynchronously. So in an unplanned failover, one virtual machine might boot up as it was 2 minutes ago, and another might boot up as it was 5 minutes ago. That might not matter for some workloads, but for services that span multiple machines it could cause serious problems.
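To make the consistency gap concrete, here is a minimal Python sketch that measures how far apart the recovery points of two tiers are after an unplanned failover. The machine names (`web01`, `sql01`), the failover timestamp, and the `recovery_point_skew` helper are all hypothetical; the 2-minute and 5-minute figures are the ones from the example above.

```python
from datetime import datetime, timedelta


def recovery_point_skew(recovery_points):
    """Return the gap between the newest and oldest recovery point."""
    times = list(recovery_points.values())
    return max(times) - min(times)


# Hypothetical moment of the unplanned failover.
failover_time = datetime(2015, 1, 1, 12, 0, 0)

# Each virtual machine comes up as it was at its own last replicated point.
tier_recovery_points = {
    "web01": failover_time - timedelta(minutes=2),
    "sql01": failover_time - timedelta(minutes=5),
}

skew = recovery_point_skew(tier_recovery_points)
print(skew)  # 0:03:00
```

A skew of three minutes means the web tier may reference data that the database tier has not yet seen, which is the kind of inconsistency a multi-tier service has to be able to tolerate.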
The reality is that ASR will be the perfect way to replicate the majority of machines to Azure. But there are going to be exceptions.
Dealing with Domain Controllers during replication
One example of an application that we make huge efforts to keep consistent is Active Directory (AD). What happens if all or some domain controllers (DCs) come online in a secondary site and restore to different points in time? Will there be replication issues? Will previously deleted items return? Will new passwords be reset to old ones? Do we need to be battling inconsistent domains and dealing with password-reset requests when we invoke the business continuity plan after a disaster?
Microsoft has published very specific guidance on how to treat domain controllers when you are replicating one site to another. I strongly recommend that you read that guidance, but I have distilled it into three scenarios:
- SME with 1 domain controller: Replicate this machine to Azure using ASR
- Multiple DCs: See the following hybrid design
- Partial site failover: See the following hybrid design
When you deploy an ASR vault, you must create a storage account and a virtual network, the very things you need to deploy virtual machines. Microsoft’s guidance for complex DR designs is that you deploy a hybrid network with virtual DCs in Azure that are members of your on-premises domain. Yes; you will create a site-to-site VPN or ExpressRoute connection to Azure and add the ASR virtual network to your “WAN”. The DCs in Azure will then replicate with the DCs in your primary site. This means that your AD will be present and in a consistent state should your primary site fail. If it’s an unplanned disaster (like a fire) where the primary site is permanently lost, then you can seize the FSMO roles to the Azure DCs until a new primary site can be established.
An upside of deploying these DCs in Azure is that you can also run other systems in Azure. Have you thought about how people will access your services after a failover? Will you be using RDS or Citrix? If so, those systems will need to exist in Azure permanently (perhaps powered on and off using Azure Automation for maintenance and as part of the Recovery Plan), and they will need a local authentication/authorization system. Your in-Azure DCs will help there.
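As a rough summary of the domain controller guidance in this section, the three scenarios can be sketched as a small Python function. The function name, its parameters, and the returned strings are hypothetical labels for illustration only; they are not part of any ASR tooling.

```python
def dc_replication_approach(dc_count, partial_site_failover=False):
    """Map the three scenarios from this section to a replication approach."""
    if dc_count < 1:
        raise ValueError("expected at least one domain controller")
    # SME with a single DC and a full-site failover: ASR alone is enough.
    if dc_count == 1 and not partial_site_failover:
        return "replicate the DC with ASR"
    # Multiple DCs, or a partial site failover: use the hybrid design.
    return "hybrid design: DCs in Azure joined to the on-premises domain"


print(dc_replication_approach(1))
print(dc_replication_approach(3))
print(dc_replication_approach(1, partial_site_failover=True))
```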
About Application Replication
The reason we have servers is to run services. Some of those services will be native features of Windows Server. The great news is that all services of Windows Server support being replicated by HVR (with the previous asterisk on domain controller consistency). But what about SQL Server, Exchange Server, Oracle software, and those lovely line-of-business (LOB) applications written by Honest Bob way back in 1999? Do any of those support being replicated to Azure? Just like with any project, there must be a discovery phase followed by an analysis phase. During discovery, you’ll find out what is there. Is the guest operating system supported by ASR? And importantly, what kind of replication does the application vendor support?
For example, the Exchange product group supports just one kind of replication: the one that they wrote inside of Exchange. If you still have on-premises Exchange then you should replicate to Azure by deploying Exchange in Azure virtual machines (check your Exchange licensing!) and using the native replication methods of Exchange.
SQL Server does support replication via Hyper-V Replica if you set a value called EnableWriteOrderPreservationAcrossDisks to 1. However, you must use the native replication of SQL Server instead of ASR if you use any of the following features:
- Availability Groups
- Database mirroring
- Failover Cluster instances
- Log shipping
That means deploying a SQL Server virtual machine in Azure and then configuring site-to-site replication via a VPN or ExpressRoute connection, just as you might with Exchange Server.
Also bear in mind that if you use any of the following kinds of storage (instead of virtual IDE/SCSI connected, non-shared VHD/X) in a Hyper-V virtual machine then you will need to use application or guest OS replication instead of ASR to replicate to virtual machine(s) in Azure:
- Shared VHDX
- Remote volumes or LUNs, such as those connected by Fibre Channel or iSCSI
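The rules in this section can be collected into a small decision helper for use during the analysis phase. The following Python sketch is illustrative only: the `VmProfile` class, the lowercase labels, and the returned strings are hypothetical, and the rules simply encode the Exchange, SQL Server, and storage restrictions described above.

```python
from dataclasses import dataclass, field

# SQL Server features that require SQL Server's native replication.
SQL_FEATURES_NEEDING_NATIVE = {
    "availability groups",
    "database mirroring",
    "failover cluster instances",
    "log shipping",
}

# Storage types that rule out ASR for the virtual machine.
STORAGE_NEEDING_GUEST_REPLICATION = {"shared vhdx", "fibre channel", "iscsi"}


@dataclass
class VmProfile:
    """Hypothetical discovery record for one virtual machine."""
    name: str
    application: str = "windows"  # e.g. "windows", "sql", "exchange"
    sql_features: set = field(default_factory=set)
    storage_types: set = field(default_factory=set)


def choose_replication(vm):
    """Return the replication method suggested by the rules in this article."""
    if vm.storage_types & STORAGE_NEEDING_GUEST_REPLICATION:
        return "application or guest OS replication"
    if vm.application == "exchange":
        return "native Exchange replication"
    if vm.application == "sql":
        if vm.sql_features & SQL_FEATURES_NEEDING_NATIVE:
            return "native SQL Server replication"
        return "ASR with EnableWriteOrderPreservationAcrossDisks set to 1"
    return "ASR"


print(choose_replication(VmProfile("file01")))
print(choose_replication(VmProfile("sql01", "sql", {"log shipping"})))
```

Checking storage first is a design choice in this sketch: a virtual machine with shared or remote storage cannot use ASR regardless of which application it runs.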
Final ASR replication thoughts
ASR is going to meet the requirements of replicating most virtual machines to Azure. But there are going to be some exceptions where you will need to deploy virtual machines in Azure and then use application-layer replication. Approaching any opportunity with a structured plan, complying with application support requirements, and understanding that you’ll always need some method to connect to a failover site will make this easier and more understandable for decision makers.