Hyper-V Replica is designed to allow businesses to continue operations after disasters occur and, more impressively, to move services to the disaster recovery (DR) site before a forecasted event happens. No business continuity plan (BCP) can be considered reliable unless it is tested. Hyper-V Replica provides the ability to test the IT side of the BCP by performing a test failover to the secondary (or DR) site without affecting services that are provided in the primary (or production) site. Today I will explain the three types of failover in Hyper-V Replica and when to use them.
Some disasters can be predicted. Sometimes the weather forecast gives us a few hours' notice, and sometimes we even get a week to prepare. Hurricane Sandy formed in the western Caribbean Sea on October 22, and didn’t reach the southeastern United States until October 25. The governors of North Carolina and Virginia declared a state of emergency on October 26. Other states in the northeast quickly followed as they realized this storm was going to be bad.
This is exactly the sort of situation in which you implement a Hyper-V Replica planned failover – when you know a disaster is coming and you want to avoid extended disruption to services, or worse, you want to avoid your business being terminated. Once a decision to evacuate is made, you initiate the process below, and within minutes you can get out of harm's way.
What happens during a planned failover:
Planned Failover in Hyper-V Replica.
I deliberately chose Hurricane Sandy as an example. There is a story of how two early adopters of Windows Server 2012 Hyper-V survived the disaster because of Hyper-V Replica.
An alternative use of a planned failover is what I like to think of as a “stretch Quick Migration.” In this situation, there is no disaster; you just want to move VMs to another location with minimal downtime. This could allow you to phase out a branch office computer room or even migrate VMs to a service provider’s public cloud or hosted private cloud.
Every IT person dreads a phone call at 3:00 AM because it’s never good news, unless you have an amazing overtime plan. Overtime! There’s no overtime in IT! Usually it’s because a server is down or an executive on the other side of the planet cannot log in. But it could be because the data center or a branch office has burned down or been flattened by an earthquake. The business needs to recover services immediately.
The correct Hyper-V Replica response here is to perform an unplanned failover. The clue is in the name: the disaster was unexpected. The requirement here is that the primary site VMs are no longer running, which avoids potential corruption. You perform the unplanned failover using the replica VMs in the secondary site. The VMs boot up from the last asynchronously replicated copy, with a maximum data loss (RPO) of five minutes.
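To make the RPO figure concrete, here is a minimal Python sketch of the arithmetic. It is not a Hyper-V API. The function names and the timestamps are hypothetical, and the five-minute target is simply the figure quoted above.

```python
from datetime import datetime, timedelta

# Assumption: the five-minute RPO quoted in the text.
RPO_TARGET = timedelta(minutes=5)


def data_loss_window(last_replication: datetime, failure_time: datetime) -> timedelta:
    """Return how much recent change is missing from the replica at failure time."""
    if failure_time < last_replication:
        raise ValueError("failure_time must not precede last_replication")
    return failure_time - last_replication


def within_rpo(last_replication: datetime, failure_time: datetime,
               rpo: timedelta = RPO_TARGET) -> bool:
    """True if the data lost by failing over now stays inside the RPO target."""
    return data_loss_window(last_replication, failure_time) <= rpo


# Hypothetical timestamps: last successful replication at 02:57, outage at 03:00.
last_sync = datetime(2024, 1, 1, 2, 57)
outage = datetime(2024, 1, 1, 3, 0)

print(data_loss_window(last_sync, outage))  # 0:03:00
print(within_rpo(last_sync, outage))        # True
```

In this example, the replica VM would start without the last three minutes of changes made in the primary site.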
Replication is not automatically reversed. It’s hard to reverse replication to an office that is now a pile of rubble or smoldering ashes. However, once a new production site is built, you can configure replication to the new host/cluster and perform a planned failover when you are ready.
Unplanned Failover in Hyper-V Replica.
The people, processes, and technology of a BCP must be tested regularly, but you cannot bring down production systems once a quarter, twice a year, or even annually to do this – testing a BCP that might never be implemented could end up costing more than the disaster! Hyper-V Replica allows you to test the technology element of a BCP by performing a test failover on the replica VMs in the secondary site:
This process runs only in the secondary site. Because the cloned test VM uses differencing disks linked to the replica virtual machine’s disks, it is created nearly instantly. There is no waiting to perform the test, and the space consumed by the test is minimized. You can perform the test without impacting either:
Test Failover in Hyper-V Replica.
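Why is the test VM created so quickly, and why does it use so little space? A disk that stores only the changes made relative to a parent disk (a differencing disk, in Hyper-V terminology) starts out empty. The following Python sketch is a conceptual model only, not the VHD/VHDX format or any Hyper-V API, and every class name in it is hypothetical.

```python
class ReplicaDisk:
    """Read-only parent: stands in for the replica VM's virtual disk."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks.get(block_no, b"")


class DifferencingDisk:
    """Child disk that stores only the blocks written during the test."""

    def __init__(self, parent):
        self.parent = parent
        self.changed = {}  # starts empty, so creating it copies no data

    def read(self, block_no):
        # Prefer the test VM's own writes; otherwise fall through to the parent.
        if block_no in self.changed:
            return self.changed[block_no]
        return self.parent.read(block_no)

    def write(self, block_no, data):
        # Writes never touch the parent, so the replica stays intact.
        self.changed[block_no] = data


replica = ReplicaDisk({0: b"boot", 1: b"data-v1"})
test_disk = DifferencingDisk(replica)
test_disk.write(1, b"data-test")

print(test_disk.read(0))       # b'boot'
print(test_disk.read(1))       # b'data-test'
print(replica.read(1))         # b'data-v1'
print(len(test_disk.changed))  # 1
```

Only the one block written during the test consumes extra space, and the replica's own data is never modified.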
Another use of the test failover is to create an isolated clone of a production VM for alternative testing. Maybe an administrator wants to test the upgrade of an operating system or application as well as the rollback for a change control process? Or maybe a tester or developer wants to troubleshoot an issue in a production system with production data without impacting services? All this can be safely done in the sandbox that is created by a Hyper-V Replica test failover.
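To summarize when each of the three failover types applies, here is a small decision helper in Python. It is a sketch of the reasoning in this article rather than anything Hyper-V itself provides. The function, its parameters, and the enum are hypothetical names.

```python
from enum import Enum


class Failover(Enum):
    PLANNED = "planned failover"
    UNPLANNED = "unplanned failover"
    TEST = "test failover"


def choose_failover(primary_running: bool, move_production: bool) -> Failover:
    """Pick a failover type from two questions about the situation.

    primary_running: are the primary site VMs still running?
    move_production: do you want production services to move to the secondary site?
    """
    if not primary_running:
        # The primary site is already down: recover from the replica VMs.
        return Failover.UNPLANNED
    if move_production:
        # Forecasted disaster or deliberate migration while the primary is still up.
        return Failover.PLANNED
    # Primary keeps serving users; work happens in an isolated copy.
    return Failover.TEST


print(choose_failover(primary_running=True, move_production=True).value)    # planned failover
print(choose_failover(primary_running=False, move_production=True).value)   # unplanned failover
print(choose_failover(primary_running=True, move_production=False).value)   # test failover
```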