How to Avoid Disaster Recovery Failures


Experiencing a disaster is one of the worst possible events for a business and its IT department. However, there is one worse event: failing to recover from the disaster. While every IT department should have a disaster recovery (DR) plan, that doesn’t mean every DR plan will work. In the best case, a failure costs the business unexpected downtime and expensive delays while it attempts to recover. In the worst case, the business could go under. That’s not just hyperbole. According to a report from the Federal Emergency Management Agency (FEMA), 40% of businesses never reopen following a disaster. And of those that do, another 25% fail within one year. Disaster recovery failure is simply not something any business wants to experience. To avoid it, you should consider the following four points:

  1. Identify critical apps, data, and services to recover
  2. Include stakeholders at every level of the business
  3. Test regularly
  4. Update regularly

Let’s have a closer look at some of the reasons why DR plans can fail.

1. Improperly identifying critical disaster recovery components

The first part of an effective DR plan is identifying the assets behind your business-critical applications and the dependencies between them. Identifying the critical applications themselves is typically fairly straightforward. However, today’s applications can be very complex, with lots of moving parts and dependencies that are both critical and easy to overlook. These dependencies range from local programs and services, to local files and shares, to hard-coded name requirements, to external web services. All of them should be documented. If you miss any, your recovery efforts can be delayed or can fail outright. Power is also critical, so make sure you consider backup power supplies, uninterruptible power supplies (UPS), redundant power modules in servers, and how you will test your IT infrastructure’s response to a power cut.
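As a rough illustration of why documenting dependencies matters, the sketch below records a dependency inventory as a simple mapping and derives a recovery order from it. The asset names are hypothetical, and the approach (a topological sort using Python's standard graphlib module, available from Python 3.9) is just one possible way to model this, not a prescribed method.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: each asset maps to the assets it depends on.
DEPENDENCIES = {
    "order-portal": {"orders-db", "auth-service", "payment-gateway-api"},
    "orders-db": {"storage-array"},
    "auth-service": {"directory-server"},
    "directory-server": set(),
    "storage-array": set(),
    "payment-gateway-api": set(),  # external web service
}


def recovery_order(dependencies):
    """Return assets ordered so that each one comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def undocumented_dependencies(dependencies):
    """Return assets that are depended on but have no inventory entry."""
    referenced = set()
    for needed in dependencies.values():
        referenced.update(needed)
    return sorted(referenced - set(dependencies))


if __name__ == "__main__":
    print("Recover in this order:", recovery_order(DEPENDENCIES))
    print("Missing from inventory:", undocumented_dependencies(DEPENDENCIES))
```

An asset that appears only as someone else's dependency, with no entry of its own, is exactly the kind of overlooked component that can stall a recovery.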

2. Failure to include the right people

To create a successful DR plan, you need to have the right people involved at all levels of the business. You need management buy-in at the top to secure both the funding for the plan and the involvement of personnel in different parts of the business. While IT deploys and maintains critical applications, the applications themselves are typically managed and used by employees in various other departments. Key people from these departments need to be involved in creating the disaster recovery plan, in subsequent testing and, of course, in actually executing the plan.

3. Lack of testing

While we’re on the topic of testing, a bigger and much more common problem is the failure to actually test your disaster recovery plans. The reality is that testing DR plans is difficult and takes a lot of time and resources. As a result, tests are usually run infrequently, and when they are run, the testing process often hits problems of its own. That said, regular testing is the only reliable way to know that your disaster recovery plans are going to work when you need them. Several of today’s tier one DR products and Disaster Recovery-as-a-Service (DRaaS) offerings can perform full or component testing of your DR plans, which gives you much greater confidence that they will work when they are needed.
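To make the idea of a component test concrete, here is a minimal sketch of one such check: confirming that a restored copy of a directory matches the original, file by file, using SHA-256 checksums. The function names and the directory layout are assumptions for illustration only; commercial DR and DRaaS products provide their own testing features, which this does not represent.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return (relative path, problem) pairs for files that are missing
    from the restored copy or whose content differs from the source."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append((relative.as_posix(), "content differs"))
    return problems
```

An empty result means every source file was found in the restored copy with identical content. It says nothing about whether the application actually starts, so a check like this complements a full DR test rather than replacing it.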

4. Failure to update your disaster recovery plan

Unfortunately, creating a DR plan is not a one-time-and-you’re-finished type of endeavor. Today’s IT environment is constantly changing, and configuration drift is inevitable: applications, configurations and business processes change over time and diverge from the infrastructure that is replicated in the disaster recovery plan. The only way to accommodate these gradual changes is to periodically perform a detailed review and reevaluation of your DR plans. Adjusting your plans on a regular schedule keeps them in line with your current business processes and avoids the DR failure of restoring outdated infrastructure that no longer works.
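A periodic review can be partly automated by comparing what is running in production against what the DR plan has on record. The sketch below assumes both can be exported as flat key/value mappings; the keys and values shown in the usage example are hypothetical.

```python
MISSING = "<not present>"


def find_drift(production, dr_record):
    """Return settings whose values differ between production and the DR
    record, as {key: (production value, DR record value)}."""
    drift = {}
    for key in sorted(set(production) | set(dr_record)):
        live = production.get(key, MISSING)
        recorded = dr_record.get(key, MISSING)
        if live != recorded:
            drift[key] = (live, recorded)
    return drift


if __name__ == "__main__":
    # Hypothetical snapshots of the same application.
    production = {"app_version": "2.4", "db_host": "db02", "tls": "1.3"}
    dr_record = {"app_version": "2.1", "db_host": "db02"}
    for key, (live, recorded) in find_drift(production, dr_record).items():
        print(f"{key}: production={live}, DR plan={recorded}")
```

Each reported difference is a prompt for a human decision: either the DR plan needs updating, or production has changed in a way nobody intended.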

Following the 7Ps of DR planning

An old and fairly well-known adage from the British Army does a pretty good job of summing up the requirement for backup and recovery – proper planning and preparation prevents piss poor performance (7Ps). Having a strategy and an effective plan for carrying out your DR steps is certainly vital for a successful recovery. However, as we’ve seen, just having a plan isn’t really enough. Yes, you need a plan but you also have to prepare by regularly and repeatedly reviewing, updating, and practicing your DR processes. To go further, you can check out my separate article where I detail five essential disaster recovery test scenarios.
