Depending on what part of the country you’re in, coping with power outages and having backup power may or may not play a big part in your disaster recovery plans. While it’s often overlooked, power is the one thing that all computing activities require, whether they run on desktops, in datacenters, or in the cloud.
If the power goes out, there is no computing until it is returned. Let’s take a closer look at some of the different types and sources of power problems and then what you can do to help protect against them.
Power outages can come from a wide range of sources, and they can extend from the level of an individual server to an entire geographical region. Failures at the individual server level are fairly common. Their causes range from external factors, like a disruption of power, to internal factors, like a failed power supply.
Next, power outages at the site level are less common. They can be caused by the failure of a building’s power infrastructure or by an external disruption like an accident involving incoming power lines. One example that sticks out for me was my experience in a manufacturing plant based in southern California.
The area this plant was in was well known for periodic brownouts. To combat this problem and keep the plant’s manufacturing equipment running during these brownouts, the management purchased a high-powered generator. It was attached to the plant’s incoming electrical feed.
Understandably, management was primarily focused on keeping the production machines running. But they didn’t consider how this new power source might impact their computer equipment. As you might guess, the first time a brownout hit and the generator kicked on, the plant did keep running. However, the massive power surge blew out virtually all of the local computer equipment, even the surge protectors, creating a new kind of power outage where the site equipment needed repairing or replacing.
If nothing else, this underlines the importance of testing your DR procedures.
Geographical outages are rare, but they do happen. In 2020, the United States experienced a total of 1.33 billion hours of power outages. According to NPR, a grid reliability report says that in 2022 power outages are likely in parts of the Midwest, California, and Texas.
Currently California and Puerto Rico lead in power outages. Most power outages don’t last very long, with many ending seconds or minutes after they begin. However, some outages last much longer, occasionally multiple days.
Regional outages are most often caused by weather conditions, but they can also be the result of disasters like hurricanes, or even human error, like the instance on the east coast where a repair crew accidentally dug up one of the key power lines for the region.
Public cloud solutions allow businesses to access IT services without having to manage and operate the underlying infrastructure on-site. However, don’t think that moving to the cloud completely protects you from power disruptions, because cloud providers have been known to experience power outages as well.
All of the major cloud providers experienced outages during 2021. One way to find out if you’re impacted by a cloud outage is to check the Downdetector website. The cloud can protect you from individual server failures for virtual machines (VMs) that run in the cloud. But it does not offer the same protection from local site or regional failures.
While your cloud resources may be running along fine, if you do have a site or regional failure, then you cannot access those cloud resources unless you have a backup site or facility that is outside the affected area. It is possible for an entire cloud region to go down as well.
At the individual server level, most enterprise-level server units, like those from HP or Dell, offer built-in redundant power supplies. You would then want to couple these servers to an industrial or commercial-grade UPS system. That gives you a degree of server-level protection.
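To make the idea of redundant power supplies a little more concrete, here is a minimal sketch of the arithmetic behind it: a server stays up after a supply fails only if the remaining supplies can still carry the full load. The function name, parameters, and wattages are hypothetical and chosen purely for illustration; real sizing should follow the vendor's specifications.

```python
def survives_psu_failure(psu_count, psu_rating_w, load_w, failed=1):
    """Return True if the remaining power supplies can carry the load.

    All names and numbers here are illustrative assumptions, not vendor data.
    """
    if psu_count < 1 or psu_rating_w <= 0 or load_w < 0 or failed < 0:
        raise ValueError("counts and wattages must be non-negative, ratings positive")
    remaining = psu_count - failed
    # With no supplies left, the server is down regardless of load.
    return remaining > 0 and remaining * psu_rating_w >= load_w


# Two 800 W supplies feeding a 600 W server: one can fail safely.
print(survives_psu_failure(psu_count=2, psu_rating_w=800, load_w=600))
# The same supplies feeding a 1000 W server: losing one overloads the other.
print(survives_psu_failure(psu_count=2, psu_rating_w=800, load_w=1000))
```

The point of the check is that redundancy only helps if each supply is sized for the whole load, not half of it.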
For site-level protection, you can look at the power protection models used by many of today’s colocation facilities. Even if commercial utility power is unavailable, these types of facilities have redundancies and backup systems in place that can provide power for some time.
Colocation datacenters typically have backup generators with varying levels of redundancy, along with Uninterruptible Power Supplies (UPSs). The UPSs are used to bridge the gap between the time that the commercial power fails and the time the backup generators come online.
UPSs have a limited capacity, so they are only used for the computing equipment. That means the lights, cooling, and other mechanical devices experience a brief interruption until the generators kick in. The backup generators can run for several days with on-site reserve fuel. Colocation facilities typically run a planned failover test at least once a year to ensure that their backup power infrastructure works.
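Since the UPS only has to bridge the gap until the generators take over, a rough back-of-the-envelope estimate can tell you whether a given unit is up to the job. The sketch below is a simplification under stated assumptions: usable battery energy is taken as rated watt-hours times an assumed inverter efficiency, and the capacity, load, and generator start-up figures are hypothetical. Real batteries deliver less at high discharge rates and as they age, so treat the result as an upper bound.

```python
def ups_runtime_minutes(battery_wh, load_w, inverter_efficiency=0.9):
    """Rough UPS runtime estimate in minutes (simplified model, see assumptions)."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    if battery_wh < 0 or not 0 < inverter_efficiency <= 1:
        raise ValueError("battery_wh must be >= 0 and efficiency in (0, 1]")
    usable_wh = battery_wh * inverter_efficiency  # energy left after conversion losses
    return usable_wh / load_w * 60


def bridges_generator_start(battery_wh, load_w, generator_start_s, safety_factor=2.0):
    """True if the estimated runtime covers the generator start-up time with margin."""
    runtime_s = ups_runtime_minutes(battery_wh, load_w) * 60
    return runtime_s >= generator_start_s * safety_factor


# Hypothetical figures: a 1000 Wh UPS carrying 500 W of computing equipment,
# with a generator assumed to need 60 seconds to come online.
print(round(ups_runtime_minutes(1000, 500), 1), "minutes")
print(bridges_generator_start(1000, 500, generator_start_s=60))
```

The safety factor is there because a generator that fails to start on the first attempt is exactly the situation the UPS needs to survive.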
At the cloud level, your best protection against regional outages is to use cross-region implementations. Extending your cloud implementations with cross-region failover is typically an added expense. But if successfully implemented, it can give your cloud apps protection from regional power or network failures.
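The core logic of cross-region failover can be sketched in a few lines: keep a priority-ordered list of regions and route to the first one that passes a health check. This is only an illustration of the idea, not any cloud provider's API. The region names and the health-check function are hypothetical stand-ins; in practice this is usually handled by the provider's DNS or load-balancing services.

```python
from typing import Callable, Optional, Sequence


def pick_active_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A health check that raises is treated the same as an unhealthy region.
            continue
    return None


# Hypothetical example: the primary region has lost power.
down = {"primary-west"}
active = pick_active_region(
    ["primary-west", "secondary-east"],
    is_healthy=lambda region: region not in down,
)
print(active)
```

Passing the health check in as a function keeps the sketch independent of how health is actually measured, whether that is an HTTP probe, a heartbeat, or a provider status feed.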
In case you’re interested, you can track the number of regional power outages at poweroutage.us.