Microsoft Details How and Why Its Data Center Went Offline This Month
When you place your data in the ‘cloud’, you are always promised one thing: increased availability. The idea is that companies like Microsoft, Amazon, and Google can build out redundancy at a scale that most companies cannot afford.
But even the best data centers in the world will go offline, which is why we always preach that you should be prepared for an outage: it’s not a matter of if, but when. A couple of weeks back, one of Microsoft’s data centers went offline, and now we have the full triage report of what happened.
Thanks to the company’s transparency, you can read exactly what happened in that report. The short version is that an electrical storm, through repeated strikes, tripped all of the protections in place to prevent such a failure. Specifically, the cooling system inside the data center failed, and as temperatures quickly rose above safe levels, automated shutdown procedures started running to protect the hardware inside the facility.
The temperature rose so quickly that some hardware was damaged by the heat before the shutdown procedures could complete. This is why some users experienced an extended outage while Microsoft recovered and migrated data.
This outage was not directly Microsoft’s fault: despite the company’s best efforts to prepare for an electrical storm, its protections failed to isolate the data center. It is a good lesson that building out a data center is not an easy task, and that despite our best knowledge about how to avoid disaster at this scale, we still have a lot to learn.