Microsoft Azure AD Outage Highlights Upcoming SLA Updates

If you had trouble yesterday accessing many of Microsoft’s services, you were not alone. For several hours, late into the afternoon on the East Coast, Teams, Azure AD, and many other services were inaccessible.

While outages are infrequent, they do happen with Microsoft 365, and each time one occurs, the company posts an analysis of the root cause. In this instance, it was the rotation of security keys that sparked the fire that took down the services.

The short version is that Microsoft, on a scheduled basis, rotates the keys Azure AD uses for cryptographic signing operations in support of OpenID and other identity standards. Because of a “complex cross-cloud migration”, one such key was marked ‘retain’, which means that it should not be pulled out of operation.

You can probably see where this is going: that key was not retained. It was pulled from operation, which left many services unable to authenticate correctly and took them down. The outage was caused by a bug in the functionality meant to keep a single security key in rotation longer, not by any outside threat.
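To make the failure mode concrete, here is a minimal sketch of a rotation pass that is supposed to skip keys flagged ‘retain’. It is purely illustrative and not Microsoft’s implementation: the `SigningKey` class, the `keys_to_remove` function, the `honor_retain` switch, and the 30-day lifetime are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SigningKey:
    key_id: str
    age_days: int
    # Hypothetical flag: keep this key in operation past its normal lifetime.
    retain: bool = False


def keys_to_remove(keys, max_age_days, honor_retain=True):
    """Return the ids of keys a rotation pass would pull from operation.

    With honor_retain=False the function models a bug in which the
    'retain' marker is ignored and the key is removed anyway.
    """
    removed = []
    for key in keys:
        expired = key.age_days > max_age_days
        protected = honor_retain and key.retain
        if expired and not protected:
            removed.append(key.key_id)
    return removed


keys = [
    SigningKey("key-a", age_days=10),
    SigningKey("key-b", age_days=45, retain=True),  # should stay published
    SigningKey("key-c", age_days=60),
]

intended = keys_to_remove(keys, max_age_days=30)
buggy = keys_to_remove(keys, max_age_days=30, honor_retain=False)

print("intended removal:", intended)
print("buggy removal:   ", buggy)
```

In the buggy pass, `key-b` is removed even though it was marked for retention, so anything still relying on that key to validate signatures would start failing, which matches the kind of authentication failure described above.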

Another point worth noting is that a similar incident occurred back in September, after which the company committed to improving the protections around Azure AD services and, more specifically, the backend, to prevent issues like this from happening. Those enhancements have not finished rolling out; had they been in place, they could have prevented this outage. Look for the complete rollout to be finished by mid-2021.

The biggest issue for Microsoft is that it has SLAs it must meet. Starting April 1st, the company will raise the public SLA to 99.99%, and more than likely, this outage would have breached that commitment. Of course, jumping through the hoops required to receive credits for downtime can be complex, which is a barrier to holding the company to its SLA.
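To put the 99.99% figure in perspective, the rough arithmetic below shows how little downtime such a target allows. It is a sketch under stated assumptions: a 30-day month, downtime counted as whole-service minutes, and a 180-minute outage standing in for “several hours”. Microsoft’s actual SLA formula and credit terms may be calculated differently.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Downtime budget, in minutes, for a given uptime target over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


def achieved_uptime_percent(downtime_minutes, days=30):
    """Uptime percentage actually achieved given total downtime in minutes."""
    total_minutes = days * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes


budget_9999 = allowed_downtime_minutes(99.99)  # about 4.32 minutes per 30 days
budget_999 = allowed_downtime_minutes(99.9)    # about 43.2 minutes per 30 days

# Assumption for illustration only: a single 180-minute outage in the month.
outage_minutes = 180
uptime = achieved_uptime_percent(outage_minutes)  # about 99.58%

print(f"99.99% budget: {budget_9999:.2f} min")
print(f"99.9% budget:  {budget_999:.2f} min")
print(f"uptime after a {outage_minutes}-minute outage: {uptime:.2f}%")
```

Under these assumptions, an outage lasting hours exceeds a 99.99% monthly budget many times over, which is why the timing of the SLA change matters.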

Knowing the above, I’ll be curious to see whether Microsoft postpones the SLA update until the rollout of the new backend updates is complete or sticks to its current plans.


Comments (1)

One response to “Microsoft Azure AD Outage Highlights Upcoming SLA Updates”

  1. bluvg

    The big question that comes up after these issues is simply: why not the weekend??

Brad Sams has more than a decade of writing and publishing experience under his belt, including helping to establish new and seasoned publications. From breaking news about upcoming Microsoft products to telling the story of how a billion-dollar brand was birthed in his book, Beneath a Surface, Brad is a well-rounded journalist who has established himself as a trusted name in the industry.
