Office 365 Suffers Temporary Scaling Problems
Features Adjusted to Ease Demand
A spike in demand on Office 365 services, probably linked to the upswing in home working provoked by the COVID-19 virus pandemic, has forced Microsoft to throttle back some application features. The news was released to tenants in an update posted to the Office 365 message center yesterday (Figure 1).
A Rash of Teams Issues
The features called out in the notification point to performance issues in Teams and Skype for Business Online messaging and video calls. Microsoft isn’t confirming the truth of this suspicion, but the feeling is supported by a rash of problems experienced by Teams users over the past few days. Common symptoms include:
- Not being able to connect to online meetings, or needing several attempts before connecting successfully.
- Not being able to send or receive messages or schedule meetings in the Teams client.
- Lack of responsiveness in the Teams admin center.
- Teams Live Events losing the ability to “go live” (broadcast).
Because the Teams service is distributed across multiple Office 365 datacenter regions around the world, the experience of users varied. European users reported a two-hour loss of service in the morning while people in other regions kept on working. However, problems resurfaced when U.S.-based workers came online. As the day progressed, Microsoft brought the situation under control, possibly by introducing the measures to reduce the impact of “non-essential capabilities.” Figure 2 shows that the incident peaked around 2:40 PM UTC and declined thereafter (data from downdetector.com).
I didn’t notice many issues because I was connected for most of the day to Microsoft’s own tenant for the online MVP summit. This event worked remarkably well across a mixture of live events and online meetings. I expect that the Teams engineering group was on full alert to make sure that things went smoothly. In passing, while we’re all working from home, you can help performance by turning off video during Teams meetings. If you insist on having video on, remember to blur your background.
SharePoint and OneDrive Problems Too
However, I have noticed that other Office 365 applications have become very sluggish over the last week. SharePoint Online and OneDrive for Business are both slow to list items in document libraries, and Microsoft has posted advisories to warn users that search results are stale because background indexing isn’t happening as it should. Unless they accessed the SharePoint and OneDrive for Business browser interfaces, end users might have been unaffected by the slowdown because the OneDrive sync client hid the issue.
Teams Scaling Fast
To be fair to Microsoft, Teams is a hugely successful service that has been asked to take on massive new load over the recent past, particularly as the education sector scaled up for online classes. Taken together with the demand for more online meetings and audio calls, you can see how Microsoft might have been surprised by the surge in demand. Although they’ve been building out the Teams service in line with its rising numbers of users, I imagine that a pandemic wasn’t part of their modelling process.
I expect the number of Teams daily active users to show a huge jump from the November 2019 figure of 20 million the next time Microsoft reports. It’s a huge logistical task to commission the global infrastructure to support this kind of growth, and it hasn’t been helped by recent events. A return to normality would be nice.