On May 14, Microsoft announced that discovery and review capabilities for labeled data and sensitive data types were generally available in the Microsoft 365 Compliance Center. Microsoft calls this “know your data,” part of its Information Protection and Governance framework. The idea is that if you understand your data, you can better protect what’s important. Or in marketing terms, “The first step in the journey to protect and govern your data is getting a holistic understanding of the sensitive data in your digital estate.” Not knowing that my tenant is a digital estate, I prefer my definition.
IT systems tend to mature over time. The initial implementation is often rudimentary and requires a lot of manual processing before automation and insight are introduced. In the case of Microsoft 365 compliance, the journey started for Office 365 tenants about five years ago. In that time, we’ve seen major components become available to help companies manage important data stored in Office 365, including:
Some aspects of these components need Office 365 E5 or Microsoft 365 E5 Compliance licenses, but the basics of retention and sensitivity labels, DLP, and the audit log can be used with Office 365 E3.
Tenants that have implemented some or all of these technologies in the last few years probably have a lot of labeled material. Perhaps that material is all labeled perfectly, but it’s more likely that some information was overlooked, mislabeled, or not considered in the original design. Apart from analyzing the application of labels through events in the Office 365 audit log (a messy process) or the basic Label Activity Explorer, up to now there hasn’t been a way to get a good overview of how a company’s data governance is working.
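To give a sense of why audit log analysis is messy, here is a minimal Python sketch of tallying label events from an exported set of audit records. It assumes the export is a CSV with `CreationDate`, `UserIds`, `Operations`, and `AuditData` columns, where `AuditData` holds a JSON payload with a `Workload` property. The sample rows, the operation names, and the `count_label_events` helper are all hypothetical and only illustrate the approach; check the column and operation names against a real export from your own tenant.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample of an exported audit log. The column names and
# operation names are assumptions made for illustration only.
SAMPLE_EXPORT = '''CreationDate,UserIds,Operations,AuditData
2020-05-01T09:00:00,kim@example.com,SensitivityLabelApplied,"{""Workload"": ""SharePoint""}"
2020-05-01T09:05:00,lee@example.com,SensitivityLabelApplied,"{""Workload"": ""SharePoint""}"
2020-05-01T09:10:00,kim@example.com,FileAccessed,"{""Workload"": ""SharePoint""}"
2020-05-02T11:00:00,lee@example.com,SensitivityLabelApplied,"{""Workload"": ""Exchange""}"
2020-05-02T11:30:00,kim@example.com,SensitivityLabelRemoved,"{""Workload"": ""SharePoint""}"
'''


def count_label_events(csv_text, keyword="Label"):
    """Count audit records whose operation name contains `keyword`,
    grouped by (operation, workload)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        operation = row["Operations"]
        # Skip events that have nothing to do with labels.
        if keyword.lower() not in operation.lower():
            continue
        # The interesting details sit in a JSON payload in one column.
        details = json.loads(row["AuditData"])
        workload = details.get("Workload", "Unknown")
        counts[(operation, workload)] += 1
    return counts


label_counts = count_label_events(SAMPLE_EXPORT)
for (operation, workload), total in sorted(label_counts.items()):
    print(f"{operation:<28} {workload:<12} {total}")
```

Even in this toy form, you have to filter operations by name, unpack a JSON payload for every record, and decide how to group the results yourself, which is the kind of manual work a purpose-built overview is meant to remove.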
The data classification dashboard in the Microsoft 365 compliance center gives some useful statistics and insights to help compliance administrators figure out where things are working and where some tweaks are needed. Figure 1 shows the data from my (small) tenant. As always, the larger the tenant, the more data you have and the more useful these kinds of features are.
The sections of the dashboard are:
The overview is available with an Office 365 E3 license. Office 365 E5 or Microsoft 365 E5 Compliance licenses are needed for content explorer, activity explorer, and trainable classifiers.
The value of the content explorer is that it exposes the usefulness and accuracy of labeling within a tenant. Clearly, there’s no point in defining sets of retention and sensitivity labels if they are not used. And when labels are used, you’d like to know that they are being used correctly to mark documents and email to be kept, removed, or protected. Few would doubt the value of a tool that helps compliance administrators improve the effectiveness of data governance.
What some might choke on is that to improve label effectiveness, compliance administrators can view email and documents in the source locations if their account is assigned the right permissions. To use the content explorer, administrators need these permissions:
For example, in Figure 2 the sensitivity labels defined in the tenant are shown in the left-hand pane. The Confidential label is selected, and we’ve selected SharePoint Online as the location, so the content explorer shows the sites where the Confidential label is used.
Selecting a site reveals the set of documents with the assigned label. If your account has the Content Explorer Content Viewer permission, you can then view the document source (Figure 3).
Interestingly, even though SharePoint Online support for sensitivity labels has been generally available since March 2020, the source view doesn’t work for documents assigned labels with protection. Apparently, this is by design to stop very sensitive documents being perused by people who shouldn’t be looking at them.
For compliance administrators, content explorer is a great step forward. Being able to open and examine the source of a document or email assigned a retention or sensitivity label or one marked as containing a sensitive data type is an excellent way to confirm the accuracy of user or automatic labeling.
However, some will be nervous when they read that compliance administrators can access information like this, including protected content. This ignores the simple fact that similar access is already available through content searches or eDiscovery cases. Nonetheless, people do worry about access to private information, so comprehensive oversight is needed before assigning anyone the content explorer permissions.