Data Loss Prevention (DLP) is a technology designed to protect against the misuse or accidental disclosure of sensitive data contained in electronic files such as email and documents. The classic form of sensitive data is Personally Identifiable Information (PII) such as social security numbers, tax identification numbers, passport numbers, and driving license numbers. Credit and debit card numbers are another form of sensitive data, but the definition of what is sensitive differs across countries, industries, and even individual companies. One person’s sensitive data is another person’s rubbish.
Microsoft laid down the basic principles for DLP when it designed the implementation for Exchange 2013. DLP policies specify the kind of sensitive data that needs protection and rules that govern what happens upon the detection of sensitive data. The methods used to detect sensitive data in content use a mixture of algorithms, context, and confidence. For instance, DLP can validate a 16-digit credit card number using Luhn’s algorithm and confirm that it is a credit card number through the existence of other evidence such as a keyword (like “MasterCard”) and an expiry date.
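To make the mix of algorithm, context, and confidence more concrete, here is a minimal Python sketch of the idea. It is not how Microsoft's classification engine is written. The regular expressions, the keyword list, and the "high"/"low" confidence labels are all assumptions chosen for illustration; only the Luhn checksum itself is the standard algorithm.

```python
import re

# Assumption: a candidate is 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Assumption: illustrative keywords only; a real sensitive data type lists many more.
KEYWORDS = ("mastercard", "visa", "credit card")
# Assumption: expiry dates look like MM/YY or MM/YYYY.
EXPIRY_PATTERN = re.compile(r"\b(0[1-9]|1[0-2])/(\d{2}|\d{4})\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[tuple[str, str]]:
    """Return (number, confidence) pairs for Luhn-valid candidates in `text`."""
    lowered = text.lower()
    has_keyword = any(keyword in lowered for keyword in KEYWORDS)
    has_expiry = EXPIRY_PATTERN.search(text) is not None
    results = []
    for match in CARD_PATTERN.finditer(text):
        candidate = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(candidate):
            continue  # fails the checksum, so it is not treated as a card number
        # Corroborating evidence raises confidence; its absence lowers it.
        confidence = "high" if (has_keyword and has_expiry) else "low"
        results.append((candidate, confidence))
    return results


if __name__ == "__main__":
    print(find_card_numbers("MasterCard 5555 5555 5555 4444 exp 09/27"))
```

The sketch checks for supporting evidence anywhere in the text. A production engine would more likely look for it within a limited distance of the match, which is a refinement left out here to keep the example short.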
In addition, clients incorporate methods to help people understand how to deal with sensitive data within email and attachments. An analysis engine built into Outlook clients can detect sensitive data described by policy. Policy tips then signal potential violations to users, who can override the policy if permitted. Finally, pre-packaged templates allow customers to build rules for policies very quickly based on business need, such as the requirement for a company to satisfy the U.S. Patriot Act or to protect HIPAA data.
It was logical for Microsoft to start the DLP journey with Exchange as the transport service offers a single chokepoint through which all messages must pass. DLP checking uses a special form of Exchange transport rules (ETR) that look for DLP policy violations in messages and their attachments. The server pushes policy information to Outlook 2013 and Outlook 2016 clients as XML manifests to allow offline checking of messages. Although it has some offline capabilities, OWA can only perform DLP checking online. Add in the ability for companies to define their own DLP sensitive data types through a digital fingerprinting process and Exchange offers some nice DLP capabilities in Exchange 2013, Exchange 2016, and Exchange Online.
Things are different in the world of SharePoint. DLP was first implemented in SharePoint Online (including OneDrive for Business) followed by SharePoint 2016. No central place exists where data must pass before users can access it. Clients are different and interact with information in a different manner. The differences did not stop Microsoft delivering DLP for SharePoint and OneDrive. Instead, DLP checking for documents occurs in a manner optimized for the kind of processing that exists on the platform. DLP policies exist, much the same sensitive data types are supported, and templates are available to address the same kind of business needs. Thresholds for data occurrence define when rules fire and the rules contain the steps to handle violations. Along with browser access, Outlook’s companion products in the Office desktop suite (Word, PowerPoint, and Excel) provide the basic DLP clients for SharePoint. DLP policy tips are also supported in the OneDrive for Business mobile app. It’s all very familiar while different at the same time.
The core of DLP checking for SharePoint and OneDrive happens within the Search Indexing process. As users create or modify documents, the search engine eventually processes them to index their content so that it is available to applications like Delve and searchable by users. When documents are crawled, their content is checked against DLP policies. If sensitive data is found that violates a DLP policy, the actions defined in the rules belonging to the policy are executed, the document is marked as a problem, and any notifications defined in the policy occur, including the generation of an incident report that can be sent to a compliance officer.
The time required for SharePoint to detect a new violation depends on the load on SharePoint Online when users add or modify documents. The crawler might get to a new or amended document in a couple of minutes or it might take a few hours. On average, you can expect problems to be picked up in about 15 minutes. Older documents that contain problematic content will not be detected until they are updated or a complete crawl and reindexing of the site where the documents are stored is initiated. This will not be a fast process.
Unified DLP extends the original framework established for SharePoint Online to support Exchange data. Tenants can create and manage unified DLP policies through the Security and Compliance Center. Unified DLP policies cover both Exchange and SharePoint and remove the need to maintain two separate sets of policies. However, it’s important to recognize that unified DLP policies currently offer only limited capabilities to process Exchange data when compared to ETR-based policies.
We are in a period when two forms of DLP policies exist within Office 365. The workload-specific versions will continue as is because the on-premises servers need DLP capabilities. Over time, Microsoft intends to close the functionality gap and make Unified DLP policies as capable as their workload-specific counterparts. However, Microsoft cannot commit to when it will be possible to migrate away from ETR-based policies to achieve a single set of policies that process all forms of Office 365 data.
Access to Unified DLP policies is through the Threat Management section of the Security and Compliance Center. The differences that exist between the older SharePoint-only policies and unified DLP policies are:
Creating a unified DLP policy takes much the same steps as creating a SharePoint-only DLP policy. First, you decide whether to create a custom policy, which has no predefined settings, or to use one of the templates supplied by Microsoft. It is usually more efficient and easier to create policies from templates until you become very familiar with DLP techniques, so we’ll use the “U.S. Financial Data” template (Figure 1) for this discussion.
The next step is to decide what Office 365 services to scan for DLP policy violations. As shown in Figure 2, you can choose Exchange Online, SharePoint Online, and OneDrive for Business. If required, you can nominate specific SharePoint and OneDrive sites for protection, which might be the case if concern exists about the documents belonging to a specific project. DLP policies support the document libraries used by Office 365 Groups.
When you create a new DLP policy from a template, the policy inherits several rules from the template. In the case of the “U.S. Financial Data” template, the policy imports two rules (Figure 3), both of which scan for situations where users share sensitive data such as credit cards outside the tenant. Having two rules allows for different actions when items include a low count of sensitive data and when they have a higher count.
You can accept the rules as imported from the template or customize them to adjust how they work. For example, DLP invokes the “high count” rule when 10 or more occurrences of the specified sensitive data types occur in content shared with people outside the tenant. You might like to reduce the number because it seems more sensible to flag a potential violation if a user includes more than five credit card numbers in a document or message. You can also update a rule to add conditions, specify who will receive incident reports, and so on. To edit a rule, click its entry in the list.
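The way the two count-based rules divide the work, and what changes when you lower the threshold, can be pictured with a small Python sketch. This is only a model of the logic described above, not anything exposed by Office 365. The `Rule` class, the rule names, and the action strings are hypothetical, and the article does not say what actions the template rules actually take.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str                 # hypothetical rule name
    min_count: int            # fewest occurrences that trigger the rule
    max_count: Optional[int]  # None means no upper bound
    action: str               # hypothetical action label


# Assumption: the low count rule covers 1 to 9 occurrences, the high count rule 10 or more.
LOW = Rule("Low count", 1, 9, "notify user")
HIGH = Rule("High count", 10, None, "notify user and send incident report")


def select_rule(rules: list[Rule], count: int, shared_externally: bool) -> Optional[Rule]:
    """Return the rule that fires for `count` occurrences, or None."""
    if not shared_externally:
        return None  # both template rules only look at content shared outside the tenant
    for rule in rules:
        below_cap = rule.max_count is None or count <= rule.max_count
        if count >= rule.min_count and below_cap:
            return rule
    return None


def retune(low: Rule, high: Rule, high_threshold: int) -> list[Rule]:
    """Move the boundary between the two rules to `high_threshold`."""
    return [
        replace(low, max_count=high_threshold - 1),
        replace(high, min_count=high_threshold),
    ]


if __name__ == "__main__":
    default_rules = [LOW, HIGH]
    # "More than five" is read here as six or more.
    tuned_rules = retune(LOW, HIGH, 6)
    for rules in (default_rules, tuned_rules):
        fired = select_rule(rules, count=7, shared_externally=True)
        print(fired.name if fired else "no rule")
```

With the default boundary, a document holding seven card numbers falls under the low count rule. After retuning so that the high count rule starts at six, the same document triggers the high count rule instead.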
A rule imported from a template cannot exist in multiple policies. Therefore, if you attempt to import the same rule into multiple policies, you will not be able to save the rule and will see an error message when you attempt to save the policy. If you need to use the same rule in multiple policies, you must create the rule from scratch in the second policy and manually copy the actions, conditions, and exceptions over from the original rule.
Figure 4 shows an edit for the actions associated with a rule. In this case, we are editing the notification options that control how users (and administrators) learn about problems due to sensitive data violations in their content. The options also control whether a user can override a policy prompt by providing a business reason to explain why the user is dealing with sensitive data in such a manner.
You can enable a new DLP policy for testing or activate it to begin protecting content straightaway. In most cases, it is good to enable testing first to check that DLP picks up and signals expected violations to users. If things work properly, users who create problematic content will start to receive notifications of the type shown in Figure 5.
They will also see policy tips in the libraries where they store the problem items (Figure 6).
All the above should come as no surprise to anyone who has worked with DLP for SharePoint in the past. The only real difference is the integration of Exchange Online as a protectable service. Unified DLP policies are not as powerful as the Exchange variety. Today, unified DLP policies can detect the misuse of sensitive data and the transmission of that data outside the organization, but that is not quite the same as handling the wide range of conditions and exceptions supported by transport rules. The upside of using unified DLP policies is that you maintain a single set. The downside is that you lose some functionality. That might not be a big deal if you use relatively simple policies, but it will be if you explore more complex scenarios. The only way to be sure is to test.
The following factors should be considered when moving away from ETR-based policies to Unified DLP policies.
The current feature gap between the two DLP implementations means that some careful testing is necessary if you want to introduce unified DLP rules into a tenant where Exchange DLP rules are in active use.
Microsoft has already demonstrated that it is possible to create a unified capability that works better across multiple Office 365 workloads than the workload-specific variants. eDiscovery is a great example where the content searches executed through the Security and Compliance Center are faster and more functional than the equivalents found in Exchange Online and SharePoint Online. Microsoft encourages tenants to use Office 365 content searches and is taking the same line with unified DLP policies. Development, including a UI refresh that is now available to First Release tenants, will continue to build out the features and capabilities of unified DLP policies. Eventually these policies will be able to replace the workload-specific policies. No one knows when that point will be reached, but the direction is clear.
Follow Tony on Twitter @12Knocksinna.