The European Union General Data Protection Regulation (GDPR) comes into force today and we move from preparation to reality. Maybe now the flood of email asking for consent to remain on mailing lists will abate and we won’t see quite so many people trying to make hay from GDPR FUD. It’s not quite as bad as when The Irish Times reported that “an army of advisors, some of them chancers, have fanned out in recent months to make GDPR the most profitable cash cow/scare story since the millennium bug,” but it has come close.
In any case, organizations must now cope with the requirements set down in GDPR, which means that practical interpretations of what needs to be done with IT systems are the order of the day. Lots of preparatory work has no doubt been done; now it’s game time.
Two practical issues that Office 365 tenants might be asked to deal with soon are Data Subject Requests and Data Erasure Requests, defined under Articles 15 and 17 respectively. Office 365 has an off-the-shelf (partial) answer for one; how to handle the other is not as obvious.
The release of support for GDPR Data Subject Request (DSR) cases in the Security and Compliance Center is a welcome step to help Office 365 tenants cope with the new regulations. However, discovering what personal information exists in Exchange, SharePoint, OneDrive, and Teams in response to a request to know what a data controller (an Office 365 tenant) holds about a data subject (a person) is only the start of the journey.
A DSR is a modified form of a standard Office 365 eDiscovery case. The search results returned by the DSR criteria are deliberately broad to uncover everything a tenant holds about someone. For example, searching by someone’s name will find information, but that doesn’t mean that the search results are relevant to the data subject, especially if their name is common. The information found in a scan of Office 365 probably includes many messages and files that don’t match the request, which means that some careful checking is necessary before anything is handed over.
The natural progression from searching in response to an article 15 right of access request comes when a data subject exercises their article 17 right to erasure. In other words, someone asks an organization to remove any personal information held about them without undue delay.
GDPR sets out several grounds to justify removal, including that personal data are no longer necessary for the purpose for which they were collected, that the data subject withdraws consent, or that the data subject objects to how the controller processes their data.
For example, an ex-employee might ask their employer to remove all personal information held about them. This includes information like their personal email address and phone number, their national identification number, passport number, and other items of personal data that are not ordinarily in the public domain.
However, the data controller can argue that some information must be retained to comply with a legal obligation (article 17(3)(b)) or to assist with legal claims (article 17(3)(e)). For instance, an employer might need to keep tax records for an ex-employee for several years to comply with national tax regulations.
Deciding what personal data should be removed in response to right to erasure requests is an imprecise science at present. We probably need some guidance from the courts to establish exact boundaries for the data that must be removed and that which can be kept.
Office 365 is only one of the repositories where personal data lives within an organization, but given the pervasive nature of email for communications, and Word and Excel for documenting and organizing HR data, it’s likely that a lot of personal data exists within mailboxes and sites. Any request for erasure that arrives in an organization using Office 365 means that searches are needed across the mailboxes and sites where that data might be held.
Any personal data of interest in Teams conversations should be picked up in the compliance records captured for Teams in user and group mailboxes.
Office 365 DSRs give a good start to solving the erasure dilemma because the output from searches shows where personal data for the data subject might exist. Yammer is the outlier here because Yammer content is not scanned by Office 365 content searches, so searches and exports of Yammer data must be processed separately. On the upside, given how Yammer is generally used, it’s unlikely that much personal data exists in Yammer groups.
When you export the results of content searches, Office 365 generates manifests to show where the exported data originates. As noted above, it’s a mistake to assume that everything uncovered by a DSR case is relevant to a data subject, and manual checking is absolutely needed before any deletions occur. The export manifests are invaluable here because they tell those responsible for processing the request for erasure where to look.
Unfortunately, checking search results is a manual process. Before you delete messages or documents, you need to be sure that those items are relevant to the data subject and do not need to be kept for justifiable business reasons. For example, a check of a document might look for instances of the data subject’s name together with other indications that the document should be removed, such as the presence of the data subject’s Social Security Number or passport number.
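To make the kind of check described above more concrete, here is a minimal sketch in Python of how a reviewer might triage the text of exported items before looking at them by hand. It is not an Office 365 feature: the DataSubject class, the triage function, and the three labels are hypothetical names invented for illustration, and the sketch assumes that you have already extracted plain text from each exported message or file.

```python
from dataclasses import dataclass, field


@dataclass
class DataSubject:
    # Hypothetical record of what the organization knows about the requester.
    name: str
    # Unique identifiers such as a passport or employee number.
    identifiers: list[str] = field(default_factory=list)


def triage(text: str, subject: DataSubject) -> str:
    """Suggest a review priority for the text of one exported item.

    "likely"   - the name and at least one unique identifier both appear
    "review"   - only the name or only an identifier appears
    "unlikely" - neither appears
    """
    lowered = text.lower()
    name_hit = subject.name.lower() in lowered
    id_hit = any(i.lower() in lowered for i in subject.identifiers if i)
    if name_hit and id_hit:
        return "likely"
    if name_hit or id_hit:
        return "review"
    return "unlikely"
```

A label from a helper like this only orders the work. Someone still has to open each item and decide whether it should be removed or must be kept for a justifiable business reason.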
For this reason, the content searches used to find matches should use precise identifiers whenever possible. A DSR case can span several searches, so you can have one based on the data subject’s name and email address, and another for matches against their passport number, employee number, home address or a similarly unique identifier. You can export the combined results of all searches in a single operation.
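As a rough illustration of splitting a case into a broad search and a precise one, the Python sketch below builds two keyword query strings from what is known about a data subject. The function name, the search names, and the sample values in the usage note are all hypothetical, and the quoted-phrase-joined-by-OR form is an assumption about how you would write the keywords, so check the generated strings against the content search query documentation before pasting them into a DSR case.

```python
def build_search_queries(name: str, email: str, unique_ids: list[str]) -> dict[str, str]:
    """Return one broad query (name, email) and one precise query (unique IDs)."""

    def quote(value: str) -> str:
        # Wrap the value as an exact phrase; drop embedded quotes to keep it well formed.
        return '"' + value.strip().replace('"', "") + '"'

    broad = " OR ".join(quote(v) for v in (name, email) if v and v.strip())
    precise = " OR ".join(quote(v) for v in unique_ids if v and v.strip())

    queries: dict[str, str] = {}
    if broad:
        queries["name-and-email"] = broad
    if precise:
        queries["unique-identifiers"] = precise
    return queries
```

Calling build_search_queries("Jane Murphy", "jane.murphy@example.com", ["PA1234567"]) gives one query string per search. Keeping the two apart makes it easier to see which hits came from a unique identifier and which came only from a possibly common name.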
In many cases, the requirement for erasure can be satisfied through redaction, or editing to erase the data subject’s details from documents, spreadsheets, and other files. You cannot edit the body of an email, so messages that contain personal data probably need to be removed. One complication that exists here is that some content might be protected by Azure Information Protection rights management. In this instance, protected files must be decrypted by an IRM super-user before they can be redacted.
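The redaction step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a tool for Office files: the redact function and the "[REDACTED]" marker are hypothetical, and the code works on plain text that has already been extracted (and, where necessary, decrypted). Real documents and spreadsheets need an editor or library that understands their format.

```python
import re


def redact(text: str, terms: list[str], marker: str = "[REDACTED]") -> tuple[str, int]:
    """Replace every case-insensitive occurrence of the terms with a marker.

    Returns the redacted text and the number of replacements made.
    """
    cleaned = [t for t in terms if t]
    if not cleaned:
        return text, 0
    # Try longer terms first so a short term never partially replaces a longer one.
    ordered = sorted(cleaned, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.subn(marker, text)
```

The replacement count is worth recording with the case notes, because it gives the reviewer a quick way to confirm that a file contained the matches the search reported.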
Document processing is complicated by the fact that SharePoint stores multiple versions of a file, meaning that although you might redact the text relating to a data subject in the current version of a document, other versions still exist that might include the information. To get around the problem, you can save a copy of the document, remove the original document, and make the change to the copy before uploading it (as version 1) to SharePoint.
Information in inactive mailboxes is indexed and discoverable, so content searches will pick up any references that exist in these mailboxes. To remove items, you’ll have to restore or recover the inactive mailboxes before you can access the content with clients like Outlook or OWA.
Some items cannot be deleted from Office 365 because they are subject to a preservation lock, a special form of retention policy designed to keep information for a predetermined period that cannot be interfered with. Office 365 will keep these items until the lock expires.
The bottom line is that responding to a request for erasure of Office 365 data under GDPR article 17 is unlikely to be an automatic or inexpensive process. Some simple cases might be processed by doing a search and then using something like the Search-Mailbox cmdlet to permanently remove items from mailboxes. However, the increasingly integrated nature of Office 365 means that those responsible for handling these cases can expect to do a lot of manual work to be sure that the organization responds as GDPR expects.
We don’t know yet whether Microsoft will develop DSRs further to include processing to handle requests for erasure, or the article 18 right to restriction of processing, where a data subject contests the accuracy of their personal data held in a system like Office 365. In all cases, as noted above, depending on automatic processing without checking is not a good idea because the chance that you’ll erase something important is high. Maybe this is a case where artificial intelligence can help. Time will tell.
Follow Tony on Twitter @12Knocksinna.