An earlier version of this article appeared on Legaltech News
By John Patzakis
A core requirement of both the GDPR and the similar California Consumer Privacy Act (CCPA), which becomes enforceable on July 1, is the ability to demonstrate and prove that personal data is being protected. This requires information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”
However, recent Gartner research notes that approximately 80% of information stored by companies is “dark data” that is in the form of unstructured, distributed data that can pose significant legal and operational risks. With much of the global workforce now working remotely, this is of special concern and nearly all the company data maintained and utilized by remote employees is in the form of unstructured data. Unstructured enterprise data generally refers to searchable data such as emails, spreadsheets and documents on laptops, file servers, and social media.
An organization’s GDPR compliance efforts need to address any personal data contained within unstructured electronic data throughout the enterprise, as well as the structured data found in CRM, ERP and various centralized records management systems. Personal data is defined in the GDPR as: “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”
Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. There is a separate guidance regarding “structured” paper records (more on that below). The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization. Nonetheless, there is some confusion about the scope of the GDPR’s coverage across structured as well as unstructured electronic data systems.
The UK ICO is a key government regulator that interprets and enforces the GDPR, and has recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO notes that large data sets, including data analytics outputs and unstructured data volumes, “could make it more difficult for you to meet your obligations under the right of access. However, these are not classed as exemptions, and are not excuses for you to disregard those obligations.”
Additionally the ICO guidance advises that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employers networks or processing activities. This is especially notable under the new normal of social distancing, where much of a company’s data (and associated personal information) is being stored on remote employee laptops.
The ICO also provides guidance on several related subjects that shed light on its stance regarding unstructured data:
Archived Data: According to the ICO, data stored in electronic archives is generally subject to the GDPR, noting that there is no “technology exemption” from the right of access. Enterprises “should have procedures in place to find and retrieve personal data that has been electronically archived or backed up.” Further, enterprises “should use the same effort to find information to respond to a SAR as you would to find archived or backed-up data for your own purposes.”
Deleted Data: The ICO’s view on deleted data is that it is generally within the scope of GDPR compliance, provided that there is no intent to, or a systematic ability to readily recover that data. The ICO says it “will not seek to take enforcement action against an organisation that has failed to use extreme measures to recreate previously ‘deleted’ personal data held in electronic form. We do not require organisations to use time and effort reconstituting information that they have deleted as part of their general records management.”
However, under this guidance organizations that invest in and deploy re-purposed computer forensic tools that feature automated un-delete capabilities may be held to a higher standard. Deploying such systems can reflect intent to as well as having the systematic technical ability to recover deleted data.
Paper Records: Paper records that are part of a “structured filing system” are subject to the GDPR. Specifically, if an enterprise holds “information about the requester in non-electronic form (e.g. in paper files or on microfiche records)” then such hard-copy records are considered personal data accessible via the right of access,” if such records are “held in a ‘filing system.” This segment of the guidance reflects that references to “unstructured data” in European parlance usually pertains to paper records. The ICO notes in separate guidance that “the manual processing of unstructured personal data, such as unfiled handwritten notes on paper” are outside the scope of GDPR.
GDPR Article 4 defines a “filing system” as meaning “any structured set of personal data which are accessible according to specific criteria, whether centralized, decentralized or dispersed on a functional or geographical basis.” The only form of “unstructured data” that would not be subject to GDPR would be unfiled paper records like handwritten notes or legacy microfiche.
The California Attorney General (AG) released a second and presumably final round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act. The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems are—unlike the GDPR—exempt from the CCPA’s scope, unless those archives are restored and become active. Additional guidance from the Attorney General states: “Allowing businesses to delete the consumer’s personal information on archived or backup systems at the time that they are accessed or used balances the interests of consumers with the potentially burdensome costs of deleting information from backup systems that may never be utilized.”
What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California AG.
For the GDPR, the UK ICO correctly advises that enterprises “should ensure that your information management systems are well-designed and maintained, so you can efficiently locate and extract information requested by the data subjects whose personal data you process and redact third party data where it is deemed necessary.” This is why Forrester Research notes that “Data Discovery and Classification are the foundation for GDPR compliance.”
Establish and Enforce Data Privacy Policies
So to achieve GDPR and CCPA compliance, organizations must first ensure that explicit policies and procedures are in place for handling personal information. Once established, it is important to demonstrate to regulators that such policies and procedures are being followed and operationally enforced. A key first step is to establish a data map of where and how personal data is stored in the enterprise. This exercise is actually required under the GDPR Article 30 documentation provisions.
An operational data audit and discovery capability across unstructured data sources allows enterprises to efficiently map, identify, and remediate personal information in order to respond to regulators and data subject access requests from EU and California citizens. This capability must be able to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of weeks or months as is the case with traditional crawling tools. This includes laptops of employees working from home.
These processes and capabilities are not only required for data privacy compliance but are also needed for broader information governance and security requirements, anti-fraud compliance, and e-discovery.
Implementing these measures proactively, with routine and consistent enforcement using solutions such as X1 Distributed GRC, will go a long way to mitigate risk, respond efficiently to data subject access requests, and improve overall operational effectiveness through such overall information governance improvements.