Category Archives: GDPR

How the Remote Workforce Impacts GDPR and CCPA Compliance

By John Patzakis

While our personal and business lives will hopefully return to normal soon, COVID-19 is only accelerating the trend of an increasingly remote and distributed workforce. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements, including the GDPR and similar US-based laws.

A core requirement of both the GDPR and the similar California Consumer Privacy Act is the ability to demonstrate and prove that personal data is being protected, thus requiring information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioners Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”[1]CCPA Image

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization.

The UK ICO, a key government regulator that interprets and enforces the GDPR, recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO notes that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employers networks or processing activities.[2]

CCPA          

The California Attorney General released second and presumably final round draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act.[3] The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2) data from archived or backup systems are —unlike the GDPR— exempt from the CCPA’s scope, unless those archives are restored and become active: “A business shall comply with a consumer’s request to delete their personal information by: a. Permanently and completely erasing the personal information on its existing systems with the exception of archived or back-up systems.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. The CCPA guidance broadly provides that companies must permanently delete personal information from their “existing systems.” In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California Attorney General.

Further to this point, AMLaw 100 firm Davis Wright Tremaine provides public guidance on the CCPA as follows: “Access requests may be easier for companies that maintain databases, but most companies also collect unstructured data (such as emails, images, files, etc.) related to consumers. Given that ‘personal information’ includes any information capable of being associated with a consumer or a household, requests will encompass a wide range of data that a business possesses.”[4]

So to achieve GDPR and CCPA compliance, organizations must ensure not only that explicit policies and procedures are in place for handling personal information, but also the ability to prove that those policies and procedures are being followed and operationally enforced. The new normal of remote workforces is a critical challenge that must be addressed.

What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, including on laptops and other unstructured data maintained by remote workforces, through the ability to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of days or weeks. The need for such an operational capability provided by best practices technology is further heightened by the urgency of CCPA and GDPR compliance.

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces.  X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across up to thousands of distributed endpoints and data servers from a central location.  Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks.

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.

NOTES:

[1] https://ico.org.uk/media/about-the-ico/consultations/2616442/right-of-access-draft-consultation-20191204.pdf

[2] Id.

[3] https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/ccpa-text-of-second-set-clean-031120.pdf?

[4] https://www.dwt.com/blogs/privacy–security-law-blog/2019/07/consumer-rights-under-to-ccpa-part-1-what-are-they

Leave a comment

Filed under Best Practices, CaCPA, compliance, Data Audit, GDPR, Uncategorized

CaCPA Compliance Requires Effective Investigation and eDiscovery Capabilities

By John Patzakis

The California Consumer Protection Act, (CaCPA ), which will be in full force on January 1, 2020,  promises to profoundly impact major US and global organizations, requiring the overhaul of their data audit, investigation and information governance processes. The CaCPA requires that an organization have absolute knowledge of where all personal data of California residents is stored across the enterprise, and be able to remove it when required. Many organization with a global reach will be under obligations to comply with both the GDPR and CaCPA, providing ample requirement justification to bolster their compliance efforts.

CCPA Image

According to data security and privacy attorney Patrick Burke, who was recently a senior New York State Financial Regular overseeing cybersecurity compliance before heading up the data privacy law practice at Phillips Nizer, CaCPA compliance effectively requires a robust digital investigation capability. Burke, speaking in a webinar earlier this month, noted that under the “CaCPA, California residents can request that all data an enterprise holds on them be identified and also be removed. Organizations will be required to establish a capability to respond to such requests. Actual demonstrated compliance will require the ability to search across all data sources in the enterprise for data, including distributed unstructured data located on desktops and file servers.” Burke further noted that organizations must be prepared to produce “electronic evidence to the California AG, which must determine whether there was a violation of CaCPA…as well as evidence of non-violation (for private rights of action) and of a ‘cure’ to the violation.”

The CaCPA contains similar provisions as the GDPR, which both specify processes and capabilities organizations must have in place to ensure the personal data of EU and California residents is secure, accessible, and can be identified upon request. These common requirements, enumerated below, can only be complied with through an effective enterprise eDiscovery search capability:

  • Data minimization: Under both the CaCPA and the GDPR, enterprises should only collect and retain as little personal data on California residents EU subjects as possible. As an example, Patrick Burke, who routinely advises his legal clients on these regulations, notes that unauthorized “data stashes” maintained by employees on their distributed unstructured data sources is a key problem, requiring companies to search all endpoints to identify information including European phone numbers, European email address domains and other personal identifiable information.
  • Enforcement of right to be forgotten: An individual’s personal data must be identified and deleted on request.
  • Effective incident response: If there is a compromise of personal data, an organization must have the ability to perform enterprise-wide data searches to determine and report on the extent of such breaches and resulting data compromise within seventy-two (72) hours under the GDPR. There are less stringent, but similar CaCPA requirements.
  • Accountability: Log and provide audit trails for all personal data identification requests and remedial actions.
  • Enterprise-wide data audit: Identify the presence of personal data in all data locations and delete unneeded copies of personal data.

Overall, a core requirement of both CaCPA and GDPR compliance is the ability to demonstrate and prove that personal data is being protected, requiring information governance capabilities that allow companies to efficiently produce the documentation and other information necessary to respond to auditors’ requests. Many consultants and other advisors are helping companies establish privacy compliance programs, and are documenting policies and procedures that are being put in place.

However, while policies, procedures and documentation are important, such compliance programs are ultimately hollow without consistent, operational execution and enforcement. CIOs and legal and compliance executives often aspire to implement information governance programs like defensible deletion and data audits to detect risks and remediate non-compliance. However, without an actual and scalable technology platform to effectuate these goals, those aspirations remain just that. For instance, recent IDG research suggests that approximately 70% of information stored by companies is “dark data” that is in the form of unstructured, distributed data that can pose significant legal and operational risks.

To achieve GDPR and CaCPA compliance, organizations must ensure that explicit policies and procedures are in place for handling personal information, and just as important, the ability to prove that those policies and procedures are being followed and operationally enforced. What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, through the ability to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of days or weeks. The need for such an operational capability provided by best practices technology is further heightened by the urgency of CaCPA and GDPR compliance.

A link to the recording of the recent webinar “Effective Incident Response Under GDPR and CaCPA”, is available here.

 

Leave a comment

Filed under CaCPA, compliance, Data Audit, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, GDPR, Records Management, Uncategorized

Incident Reporting Requirements Under GDPR and CCPA Require Effective Incident Response

By John Patzakis

The European General Data Protection Regulation (GDPR) is now in effect, but many organizations have not fully implemented compliance programs. For many organizations, one of the top challenges is complying with the GDPR’s tight 72-hour data breach notification window. Under GDPR article 33, breach notification is mandatory where a data breach is likely to “result in a risk for the rights and freedoms of individuals.” This must be done within 72 hours of first having become aware of the breach.  Data processors will also be required to notify their customers, the controllers, “without undue delay” after first becoming aware of a data breach.GDPR-stamp

In order to comply, organizations must accelerate their incident response times to quickly detect and identify a breach within their networks, systems, or applications, and must also improve their overall privacy and security processes. Being able to follow the GDPR’s mandate for data breach reporting is equally important as being able to act quickly when the breach hits. Proper incident response planning and practice are essential for any privacy and security team, but the GDPR’s harsh penalties amplify the need to be prepared.

It is important, however, to note that the GDPR does not mandate reporting for every network security breach. It only requires reporting for breaches impacting the “personal data” of EU subjects. And Article 33 specifically notes that reporting is not required where “the personal data breach is unlikely to result in a risk to the rights and freedoms of natural persons.”

The California Consumer Privacy Act contains similar provisions. Notification is only required if a California resident’s data is actually compromised.

So after a network breach is identified, determining whether the personal data of an EU or California citizen was actually compromised is critical not only to comply where a breach actually occurred, but also limit unnecessary or over reporting where an effective response analysis can rule out an actual personal data breach.

These breaches are perpetrated by outside hackers, as well as insiders. An insider is any individual who has authorized access to corporate networks, systems or data.  This may include employees, contractors, or others with permission to access an organizations’ systems. With the increased volume of data and the increased sophistication and determination of attackers looking to exploit unwitting insiders or recruit malicious insiders, businesses are more susceptible to insider threats than ever before.

Much of the evidence of the scope of computer security incidents and whether subject personal data was actually compromised are not found in firewall logs and typically cannot be flagged or blocked by intrusion detection or intrusion prevention systems. Instead, much of that information is found in the emails and locally stored documents of end users spread throughout the enterprise on file servers and laptops. To detect, identify and effectively report on data breaches, organizations need to be able to search across this data in an effective and scalable manner. Additionally, proactive search efforts can identify potential security violations such as misplaced sensitive IP, or personal customer data or even password “cheat sheets” stored in local documents.

To date, organizations have employed limited technical approaches to try and identify unstructured distributed data stored across the enterprise, enduring many struggles. For instance, forensic software agent-based crawling methods are commonly attempted but cause repeated high computer resource utilization for each search initiated and network bandwidth limitations are being pushed to the limits rendering this approach ineffective, and preventing any compliance within tight reporting deadlines. So being able to search and audit across at least several hundred distributed end points in a repeatable and expedient fashion is effectively impossible under this approach.

What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, through the ability to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of days or weeks. None of the traditional approaches come close to meeting this requirement. This requirement, however, can be met by the latest innovations in enterprise eDiscovery software.

X1 Distributed GRC  represents a unique approach, by enabling enterprises to quickly and easily search across multiple distributed endpoints from a central location.  Legal, cybersecurity, and compliance teams can easily perform unified complex searches across both unstructured content and metadata, and obtain statistical insight into the data in minutes, instead of days or weeks. With X1 Distributed GRC, organizations can proactively or reactively search for confidential data leakage and also keyword signatures of personal data breach attacks, such as customized spear phishing attacks. X1 is the first product to offer true and massively scalable distributed searching that is executed in its entirety on the end-node computers for data audits across an organization. This game-changing capability vastly reduces costs and quickens response times while greatly mitigating risk and disruption to operations.

Leave a comment

Filed under compliance, Corporations, Cyber security, Cybersecurity, Data Audit, GDPR, Information Governance

In-Place Data Analytics For Unstructured Data is No Longer Science Fiction

By John Patzakis

AI-driven analytics supercharges compliance investigations, data security, privacy audits and eDiscovery document review.  AI machine learning employs mathematical models to assess enormous datasets and “learn” from feedback and exposure to gain deep insights into key information. This enables the identification of discrete and hidden patterns in millions of emails and other electronic files to categorize and cluster documents by concepts, content, or topic. This process goes beyond keyword searching to identify anomalies, internal threats, or other indicators of relevant behavior. The enormous volume and scope of corporate data being generated has created numerous opportunities for investigators seeking deep information insights in support of internal compliance, civil litigation and regulatory matters.

The most effective use of AI in investigations couple continuous active learning technology with concept clustering to discover the most relevant data in documents, emails, text and other sources.  As AI continues to learn and improve over time, the benefits of an effectively implemented approach will also increase. In-house and outside counsel and compliance teams are now relying on AI technology in response to government investigations, but also increasingly to identify risks before they escalate to that stage.

Stock Photo - Digital Image used in blog

However, logistical and cost barriers have traditionally stymied organizations from taking advantage of AI in a systematic and proactive basis, especially regarding unstructured data, which, according to industry studies, constitutes 80 percent or more of all data (and data risk) in the enterprise. As analytics engines ingest the text from documents and emails, the extracted text must be “mined” from their native originals. And the natives must first be collected and migrated to a centralized processing appliance. This arduous process is expensive and time consuming, particularly in the case of unstructured data, which must be collected from the “wild” and then migrated to a central location, creating a stand-alone “data lake.”

Due to these limitations, otherwise effective AI capabilities are utilized typically only on very large matters on a reactive basis that limits its benefits to the investigation at hand and the information within the captive data lake.  Thus, ongoing active learning is not generally applied across multiple matters or utilized proactively. And because that captive information consists of migrated copies of the originals, there is a very limited ability to act on data insights as the original data remains in its actual location in the enterprise.

So the ideal architecture for the enterprise would be to move the data analytics “upstream” where all the unstructured data resides, which would not only save up to millions per year in investigation, data audit and eDiscovery costs, but would enable proactive utilization for compliance auditing, security and policy breaches and internal fraud detection.  However, analytics engines require considerable computing resources, with the leading AI solutions typically necessitating tens of thousands of dollars’ worth of high end hardware for a single server instance. So these computing workloads simply cannot be forward deployed to laptops and multiple file servers, where the bulk of unstructured data and associated enterprise risk exists.

But an alternative architecture solves this problem. A process that extracts text from unstructured, distributed data in place, and systematically sends that data at a massive scale to the analytics platform, with the associated metadata and global unique identifiers for each item.  As mentioned, one of the many challenges with traditional workflows is the massive data transfer associated with ongoing data migration of electronic files and emails, the latter of which must be sent in whole containers such as PST files. This process alone can take weeks, choke network bandwidth and is highly disruptive to operations. However, the load associated with text/metadata only is less than 1 percent of the full native item. So the possibilities here are very compelling. This architecture enables very scalable and proactive compliance, information security, and information governance use cases. The upload to AI engines would take hours instead of weeks, enabling continual machine learning to improve processes and accuracy over time and enable immediate action to taken on identified threats or otherwise relevant information.

The only solution that we are aware of that fulfills this vision is X1 Distributed GRC. X1’s unique distributed architecture upends the traditional collection process by indexing at the distributed endpoints, enabling direct pipeline of extracted text to the analytics platform. This innovative technology and workflow results in far faster and more precise collections and a more informed strategy in any matter.

Deployed at each end point or centrally in virtualized environments, X1 Enterprise allows practitioners to query many thousands of devices simultaneously, utilize analytics before collecting and process while collecting directly into myriad different review and analytics applications like RelativityOne and Brainspace. X1 Enterprise empowers corporate eDiscovery, compliance, investigative, cybersecurity and privacy staff with the ability to find, analyze, collect and/or delete virtually any piece of unstructured user data wherever it resides instantly and iteratively, all in a legally defensible fashion.

X1 displayed these powerful capabilities with ComplianceDS in a recent webinar with a brief but substantive demo of our X1 Distributed GRC solution, emphasizing our innovative support of analytics engines through our game-changing ability to extract text in place with direct feed into AI solutions.

Here is a link to the recording with a direct link to the 5 minute demo portion.

Leave a comment

Filed under Best Practices, collection, compliance, Corporations, eDiscovery & Compliance, Enterprise eDiscovery, Enterprise Search, GDPR, Uncategorized

GDPR Fines Issued for Failure to Essentially Perform Enterprise eDiscovery

By John Patzakis

The European General Data Protection Regulation (GDPR) came into full force in May 2018. Prior to that date, what I consistently heard from most of the compliance community was general fear and doubt about massive fines, with the solution being to re-purpose existing compliance templates and web-based dashboards. However, many organizations have learned the hard way that “paper programs” alone fall far short of the requirements under the GDPR. This is because the GDPR requires that an organization have absolute knowledge of where all EU personal data is stored across the enterprise, and be able to search for, identify and remove it when required.GDPR-stamp

Frequent readers of this blog may recall we banged the Subject Access Request drum prior to May 2018. We noted an operational enterprise search and eDiscovery was required to effectively comply with many of the core data discovery-focused requirements of GDPR. Under the GDPR, a European resident can request — potentially on a whim — that all data an enterprise holds on them be identified and also be removed. Organizations are required to establish a capability to respond to these Subject Access Requests (SARs). Forrester Research notes that “Data Discovery and classification are the foundation of GDPR compliance.” This is because, according to Forrester, GDPR effectively requires that an organization be able to identify and actually locate, with precision, personal data of EU data subjects across the organization.

Failure to respond to SARs has already led to fines and enforcement actions against several companies, including Google and the successor entity to Cambridge Analytica. This shows that many organizations are failing to understand the operational reality of GDPR compliance. This point is effectively articulated by a recent practice update from the law firm of DLA Piper on the GDPR, which states: “The scale of fines and risk of follow-on private claims under GDPR means that actual compliance is a must. GDPR is not a legal and compliance challenge – it is much broader than that, requiring organizations to completely transform the way that they collect, process, securely store, share and securely wipe personal data (emphasis added).”

These GDPR requirements can only be complied with through an effective enterprise eDiscovery search capability:

To achieve GDPR compliance, organizations must ensure that explicit policies and procedures are in place for handling personal information, and just as importantly, the ability to prove that those policies and procedures are being followed and operationally enforced. What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, through the ability to search and report across several thousand endpoints and other unstructured data sources, and returning results within minutes instead of days or weeks. The need for such an operational capability is further heightened by the urgency of GDPR compliance.

X1 Distributed GRC represents a unique approach, by enabling enterprises to quickly and easily search across multiple distributed endpoints and data servers from a central location.  Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, instead of days or weeks. With X1, organizations can also automatically migrate, collect, delete, or take other action on the data as a result of the search parameters.  Built on our award-winning and patented X1 Search technology, X1 Distributed GRC is the first product to offer true and massively scalable distributed searching that is executed in its entirety on the end-node computers for data audits across an organization. This game-changing capability vastly reduces costs while effectuating that all-too-elusive actual compliance with information governance programs, including GDPR.

1 Comment

Filed under Best Practices, compliance, Data Audit, GDPR, Uncategorized