Category Archives: compliance

Index and Search In-Place Workflows Are Essential for Information Governance

By John Patzakis and Charles Meier


Accurate pre-collection data insight is a game-changing capability that enables organizations and their legal teams to determine the scope, volume, and content of electronic information before the very disruptive and expensive step of collecting the data. This insight is enabled through distributed index and search in-place technology.

A true distributed index and search in-place capability for unstructured data requires that software-based indexing technology be deployed directly onto file servers, laptops, or in the cloud to address Microsoft 365 and other cloud-based data sources. This indexing occurs where the data sources reside, without requiring a bulk transfer of the data to a central location. Once indexed, the data can be searched in seconds, with support for complex Boolean operators, metadata filters and regular expressions. Searches can be iterated and refined without limitation, which is critical for large data sets.
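
To make the idea concrete, here is a minimal Python sketch of index and search in-place. It is an illustration only, not X1's implementation: the names `Doc`, `LocalIndex` and `federated_search`, the `owner` metadata field, and the sample paths are all hypothetical. Each node builds a small inverted index over its own documents, a query is sent to every node, and only the lists of matching paths travel back.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Doc:
    path: str
    owner: str  # example metadata field (hypothetical)
    text: str


@dataclass
class LocalIndex:
    """Hypothetical per-machine index, built where the documents live."""
    node: str
    docs: dict = field(default_factory=dict)  # path -> Doc
    postings: dict = field(default_factory=lambda: defaultdict(set))  # term -> paths

    def add(self, doc: Doc) -> None:
        # Tokenize once at index time so later searches are set operations.
        self.docs[doc.path] = doc
        for term in re.findall(r"\w+", doc.text.lower()):
            self.postings[term].add(doc.path)

    def search(self, all_of=(), none_of=(), owner=None, pattern=None):
        """Boolean AND / NOT over terms, then metadata and regex filters."""
        hits = set(self.docs)
        for term in all_of:
            hits &= self.postings.get(term.lower(), set())
        for term in none_of:
            hits -= self.postings.get(term.lower(), set())
        if owner is not None:
            hits = {p for p in hits if self.docs[p].owner == owner}
        if pattern is not None:
            rx = re.compile(pattern)
            hits = {p for p in hits if rx.search(self.docs[p].text)}
        return sorted(hits)


def federated_search(indexes, **query):
    """Send the same query to every node; only hit lists are returned."""
    return {ix.node: ix.search(**query) for ix in indexes}


# Two simulated data sources, each indexed in place.
laptop = LocalIndex("laptop-01")
laptop.add(Doc("C:/docs/offer.docx", "alice", "Merger offer draft, SSN 123-45-6789"))
laptop.add(Doc("C:/docs/lunch.txt", "alice", "Lunch menu draft"))
server = LocalIndex("fileserver-02")
server.add(Doc("/share/hr/plan.txt", "bob", "Merger integration plan"))

results = federated_search([laptop, server], all_of=["merger"], none_of=["plan"])
ssn_hits = federated_search([laptop, server], pattern=r"\d{3}-\d{2}-\d{4}")
```

Because each query returns only paths, a search can be refined and re-run as often as needed without copying any documents to a central location.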

While our previous blog post addressed the critical importance of this capability in eDiscovery matters, it is equally essential in information governance projects such as PII audits, the purging of redundant, obsolete or trivial (ROT) data, and due diligence and data separation efforts in support of corporate mergers and acquisitions. Many X1 customers have recently employed our indexing in-place technology on such projects with remarkable success.

Incredibly, many of these customers also received alternative proposals based on traditional eDiscovery workflows, with much higher estimated costs and much longer durations. Traditional eDiscovery workflows mandate broad and manual data collection, copying and migration efforts, large-scale data processing, and loading the data into a different platform for review and analysis. There are three fundamental reasons why this “traditional approach” is fatally flawed for information governance projects.

  1. Prohibitive Cost and Risk. The data scope of information governance projects involves terabytes and sometimes petabytes of data. Mass collection, copying and migration of these data sets, with manual hand-offs for later analysis in a centralized location, is extremely expensive, disruptive, and time-consuming. Also, mass duplication and egress of controlled enterprise data in order to execute ROT, PII, data separation or other due diligence projects is completely antithetical to the very purpose of those projects.
  2. The “Now What?” Problem. Let’s assume an organization has decided to incur the enormous cost, disruption and risk associated with the mass copying, migration, and centralization of unstructured data, and after loading the data into a review process, a key subset of documents and emails is finally identified for purging or other remedial action. Now what? You are merely working with copies! The live “original” emails and documents are still in M365, email accounts, file servers or on laptops. It is possible to manually retrace and remediate, but that process is expensive and disruptive.
  3. Instant Staleness. Finally, a mass copying and migration effort, which often requires several weeks to complete, is stale the moment it finishes, because the live data in its original location has inevitably changed.

X1 solves these challenges through our proprietary and patented distributed index and search in-place technology, which enables scale by bringing true distributed indexing in-place to laptops, file shares, M365 and other cloud sources. X1 Enterprise Collect significantly streamlines information governance workflows by identifying and allowing for the remediation of targeted data in-place, thereby eliminating the need for expensive and cumbersome data duplication and migration.

For a demonstration of the X1 Enterprise Collect Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/x1-enterprise-collect-platform.



X1 Delivers Cutting-Edge MS Teams Support

By John Patzakis

The prominence of Microsoft 365 data sources continues to grow exponentially in eDiscovery matters. However, most non-MS eDiscovery tools collect from MS 365 by simply making bulk copies of data associated with individual accounts, and then attempting to transfer that data en masse to their own proprietary processing and/or review platform. Such an effort is very costly, time-consuming, and inefficient for many reasons. For one, this bulk transfer triggers data transfer throttling by Microsoft, causing significant time delays. But the main problem is that clients who are investing in MS 365 do not want to see all their data routinely exported out of its native environment every time there is an eDiscovery or compliance investigation.

So, enterprises with relevant data stored in MS 365 need to have a good process to perform unified and efficient search and collection of MS 365 and non-MS 365 sources. To achieve requisite efficiency and the minimization of data transfer, this process should be based upon a targeted search and collection in-place capability, and not simply involve mass export of data out of MS 365 for downstream processing and searching.

To answer this unmet critical need, X1 launched MS 365 data connectors for our X1 Enterprise Collect platform. X1 Enterprise Collect provides users the unique ability to search and collect MS 365 data in-place. X1’s optimized approach of iterative search and targeted collection enables organizations to apply proportionality principles across both cloud and on-premise data sources with clear and consistent results for effective eDiscovery.

And now, X1 has added cutting-edge Teams support to complement our existing support of OneDrive, MS Mail and SharePoint. The X1 Enterprise Collect Teams collection capabilities include the following unique benefits unmatched by other independent software providers:
• The ability to target individual custodians and specific messaging threads, displacing any need to mass download channels
• Unified search and collection of on-premise and cloud data sources, including Teams, OneDrive, SharePoint, Mail, laptops and file-shares for an optimized approach
• Patented technology that indexes, searches and processes the data in-place, removing any reliance on premium processing or supplemental services
• One-click upload into Relativity for review, for a streamlined end-to-end process
• A truly automated product solution, as opposed to a service-based offering

Winston & Strawn eDiscovery partner Bobby Malhotra notes: “With the vast number of users and unyielding amount of data in collaboration applications such as Teams, having the ability to target and triage data by specific custodians and threads allows organizations to handle discovery in an efficient and pragmatic manner. X1 provides the unique ability to seamlessly collect and search across numerous web, collaboration, and social, data sources.”

Watch the X1 hosted webinar on-demand featuring the hot topic of Best Practices to Collect from MS Teams in an Effective, Defensible and Proportional Manner.

The X1 Enterprise Collect Platform is available now from X1 and its global channel network in the cloud, on-premise, and with our services available on-demand. For a demonstration of the X1 Enterprise Collect Platform, contact us at sales@x1.com. For more details on this innovative solution, please visit www.x1.com/x1-enterprise-collect-platform.



Post Pandemic, Corporate eDiscovery Undergoes a Permanent Paradigm Shift

By John Patzakis

While the pandemic disrupted the workplace during its height, it is now becoming clear that a more permanent transformation has taken place. Employees and their electronic information assets are far more geographically dispersed. This is requiring corporate legal departments to rethink how they conduct eDiscovery, as the old model based upon data over-collection is no longer tenable. Instead, corporations are favoring a more targeted approach to ESI collection.

Industry analyst Greg Buckles of the eDiscovery Journal recently provided a good analysis on this topic:

“The sheer volume of raw custodial collections has put pressure on discovery professionals to use an iterative selective collection strategy. That puts the corporate legal team closer to scoping and collection activities than most have been. For too long corporate legal has felt uncomfortable pushing back on overly burdensome or broad discovery requests from opposing or retained counsel. The recent development of proportionality frameworks, guidelines and tools has the potential to empower corporate legal to make defensible cost-risk arguments.”

Buckles further observes that “some of my clients have drastically cut their eDiscovery related expenses through these kinds of initiatives.” He terms this a “grand enterprise reboot” that “brings (corporate legal) to the table with a fresh perspective.”

Most core eDiscovery costs (outside of attorney review) stem from over-collection of ESI. While direct collection costs can seem inexpensive, law firm Nelson Mullins notes that “over preservation tends to have its own costs relating to storage of large amounts of electronically stored information (ESI) and the resources needed to manage it; leads to increased downstream e-discovery costs associated with collection, processing, and review.”

As outlined by Buckles, proportionality-based eDiscovery is an important principle that all corporate attorneys should be leveraging. Under Federal Rule of Civil Procedure 26(b)(1), parties may discover any non-privileged material that is relevant to any party’s claim or defense and proportional to the needs of the case. However, attorneys representing enterprises are essentially flying blind on this analysis when it matters most. Prior to the custodian data being actually collected, processed and analyzed, attorneys do not have any real visibility into the potentially relevant ESI across an organization. This is especially true in regard to unstructured, distributed data, which is invariably the majority of ESI that is ultimately collected in a given matter.

If accurate pre-collection data insight were available to counsel, that game-changing factor would enable counsel to set reasonable discovery limits and ultimately process, host, review and produce much less ESI. Counsel can further use pre-collection proportionality analysis to gather key information, develop a litigation budget, and better manage litigation deadlines. Such insights can also foster cooperation by informing the parties early in the process about where relevant ESI is located, and what keywords and other search parameters can identify and pinpoint relevant ESI.

A solution to these challenges is the use of index and search in-place technology. Indexing and search in-place in this context means that a software-based indexing technology is deployed directly onto file servers, laptops or even in the cloud to address cloud-based data sources. This indexing occurs without a bulk transfer of the data. Once the data is indexed, searches are performed in a few seconds, with complex Boolean operators, metadata filters and regular expressions. The searches can be iterated and repeated without limitation, which is critical for large data sets.

But it is important that the technology employed truly enables index-in-place, with the indexes deployed directly onto the laptops, file shares or cloud servers where the data exists. Some providers market their tools as such, but the indexing and searching actually take place in their platform at a central location. Data must first be copied and collected off of laptops and file servers and migrated over the network to get to the indexing engines. This does not scale for eDiscovery. For information about X1’s index-in-place technology, X1 Enterprise Platform, please visit us here.



CCPA and GDPR UPDATE: Unstructured Enterprise Data in Scope of Compliance Requirements

An earlier version of this article appeared on Legaltech News

By John Patzakis

A core requirement of both the GDPR and the similar California Consumer Privacy Act (CCPA), which becomes enforceable on July 1, is the ability to demonstrate and prove that personal data is being protected. This requires information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”

However, recent Gartner research notes that approximately 80% of information stored by companies is “dark data” in the form of unstructured, distributed data that can pose significant legal and operational risks. With much of the global workforce now working remotely, this is of special concern, as nearly all the company data maintained and utilized by remote employees is in the form of unstructured data. Unstructured enterprise data generally refers to searchable data such as emails, spreadsheets and documents on laptops, file servers, and social media.

The GDPR

An organization’s GDPR compliance efforts need to address any personal data contained within unstructured electronic data throughout the enterprise, as well as the structured data found in CRM, ERP and various centralized records management systems. Personal data is defined in the GDPR as: “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. There is separate guidance regarding “structured” paper records (more on that below). The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization. Nonetheless, there is some confusion about the scope of the GDPR’s coverage across structured as well as unstructured electronic data systems.

The UK ICO is a key government regulator that interprets and enforces the GDPR, and has recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO states that large data sets, including data analytics outputs and unstructured data volumes, “could make it more difficult for you to meet your obligations under the right of access. However, these are not classed as exemptions, and are not excuses for you to disregard those obligations.”

Additionally, the ICO guidance advises that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employer’s networks or processing activities. This is especially notable under the new normal of social distancing, where much of a company’s data (and associated personal information) is being stored on remote employee laptops.

The ICO also provides guidance on several related subjects that shed light on its stance regarding unstructured data:

Archived Data: According to the ICO, data stored in electronic archives is generally subject to the GDPR, noting that there is no “technology exemption” from the right of access. Enterprises “should have procedures in place to find and retrieve personal data that has been electronically archived or backed up.” Further, enterprises “should use the same effort to find information to respond to a SAR as you would to find archived or backed-up data for your own purposes.”

Deleted Data: The ICO’s view on deleted data is that it is generally outside the scope of GDPR compliance, provided that there is no intent, or systematic ability, to readily recover that data. The ICO says it “will not seek to take enforcement action against an organisation that has failed to use extreme measures to recreate previously ‘deleted’ personal data held in electronic form. We do not require organisations to use time and effort reconstituting information that they have deleted as part of their general records management.”

However, under this guidance, organizations that invest in and deploy re-purposed computer forensic tools featuring automated un-delete capabilities may be held to a higher standard. Deploying such systems can reflect both an intent and a systematic technical ability to recover deleted data.

Paper Records: Paper records that are part of a “structured filing system” are subject to the GDPR. Specifically, if an enterprise holds “information about the requester in non-electronic form (e.g. in paper files or on microfiche records)” then such hard-copy records are considered personal data accessible via the right of access, if such records are “held in a ‘filing system.’” This segment of the guidance reflects that references to “unstructured data” in European parlance usually pertain to paper records. The ICO notes in separate guidance that “the manual processing of unstructured personal data, such as unfiled handwritten notes on paper” is outside the scope of GDPR.

GDPR Article 4 defines a “filing system” as meaning “any structured set of personal data which are accessible according to specific criteria, whether centralized, decentralized or dispersed on a functional or geographical basis.” The only form of “unstructured data” that would not be subject to GDPR would be unfiled paper records like handwritten notes or legacy microfiche.

The CCPA

The California Attorney General (AG) released a second and presumably final round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act. The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems are—unlike the GDPR—exempt from the CCPA’s scope, unless those archives are restored and become active. Additional guidance from the Attorney General states: “Allowing businesses to delete the consumer’s personal information on archived or backup systems at the time that they are accessed or used balances the interests of consumers with the potentially burdensome costs of deleting information from backup systems that may never be utilized.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California AG.

For the GDPR, the UK ICO correctly advises that enterprises “should ensure that your information management systems are well-designed and maintained, so you can efficiently locate and extract information requested by the data subjects whose personal data you process and redact third party data where it is deemed necessary.” This is why Forrester Research notes that “Data Discovery and Classification are the foundation for GDPR compliance.”

Establish and Enforce Data Privacy Policies

So to achieve GDPR and CCPA compliance, organizations must first ensure that explicit policies and procedures are in place for handling personal information. Once established, it is important to demonstrate to regulators that such policies and procedures are being followed and operationally enforced. A key first step is to establish a data map of where and how personal data is stored in the enterprise. This exercise is actually required under the GDPR Article 30 documentation provisions.

An operational data audit and discovery capability across unstructured data sources allows enterprises to efficiently map, identify, and remediate personal information in order to respond to regulators and data subject access requests from EU and California citizens. This capability must be able to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of weeks or months as is the case with traditional crawling tools. This includes laptops of employees working from home.
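
As a rough illustration of what such an audit produces, the Python sketch below scans the text of documents from several sources and builds a simple data map of where personal-data-like strings appear. It is a sketch under stated assumptions: the two regular expressions, the function names `audit_source` and `build_data_map`, and the sample sources are hypothetical, and real personal-data detection under the GDPR or CCPA covers far more than these two patterns.

```python
import re
from collections import Counter

# Illustrative patterns only (assumption): real audits need many more
# identifiers and validation beyond simple pattern matching.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def audit_source(documents):
    """Summarize one data source; `documents` maps path -> extracted text."""
    counts = Counter()
    flagged = []
    for path, text in documents.items():
        # Record which categories appear in this document (once per document).
        found = {name for name, rx in PII_PATTERNS.items() if rx.search(text)}
        if found:
            flagged.append(path)
            counts.update(found)
    return {"flagged": sorted(flagged), "counts": dict(counts)}


def build_data_map(sources):
    """Build a per-source summary of where personal-data-like text was found."""
    return {name: audit_source(docs) for name, docs in sources.items()}


# Hypothetical sample data standing in for endpoints and file shares.
sources = {
    "laptop-01": {
        "notes.txt": "Call jane.doe@example.com about renewal",
        "todo.txt": "Buy milk",
    },
    "fileserver-02": {
        "hr/form.txt": "SSN 123-45-6789, contact hr@example.com",
    },
}
data_map = build_data_map(sources)
```

The resulting `data_map` lists, for each source, which documents were flagged and how many documents matched each category, which is the kind of summary a data map or a subject access request response starts from.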

These processes and capabilities are not only required for data privacy compliance but are also needed for broader information governance and security requirements, anti-fraud compliance, and e-discovery.

Implementing these measures proactively, with routine and consistent enforcement using solutions such as X1 Distributed GRC, will go a long way to mitigate risk, respond efficiently to data subject access requests, and improve overall operational effectiveness through better information governance.



How the Remote Workforce Impacts GDPR and CCPA Compliance

By John Patzakis

While our personal and business lives will hopefully return to normal soon, COVID-19 is only accelerating the trend of an increasingly remote and distributed workforce. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements, including the GDPR and similar US-based laws.

A core requirement of both the GDPR and the similar California Consumer Privacy Act is the ability to demonstrate and prove that personal data is being protected, thus requiring information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”[1]

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization.

The UK ICO, a key government regulator that interprets and enforces the GDPR, recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO states that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employer’s networks or processing activities.[2]

CCPA

The California Attorney General released a second and presumably final round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act.[3] The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems is, unlike under the GDPR, exempt from the CCPA’s scope unless those archives are restored and become active: “A business shall comply with a consumer’s request to delete their personal information by: a. Permanently and completely erasing the personal information on its existing systems with the exception of archived or back-up systems.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. The CCPA guidance broadly provides that companies must permanently delete personal information from their “existing systems.” In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California Attorney General.

Further to this point, Am Law 100 firm Davis Wright Tremaine provides public guidance on the CCPA as follows: “Access requests may be easier for companies that maintain databases, but most companies also collect unstructured data (such as emails, images, files, etc.) related to consumers. Given that ‘personal information’ includes any information capable of being associated with a consumer or a household, requests will encompass a wide range of data that a business possesses.”[4]

So to achieve GDPR and CCPA compliance, organizations must ensure not only that explicit policies and procedures are in place for handling personal information, but also the ability to prove that those policies and procedures are being followed and operationally enforced. The new normal of remote workforces is a critical challenge that must be addressed.

What has always been needed is immediate visibility into unstructured distributed data across the enterprise, including data on laptops and other unstructured data maintained by remote workforces. This requires the ability to search and report across several thousand endpoints and other unstructured data sources, and to return results within minutes instead of days or weeks. The need for such an operational capability, provided by best practices technology, is further heightened by the urgency of CCPA and GDPR compliance.

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces. X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across thousands of distributed endpoints and data servers from a central location. Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks.

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.

NOTES:

[1] https://ico.org.uk/media/about-the-ico/consultations/2616442/right-of-access-draft-consultation-20191204.pdf

[2] Id.

[3] https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/ccpa-text-of-second-set-clean-031120.pdf?

[4] https://www.dwt.com/blogs/privacy–security-law-blog/2019/07/consumer-rights-under-to-ccpa-part-1-what-are-they

