Category Archives: Uncategorized

Intelligent ESI Collection Integrated with Relativity Can Cut eDiscovery Costs by 90 Percent

By John Patzakis

One of the biggest drivers of excessive eDiscovery costs is ESI over-collection. This in turn leads to a larger amount of data entering the processing and initial review funnel. These traditional inefficient efforts are manual with numerous hand offs and a high degree of project management and consulting hours to oversee the disjointed workflow. A recent analysis by Compliance CEO Marc Zamsky, illustrated in the chart below, established that cost for collection, processing and first month hosting under a traditional preservation process can cost upwards of $12,000 per custodian:

Properly targeted preservation initiatives are permitted by the courts and can be enabled by next generation software that is able to quickly and effectively access and search these data sources in place and throughout the enterprise. The value of targeted preservation is recognized in the Committee Notes to the recent FRCP amendments, which urge the parties to reach agreement on the preservation of data and the key words, date ranges and other metadata to identify responsive materials. (Citing the Manual for Complex Litigation (MCL) (4th) §40.25(2)). And In re Genetically Modified Rice Litigation, the court noted that “[p]reservation efforts can become unduly burdensome and unreasonably costly unless those efforts are targeted to those documents reasonably likely to be relevant or lead to the discovery of relevant evidence.”

Recently we hosted a webinar with Compliance highlighting the very compelling integration of our X1 Distributed Discovery platform with Relativity. This X1/Relativity integration enables game-changing efficiencies in the eDiscovery process by accelerating speed to review, and providing an end-to-end process from identification through production. As recently stated by Relativity Chief Product Officer Chris Brown: “Our exciting new partnership with X1 highlights our continued commitment to providing a streamlined user experience from collection to production…RelativityOne users will be able to combine X1’s innovative endpoint technology with the performance of our SaaS platform, eliminating the cumbersome process of manual data hand-offs and allowing them to get to the pertinent data in their case – faster.”

The live demonstration highlighted in real time how the integration improves the enterprise eDiscovery collection and ECA process by enabling a targeted and efficient search and collection process, with immediate pre-collection visibility into custodian data. X1 Distributed Discovery significantly streamlines the eDiscovery workflow with integrated culling and deduplication, thereby eliminating the need for expensive and cumbersome electronically stored information (ESI) processing tools. That way, the ESI can be populated straight into Relativity from an X1 collection without multiple hand offs, extensive project management and inefficient data processing.

Zamsky commented that the “ability to collect directly from custodian laptops and desktops into a RelativityOne workspace without impacting custodians is a game-changer,” which will “reduce collection times from weeks to hours so that attorneys can quickly begin reviewing and analyzing ESI in RelativityOne.” In fact, Zamsky demonstrated just that by presenting a second chart showing how this streamlined approach, based upon a detailed ROI analysis, reduces eDiscovery costs by over 90 percent:

So in terms of the big picture, with this integration providing a complete platform for efficient data search, eDiscovery, and review across the enterprise, organizations will save a lot of time, save a lot of money, and be able to make faster and better decisions. When you accelerate the speed to review and eliminate over-collection and inefficient processing, you are going to have much better early insight into your data and increase efficiencies on many levels.

A recording of the X1/Relativity integration webinar can be accessed here.

Leave a comment

Filed under Best Practices, collection, eDiscovery, ESI, Uncategorized

Federal Judge: Custodian Self-Collection of ESI is Unethical and Violates Federal Rules of Civil Procedure

By John Patzakis

In E.E.O.C. v. M1 5100 Corp., (S.D. Fla. July 2, 2020), Federal District Judge Matthewman excoriated defense counsel for allowing the practice of unsupervised custodian ESI self-collection, declaring that the practice “greatly troubles and concerns the court.” In this EEOC age discrimination case, two employees of the defendant corporation were permitted to identify and collect their own ESI in an unsupervised manner. Despite no knowledge of the process the client undertook to gather information (which resulted in only 22 pages of documents produced), counsel signed the responses to the RFP’s in violation of FRCP Rule 26(g), which requires that the attorney have knowledge and supervision of the process utilized to collect data from their client in response to discovery requirements.Gavel and books

This notable quote from the opinion provides a very strong legal statement against the practice of ESI custodian self-collection:

“The relevant rules and case law establish that an attorney has a duty and obligation to have knowledge of, supervise, or counsel the client’s discovery search, collection, and production. It is clear to the Court that an attorney cannot abandon his professional and ethical duties imposed by the applicable rules and case law and permit an interested party or person to ‘self-collect’ discovery without any attorney advice, supervision, or knowledge of the process utilized. There is simply no responsible way that an attorney can effectively make the representations required under Rule 26(g)(1) and yet have no involvement in, or close knowledge of, the party’s search, collection and production of discovery…Abdicating completely the discovery search, collection and production to a layperson or interested client without the client’s attorney having sufficient knowledge of the process, or without the attorney providing necessary advice and assistance, does not meet an attorney’s obligation under our discovery rules and case law. Such conduct is improper and contrary to the Federal Rules of Civil Procedure.”

In his ruling, Judge Matthewman stated that he “will not permit an inadequate discovery search, collection and production of discovery, especially ESI, by any party in this case.” He gave the defendant “one last chance to comply with its discovery search, collection and production obligations.”  He then also ordered “the parties to further confer on or before July 9, 2020, to try to agree on relevant ESI sources, custodians, and search terms, as well as on a proposed ESI protocol.” The Court reserved ruling on monetary and evidentiary sanctions pending the results of Defendants second chance efforts.

A Defensible Yet Streamlined Process Is Optimal

EEOC v. M1 5100, is yet another court decision disallowing custodian self-collection of ESI and underscoring the importance of a well-designed and defensible eDiscovery collection process. At the other end of the spectrum, full disk image collection is another preservation option that, while being defensible, is very costly, burdensome and disruptive to operations. Previously in this blog, I discussed at length the numerous challenges associated with full disk imaging.

The ideal solution is a systemized, uniform and defensible process for ESI collection, which also enables targeted and intelligent data collection in support of proportionality principles. Such a capability is only attainable with the right enterprise technology. With X1 Distributed Discovery (X1DD), parties can perform targeted search and collection of the ESI of hundreds of endpoints over the internal network without disrupting operations. The search results are returned in minutes, not weeks, and thus can be highly granular and iterative, based upon multiple keywords, date ranges, file types, or other parameters. This approach typically reduces the eDiscovery collection and processing costs by at least one order of magnitude (90%), thereby bringing much needed feasibility to enterprise-wide eDiscovery collection that can save organizations millions while improving compliance by maintaining metadata, generating audit logs and establishing chain of custody.

And in line with the Judge’s guidance outlined in EEOC v. M1 5100, X1DD provides a repeatable, verifiable and documented process for the requisite defensibility. For a demonstration or briefing on X1 Distributed Discovery, please contact us.

Leave a comment

Filed under Best Practices, Case Law, collection, eDiscovery, ESI, Uncategorized

CCPA and GDPR UPDATE: Unstructured Enterprise Data in Scope of Compliance Requirements

An earlier version of this article appeared on Legaltech News

By John Patzakis

A core requirement of both the GDPR and the similar California Consumer Privacy Act (CCPA), which becomes enforceable on July 1, is the ability to demonstrate and prove that personal data is being protected. This requires information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”CCPA GDPR

However, recent Gartner research notes that approximately 80% of information stored by companies is “dark data” that is in the form of unstructured, distributed data that can pose significant legal and operational risks. With much of the global workforce now working remotely, this is of special concern and nearly all the company data maintained and utilized by remote employees is in the form of unstructured data. Unstructured enterprise data generally refers to searchable data such as emails, spreadsheets and documents on laptops, file servers, and social media.

The GDPR

An organization’s GDPR compliance efforts need to address any personal data contained within unstructured electronic data throughout the enterprise, as well as the structured data found in CRM, ERP and various centralized records management systems. Personal data is defined in the GDPR as: “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. There is a separate guidance regarding “structured” paper records (more on that below). The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization. Nonetheless, there is some confusion about the scope of the GDPR’s coverage across structured as well as unstructured electronic data systems.

The UK ICO is a key government regulator that interprets and enforces the GDPR, and has recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO notes that large data sets, including data analytics outputs and unstructured data volumes, “could make it more difficult for you to meet your obligations under the right of access. However, these are not classed as exemptions, and are not excuses for you to disregard those obligations.”

Additionally the ICO guidance advises that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employers networks or processing activities. This is especially notable under the new normal of social distancing, where much of a company’s data (and associated personal information) is being stored on remote employee laptops.

The ICO also provides guidance on several related subjects that shed light on its stance regarding unstructured data:

Archived Data: According to the ICO, data stored in electronic archives is generally subject to the GDPR, noting that there is no “technology exemption” from the right of access. Enterprises “should have procedures in place to find and retrieve personal data that has been electronically archived or backed up.” Further, enterprises “should use the same effort to find information to respond to a SAR as you would to find archived or backed-up data for your own purposes.”

Deleted Data: The ICO’s view on deleted data is that it is generally within the scope of GDPR compliance, provided that there is no intent to, or a systematic ability to readily recover that data. The ICO says it “will not seek to take enforcement action against an organisation that has failed to use extreme measures to recreate previously ‘deleted’ personal data held in electronic form. We do not require organisations to use time and effort reconstituting information that they have deleted as part of their general records management.”

However, under this guidance organizations that invest in and deploy re-purposed computer forensic tools that feature automated un-delete capabilities may be held to a higher standard. Deploying such systems can reflect intent to as well as having the systematic technical ability to recover deleted data.

Paper Records: Paper records that are part of a “structured filing system” are subject to the GDPR. Specifically, if an enterprise holds “information about the requester in non-electronic form (e.g. in paper files or on microfiche records)” then such hard-copy records are considered personal data accessible via the right of access,” if such records are “held in a ‘filing system.” This segment of the guidance reflects that references to “unstructured data” in European parlance usually pertains to paper records. The ICO notes in separate guidance that “the manual processing of unstructured personal data, such as unfiled handwritten notes on paper” are outside the scope of GDPR.

GDPR Article 4 defines a “filing system” as meaning “any structured set of personal data which are accessible according to specific criteria, whether centralized, decentralized or dispersed on a functional or geographical basis.” The only form of “unstructured data” that would not be subject to GDPR would be unfiled paper records like handwritten notes or legacy microfiche.

The CCPA  

The California Attorney General (AG) released a second and presumably final round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act. The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems are—unlike the GDPR—exempt from the CCPA’s scope, unless those archives are restored and become active. Additional guidance from the Attorney General states: “Allowing businesses to delete the consumer’s personal information on archived or backup systems at the time that they are accessed or used balances the interests of consumers with the potentially burdensome costs of deleting information from backup systems that may never be utilized.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California AG.

For the GDPR, the UK ICO correctly advises that enterprises “should ensure that your information management systems are well-designed and maintained, so you can efficiently locate and extract information requested by the data subjects whose personal data you process and redact third party data where it is deemed necessary.” This is why Forrester Research notes that “Data Discovery and Classification are the foundation for GDPR compliance.”

Establish and Enforce Data Privacy Policies

So to achieve GDPR and CCPA compliance, organizations must first ensure that explicit policies and procedures are in place for handling personal information. Once established, it is important to demonstrate to regulators that such policies and procedures are being followed and operationally enforced. A key first step is to establish a data map of where and how personal data is stored in the enterprise. This exercise is actually required under the GDPR Article 30 documentation provisions.

An operational data audit and discovery capability across unstructured data sources allows enterprises to efficiently map, identify, and remediate personal information in order to respond to regulators and data subject access requests from EU and California citizens. This capability must be able to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of weeks or months as is the case with traditional crawling tools. This includes laptops of employees working from home.

These processes and capabilities are not only required for data privacy compliance but are also needed for broader information governance and security requirements, anti-fraud compliance, and e-discovery.

Implementing these measures proactively, with routine and consistent enforcement using solutions such as X1 Distributed GRC, will go a long way to mitigate risk, respond efficiently to data subject access requests, and improve overall operational effectiveness through such overall information governance improvements.

Leave a comment

Filed under CaCPA, compliance, Corporations, Cyber security, Cybersecurity, Data Audit, GDPR, Information Governance, Information Management, Uncategorized

True Proportionality for eDiscovery Requires Smart Pre-Collection Analysis

By John Patzakis

Proportionality-based eDiscovery is a goal that all judges and corporate attorneys want to attain. Under Federal Rule of Civil Procedure 26(b)(1), parties may  discover any non-privileged material that is relevant to any party’s claim or defense and proportional to the needs of the case. However attorneys representing enterprises are essentially flying blind on this analysis when it matters most. Prior to the custodian data being actually collected, processed and analyzed, attorneys do not have any real visibility into the potentially relevant ESI across an organization. This is especially true in regard to unstructured, distributed data, which is invariably the majority of ESI that is ultimately collected in a given matter.proportionality

If accurate pre-collection data insight were available to counsel, that game-changing factor would enable counsel to set reasonable discovery limits and ultimately process, host, review and produce much less ESI.  Counsel can further use pre-collection proportionality analysis to gather key information, develop a litigation budget, and better manage litigation deadlines. Such insights can also foster cooperation by informing the parties early in the process about where relevant ESI is located, and what keywords and other search parameters can identify and pinpoint relevant ESI.

The problem is any keyword protocols are mostly guesswork at the early stage of litigation, as, under outdated but still widely used eDiscovery practices, the costly and time consuming steps of actual data collection and processing must occur before meaningful proportionality analysis can take place. When you hear eDiscovery practitioners talk about proportionality, they are invariably speaking of a post-collection, pre-review process. But without requisite pre-collection visibility into distributed ESI, counsel typically resort to directing broad collection efforts, resulting in much greater costs, burden and delays.

X1 recently hosted a webinar featuring prominent industry experts including attorney David Horrigan of Relativity, Mandi Ross of Prism Litigation Technology and Ben Sexton of JND eDiscovery, addressing the issues of remote ESI collection and proportionality. David Horrigan outlined in succinct detail the legal concepts of proportionality under the Federal Rules, the Sedona Principles and as applied in case law. Mandi Ross explained how she applies proportionality when advising lawyers and judges through custodian interviews, coupled with detailed keyword search term analysis based upon the matter’s specific claims and defenses. She noted that technology such as X1 greatly enables the application of her practice in real time: “The ability to index in place is a game changer because we have the ability to gain insight into the data and validate custodian interview data without first requiring that data to be collected.”

The webinar also featured a live exercise performing a pre-collection proportionality analysis on remote employee data with X1 Distributed Discovery. The panelists provided comments and insights contrasting what they saw with the outdated, costly, and time consuming process involving manual data collection and subsequent migration into a hardware processing appliance. The later process negates counsel’s ability to conduct any meaningful application of proportionality, without first incurring significant expense and loss of time. A recording of the webinar can be accessed here.

Leave a comment

Filed under Best Practices, eDiscovery, Uncategorized

Remote Collection: The Apple Pay of eDiscovery in a COVID-19 World

By: Craig Carpenter

I often continue doing things just because that’s the way I’ve always done them.  There is a level of comfort that comes from familiarity, and to be honest as I age I realize I can get more set in my ways (as my children often tell me), eschewing new ways of doing things – even if they are quicker or more efficient.  Sometimes it takes a major disruption to force change, as the eDiscovery market saw with accelerated adoption of Predictive Coding in the wake of the Great Recession.  This is true in many industries, including consumer products: witness the accelerated adoption of “contactless payment” like Apple Pay during the COVID-19 pandemic.  It has been available for years, but adopted mainly by younger generations while us old folks clung to credit cards and, in some cases, cash (gasp!).  But COVID-19 has changed this dynamic for many, myself included, as the prospect of touching a credit card machine is now unacceptable.  Whereas using Apple Pay was a ‘nice-to-have’ before COVID-19, it has become a ‘must-have’ now.  This type of resistance to change is arguably even more commonplace in the legal world, where convention and comfort often reign supreme.  How we have been conducting eDiscovery collection for years is a perfect example of clinging to outdated methods – but with the advent of COVID-19, this too is about to change for good.

Collection of digital evidence in legal proceedings was an implicit requirement under the Federal Rules of Civil Procedure (FRCP) long before it was codified explicitly in the 2006 amendments with the addition of Electronically Stored Information (ESI) under amended Rule 34(a)) as a “new” category.  I distinctly remember conducting discovery in 1998 and 1999 as a 3rd year law student and then 1st year associate for a Bay Area law firm: it was the proverbial “banker box” process, with all discovery in paper form.  In those days, even email messages and Word Perfect documents were simply printed out to be Bates stamped and reviewed in hard copy by hand.  Document review has always been tedious, but at least back then the volumes were significantly lower than they are these days.

During this timeframe, however, email and the dissemination of ever-greater volumes of electronic information it facilitated was exploding.  This, of course, meant that evidence (in the forensic context) and relevant information for eDiscovery was increasingly digital in nature.  So when discovery practitioners went looking for tools to help them preserve and collect digital information, where did they turn?  To the forensic world, of course, as the more stringent requirements and processes of criminal proceedings and evidence necessitated the development of such tools earlier than had been needed in civil discovery.  And if a tool was good enough for criminal proceedings, it should be plenty good enough for those in the civil world.  Thus, forensic tools like Guidance’s Encase® and AccessData’s FTK® which were built for law enforcement crossed over into the civil world.

However, the needs of the data collection process for civil discovery were and remain quite different from those of the criminal world:

  • On average civil discovery involves far more “custodians” (owners or stewards of information) than criminal proceedings, e.g. 5-15 custodians in civil matters vs. 1, maybe 2, in criminal
  • Whereas a typical criminal proceeding focuses on the communication media of one or occasionally a few alleged perpetrators (i.e. their cell phone, laptop, social media), civil discovery is typically significantly broader given the greater number of corporation applications and data repositories, including corporate email, file shares, ‘loose files’ (e.g. Word or Excel documents only stored locally), cloud storage repositories like Dropbox or Google Vault
  • Due to the larger number of custodians and typically broader data types to be searched, the volume of information in civil discovery is usually significantly greater than in a criminal proceeding
  • In handling criminal evidence there is a presumption that the alleged perpetrator may have tried to hide, alter or destroy evidence; absent very unusual circumstances, no such presumption exists in civil discovery
  • While confiscation of devices (laptops, desktops, cell phones, records) is the standard in criminal proceedings, the opposite is true in civil discovery. Custodians need their devices so they can do their jobs
  • Collection of evidence in criminal proceedings is handled by law enforcement (e.g. upon arrest or as part of a ‘dawn raid’ type of event), while the parties themselves conduct civil discovery (as a business process typically handled by legal or outsourced to service providers)

These differences were insignificant when data volumes were small and the data was relatively easy to get to, as was the case for many years.  And as the first technology on the market, forensic tools and vendors did a great job of building and defending their incumbency, through certifications, “court-cited workflows” and knowledge bases widely advertising their deep expertise in forensic collection as practiced by a cadre of forensic examiners leveraging their technical abilities into lucrative careers – thereby creating a significant barrier to entry for non-forensic eDiscovery collection tools and practitioners.

In spite of this strong incumbency, almost all corporate legal departments have long wanted a better approach to collection than forensic tools offered; many of their outside counsel have felt similarly.  They have long felt collection using forensic tools and workflows were and remain deeply flawed for eDiscovery in a number of ways:

  • Chronic overcollection: as forensic tools were built to capture all information, including things like slack space which can be important in criminal proceedings but are almost never even in scope in civil matters, the volume of data collected is far greater than needed. While service providers charging hourly professional services time and monthly per-GB hosting fees may not mind, for clients paying to collect/filter/host/review/produce knowingly unnecessary data this makes no sense and adds significant cost to the entire process, each and every time
  • Weeks or months-long process: because forensic tools must process data on a server before searching or culling it, they require physical access to a device (e.g. via a USB port). There is an option to copy entire drives with GBs of data through a VPN connection, but this approach has never worked well, if at all.  Given the coordination needed to gain physical access to devices which may be located in myriad different cities or countries, as well as the need to complete collection before paring down or even searching of data can begin, what should take hours or days instead takes weeks if not months
  • Highly disruptive: as each forensic image is being taken of each laptop or desktop, the user of each such machine must stop whatever they are doing and surrender their machine to the forensic staff for a day or more. Even if there is a spare laptop available, it will often have none of their ‘stuff’ on it.  Needless to say, this highly intrusive process makes each such worker far less productive and is very disruptive
  • “Recreating the wheel” every time: when the next matter arrives, can forensic examiners simply use the data from the last collection? Unfortunately, no, as each custodian has presumably created and received new data, necessitating the whole process from before be repeated.  Forensic collection quite literally recreates the wheel with every collection

By contrast, remote collection is designed specifically for civil eDiscovery.  It is built for a distributed workforce and requires no physical access to any devices.  A small software agent is installed on each device which creates its own local index; legal staff can then simply search this index for whatever ESI they want to find.  This distributed architecture facilitates ‘Pre-Case Assessment’, where search terms are sampled on data in-place, before any ESI is collected.  This turns the forensic collection workflow on its head, as analysis can be done from the very beginning of the preservation/collection process, allowing lawyers to gain insight far earlier in any proceeding and supporting a surgical collection process, leading to far lower data volumes (and therefore much lower eDiscovery costs).  And because remote collection can be an entirely cloud-based process, no hardware or specialized staff is required – in fact, collections can be done without IT ever being involved.

Why hasn’t the industry adopted remote collection before now?  Because everyone involved in the process except the client was benefited from it: forensic experts, service providers and forensic technology providers.  They had a strong incentive to keep things as they had always been, to the client’s detriment.  In a COVID-19 world, however, even these groups must change their workflows as physical access to devices has not only fallen out of favor – it is now impossible and perhaps even dangerous.  What remote employee would want a stranger to come to their home and take their laptop for hours?  That scenario is simply no longer an option.  Similarly to how touching a point-of-sale machine went from a minor inconvenience to a wildly irresponsible and even dangerous activity when Apple Pay is a far better approach, forensic collection in eDiscovery is in the process of giving way to remote collection.  Clients will be much better off for it.

Leave a comment

Filed under Best Practices, collection, Corporations, ESI, Information Access, Preservation & Collection, Uncategorized