Remote Collection: The Apple Pay of eDiscovery in a COVID-19 World

By: Craig Carpenter

I often continue doing things just because that’s the way I’ve always done them. There is a level of comfort that comes from familiarity, and to be honest, as I age I realize I can get more set in my ways (as my children often tell me), eschewing new ways of doing things even when they are quicker or more efficient. Sometimes it takes a major disruption to force change, as the eDiscovery market saw with accelerated adoption of Predictive Coding in the wake of the Great Recession. This is true in many industries, including consumer products: witness the accelerated adoption of “contactless payment” like Apple Pay during the COVID-19 pandemic. It has been available for years, but was adopted mainly by younger generations while we old folks clung to credit cards and, in some cases, cash (gasp!). But COVID-19 has changed this dynamic for many, myself included, as the prospect of touching a credit card machine is now unacceptable. Whereas using Apple Pay was a ‘nice-to-have’ before COVID-19, it has become a ‘must-have’ now. This type of resistance to change is arguably even more commonplace in the legal world, where convention and comfort often reign supreme. How we have been conducting eDiscovery collection for years is a perfect example of clinging to outdated methods, but with the advent of COVID-19, this too is about to change for good.

Collection of digital evidence in legal proceedings was an implicit requirement under the Federal Rules of Civil Procedure (FRCP) long before it was codified explicitly in the 2006 amendments, which added Electronically Stored Information (ESI) as a “new” category under amended Rule 34(a). I distinctly remember conducting discovery in 1998 and 1999 as a 3rd year law student and then 1st year associate for a Bay Area law firm: it was the proverbial “banker’s box” process, with all discovery in paper form. In those days, even email messages and WordPerfect documents were simply printed out to be Bates stamped and reviewed in hard copy by hand. Document review has always been tedious, but at least back then the volumes were significantly lower than they are these days.

During this timeframe, however, email and the ever-greater volumes of electronic information it helped disseminate were exploding. This, of course, meant that evidence (in the forensic context) and relevant information for eDiscovery were increasingly digital in nature. So when discovery practitioners went looking for tools to help them preserve and collect digital information, where did they turn? To the forensic world, of course, as the more stringent requirements and processes of criminal proceedings and evidence necessitated the development of such tools earlier than had been needed in civil discovery. And if a tool was good enough for criminal proceedings, it should be plenty good enough for those in the civil world. Thus, forensic tools like Guidance’s EnCase® and AccessData’s FTK®, which were built for law enforcement, crossed over into the civil world.

However, the needs of the data collection process for civil discovery were and remain quite different from those of the criminal world:

  • On average civil discovery involves far more “custodians” (owners or stewards of information) than criminal proceedings, e.g. 5-15 custodians in civil matters vs. 1, maybe 2, in criminal
  • Whereas a typical criminal proceeding focuses on the communication media of one or occasionally a few alleged perpetrators (i.e. their cell phone, laptop, social media), civil discovery is typically significantly broader given the greater number of corporate applications and data repositories, including corporate email, file shares, ‘loose files’ (e.g. Word or Excel documents only stored locally), and cloud storage repositories like Dropbox or Google Vault
  • Due to the larger number of custodians and typically broader data types to be searched, the volume of information in civil discovery is usually significantly greater than in a criminal proceeding
  • In handling criminal evidence there is a presumption that the alleged perpetrator may have tried to hide, alter or destroy evidence; absent very unusual circumstances, no such presumption exists in civil discovery
  • While confiscation of devices (laptops, desktops, cell phones, records) is the standard in criminal proceedings, the opposite is true in civil discovery. Custodians need their devices so they can do their jobs
  • Collection of evidence in criminal proceedings is handled by law enforcement (e.g. upon arrest or as part of a ‘dawn raid’ type of event), while the parties themselves conduct civil discovery (as a business process typically handled by legal or outsourced to service providers)

These differences were insignificant when data volumes were small and the data was relatively easy to get to, as was the case for many years.  And as the first technology on the market, forensic tools and vendors did a great job of building and defending their incumbency, through certifications, “court-cited workflows” and knowledge bases widely advertising their deep expertise in forensic collection as practiced by a cadre of forensic examiners leveraging their technical abilities into lucrative careers – thereby creating a significant barrier to entry for non-forensic eDiscovery collection tools and practitioners.

In spite of this strong incumbency, almost all corporate legal departments have long wanted a better approach to collection than forensic tools offered; many of their outside counsel have felt similarly. They have long felt that collection using forensic tools and workflows was and remains deeply flawed for eDiscovery in a number of ways:

  • Chronic overcollection: as forensic tools were built to capture all information, including things like slack space which can be important in criminal proceedings but are almost never even in scope in civil matters, the volume of data collected is far greater than needed. While service providers charging hourly professional services time and monthly per-GB hosting fees may not mind, for clients paying to collect/filter/host/review/produce knowingly unnecessary data this makes no sense and adds significant cost to the entire process, each and every time
  • Weeks or months-long process: because forensic tools must process data on a server before searching or culling it, they require physical access to a device (e.g. via a USB port). There is an option to copy entire drives with GBs of data through a VPN connection, but this approach has never worked well, if at all.  Given the coordination needed to gain physical access to devices which may be located in myriad different cities or countries, as well as the need to complete collection before paring down or even searching of data can begin, what should take hours or days instead takes weeks if not months
  • Highly disruptive: as each forensic image is being taken of each laptop or desktop, the user of each such machine must stop whatever they are doing and surrender their machine to the forensic staff for a day or more. Even if there is a spare laptop available, it will often have none of their ‘stuff’ on it.  Needless to say, this highly intrusive process makes each such worker far less productive and is very disruptive
  • “Recreating the wheel” every time: when the next matter arrives, can forensic examiners simply use the data from the last collection? Unfortunately, no, as each custodian has presumably created and received new data, necessitating the whole process from before be repeated.  Forensic collection quite literally recreates the wheel with every collection

By contrast, remote collection is designed specifically for civil eDiscovery.  It is built for a distributed workforce and requires no physical access to any devices.  A small software agent is installed on each device which creates its own local index; legal staff can then simply search this index for whatever ESI they want to find.  This distributed architecture facilitates ‘Pre-Case Assessment’, where search terms are sampled on data in-place, before any ESI is collected.  This turns the forensic collection workflow on its head, as analysis can be done from the very beginning of the preservation/collection process, allowing lawyers to gain insight far earlier in any proceeding and supporting a surgical collection process, leading to far lower data volumes (and therefore much lower eDiscovery costs).  And because remote collection can be an entirely cloud-based process, no hardware or specialized staff is required – in fact, collections can be done without IT ever being involved.
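The index-in-place idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: the document set, the function names `build_index` and `sample_terms`, and the file contents are all hypothetical. It shows how an agent could build a local index once and then report hit counts for candidate search terms without moving any documents off the device.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(name)
    return index


def sample_terms(index, terms):
    """Count the documents hitting each term; nothing is copied or collected."""
    return {term: len(index.get(term.lower(), set())) for term in terms}


# Hypothetical contents of one custodian's laptop (illustrative data only).
laptop = {
    "budget.xlsx": "Q3 budget forecast for the merger",
    "notes.docx": "Call notes about the merger timeline",
    "lunch.txt": "Pizza order for Friday",
}

index = build_index(laptop)
hits = sample_terms(index, ["merger", "budget", "patent"])
print(hits)  # {'merger': 2, 'budget': 1, 'patent': 0}
```

Only the hit counts travel back to the legal team at this stage, which is what allows search terms to be tested and refined before any ESI is collected.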

Why hasn’t the industry adopted remote collection before now? Because everyone involved in the process except the client benefited from the status quo: forensic experts, service providers and forensic technology providers. They had a strong incentive to keep things as they had always been, to the client’s detriment. In a COVID-19 world, however, even these groups must change their workflows, as physical access to devices has not only fallen out of favor; it is now often impossible and perhaps even dangerous. What remote employee would want a stranger to come to their home and take their laptop for hours? That scenario is simply no longer an option. Just as touching a point-of-sale machine went from a minor inconvenience to a wildly irresponsible and even dangerous activity when Apple Pay is a far better approach, forensic collection in eDiscovery is in the process of giving way to remote collection. Clients will be much better off for it.

Filed under Best Practices, collection, Corporations, ESI, Information Access, Preservation & Collection, Uncategorized

How the Remote Workforce Impacts GDPR and CCPA Compliance

By John Patzakis

While our personal and business lives will hopefully return to normal soon, COVID-19 is only accelerating the trend of an increasingly remote and distributed workforce. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements, including the GDPR and similar US-based laws.

A core requirement of both the GDPR and the similar California Consumer Privacy Act is the ability to demonstrate and prove that personal data is being protected, thus requiring information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”[1]

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization.

The UK ICO, a key government regulator that interprets and enforces the GDPR, recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO states that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to the GDPR if they contain personal data originating from the employer’s networks or processing activities.[2]

CCPA

The California Attorney General released a second, and presumably final, round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act.[3] The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems are, unlike under the GDPR, exempt from the CCPA’s scope unless those archives are restored and become active: “A business shall comply with a consumer’s request to delete their personal information by: a. Permanently and completely erasing the personal information on its existing systems with the exception of archived or back-up systems.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. The CCPA guidance broadly provides that companies must permanently delete personal information from their “existing systems.” In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California Attorney General.

Further to this point, AMLaw 100 firm Davis Wright Tremaine provides public guidance on the CCPA as follows: “Access requests may be easier for companies that maintain databases, but most companies also collect unstructured data (such as emails, images, files, etc.) related to consumers. Given that ‘personal information’ includes any information capable of being associated with a consumer or a household, requests will encompass a wide range of data that a business possesses.”[4]

So to achieve GDPR and CCPA compliance, organizations must not only ensure that explicit policies and procedures are in place for handling personal information, but also be able to prove that those policies and procedures are being followed and operationally enforced. The new normal of remote workforces is a critical challenge that must be addressed.

What has always been needed is immediate visibility into unstructured distributed data across the enterprise, including data on laptops and other sources maintained by remote workforces, through the ability to search and report across several thousand endpoints and other unstructured data sources and return results within minutes instead of days or weeks. The need for such an operational capability, provided by best practices technology, is further heightened by the urgency of CCPA and GDPR compliance.

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces.  X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across up to thousands of distributed endpoints and data servers from a central location.  Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks.

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.

NOTES:

[1] https://ico.org.uk/media/about-the-ico/consultations/2616442/right-of-access-draft-consultation-20191204.pdf

[2] Id.

[3] https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/ccpa-text-of-second-set-clean-031120.pdf?

[4] https://www.dwt.com/blogs/privacy--security-law-blog/2019/07/consumer-rights-under-to-ccpa-part-1-what-are-they

Filed under Best Practices, CaCPA, compliance, Data Audit, GDPR, Uncategorized

Remote ESI Collection and Data Audits in the Time of Social Distancing

By John Patzakis

The vital global effort to contain the COVID-19 pandemic will likely disrupt our lives and workflows for some time. While our personal and business lives will hopefully return to normal soon, the trend of an increasingly remote and distributed workforce is here to stay. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements.

From an eDiscovery perspective, the legacy manual collection workflow involving travel, physical access and one-time mass collection of custodian laptops, file servers and email accounts is a non-starter under current travel ban and social distancing policies, and does not scale for the new era of remote and distributed workforces going forward. In addition to the public health constraints, manual collection efforts are expensive, disruptive and time-consuming, as an “overkill” forensic image collection process is often employed, substantially driving up eDiscovery costs.

When it comes to technical approaches, endpoint forensic crawling methods are now a non-starter. Network bandwidth constraints, coupled with the requirement to migrate all endpoint data back to the forensic crawling tool, render the approach ineffective, especially with remote workers needing to VPN into a corporate network. Right now, corporate network bandwidth is at a premium, and the last thing a company needs is its network shut down by inefficient remote forensic tools.

For example, with a forensic crawling tool, to search a custodian’s laptop with 10 gigabytes of email and documents, all 10 gigabytes must be copied and transmitted over the network, where it is then searched, all of which takes at least several hours per computer. So, most organizations choose to force collect all 10 gigabytes. The case of U.S. ex rel. McBride v. Halliburton Co., 272 F.R.D. 235 (2011), illustrates this specific pain point well. In McBride, Magistrate Judge John Facciola’s instructive opinion outlines Halliburton’s eDiscovery struggles to collect and process data from remote locations:

“Since the defendants employ persons overseas, this data collection may have to be shipped to the United States, or sent by network connections with finite capacity, which may require several days just to copy and transmit the data from a single custodian . . . (Halliburton) estimates that each custodian averages 15–20 gigabytes of data, and collection can take two to ten days per custodian. The data must then be processed to be rendered searchable by the review tool being used, a process that can overwhelm the computer’s capacity and require that the data be processed by batch, as opposed to all at once.”

Halliburton represented to the court that they spent hundreds of thousands of dollars on eDiscovery for only a few dozen remotely located custodians. The need to force-collect the remote custodians’ entire set of data and then sort it out through the expensive eDiscovery processing phase, instead of culling, filtering and searching the data at the point of collection drove up the costs.
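The arithmetic behind these transfer times is easy to reproduce. The sketch below is a rough estimate only: the function name `transfer_hours` and the link speeds are hypothetical assumptions, and it ignores protocol overhead, VPN throttling and processing time, all of which would lengthen real collections. It converts a data volume in decimal gigabytes into hours at a sustained link rate.

```python
def transfer_hours(gigabytes, megabits_per_second):
    """Hours needed to move `gigabytes` (decimal GB) at a sustained link rate."""
    megabits = gigabytes * 8 * 1000  # 1 GB = 8,000 megabits
    return megabits / megabits_per_second / 3600


# Hypothetical sustained link speeds; real-world throughput will vary.
for mbps in (5, 20, 100):
    ten = transfer_hours(10, mbps)
    twenty = transfer_hours(20, mbps)
    print(f"{mbps:>3} Mbps: 10 GB in {ten:.1f} h, 20 GB in {twenty:.1f} h")
```

Multiplying the per-custodian figure by the number of custodians shows why copying everything before searching scales poorly, and why collecting only the responsive subset reduces transfer time in direct proportion to the data left behind.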

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces.  X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across up to thousands of distributed endpoints and data servers from a central location.  Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks. The key to X1’s scalability is its unique ability to index and search data in place, thereby enabling a highly detailed and iterative search and analysis, and then only collecting data responsive to those steps.

X1DD operates on-demand where your data currently resides — on desktops, laptops, servers, or even the cloud — without disruption to business operations and without requiring extensive or complex hardware configurations. After indexing of systems has completed (typically a few hours to a day depending on data volumes), clients and their outside counsel or service provider may then:

  • Conduct Boolean and keyword searches of relevant custodial data sources for ESI, returning search results within minutes by custodian, file type and location.
  • Preview any document in-place, before collection, including any or all documents with search hits.
  • Remotely collect and export responsive ESI from each system directly into a Relativity® or RelativityOne® workspace for processing, analysis and review or any other processing or review platform via standard load file. Export text and metadata only or full native files.
  • Export responsive ESI directly into other analytics engines, e.g. Brainspace®, H5® or any other platform that accepts a standard load file.
  • Conduct iterative “search/analyze/export-into-Relativity” processes as frequently and as many times as desired.
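The search-then-export workflow in the list above can be illustrated with a short Python sketch. Everything here is a hypothetical stand-in rather than the X1DD or Relativity API: the hit records, the `filter_hits` and `to_load_file` helpers, and the field names are invented for illustration. The delimiters follow the common Concordance-style convention (ASCII 20 between fields, þ around values), which is an assumption about what a "standard load file" means in a given workflow.

```python
import csv
import io
from datetime import date

# Hypothetical search hits returned from custodians' machines (illustrative only).
hits = [
    {"doc_id": "DOC0001", "custodian": "asmith", "file_type": "docx",
     "modified": date(2020, 1, 15), "path": r"C:\Users\asmith\merger.docx"},
    {"doc_id": "DOC0002", "custodian": "asmith", "file_type": "xlsx",
     "modified": date(2018, 6, 2), "path": r"C:\Users\asmith\old-budget.xlsx"},
    {"doc_id": "DOC0003", "custodian": "bjones", "file_type": "msg",
     "modified": date(2020, 2, 3), "path": r"C:\Users\bjones\re-merger.msg"},
]


def filter_hits(rows, file_types, start, end):
    """Keep only hits of the wanted file types modified within the date range."""
    return [r for r in rows
            if r["file_type"] in file_types and start <= r["modified"] <= end]


def to_load_file(rows):
    """Serialize rows as delimited text using assumed Concordance-style delimiters."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\x14", quotechar="\u00fe",
                        quoting=csv.QUOTE_ALL)
    writer.writerow(["DOCID", "CUSTODIAN", "FILETYPE", "MODIFIED", "PATH"])
    for r in rows:
        writer.writerow([r["doc_id"], r["custodian"], r["file_type"],
                         r["modified"].isoformat(), r["path"]])
    return buffer.getvalue()


responsive = filter_hits(hits, {"docx", "msg"}, date(2020, 1, 1), date(2020, 12, 31))
load_file = to_load_file(responsive)
print(len(responsive), "responsive documents")
```

Because the filter runs before anything is exported, the loop of searching, analyzing and exporting can be repeated with different parameters, and only the documents that survive each pass are ever collected.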

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.

Filed under Best Practices, Case Law, Case Study, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, Information Governance, Preservation & Collection, Relativity

Court Compels Forensic Imaging of Custodian Computer, Imposes Sanctions Due to Non-Defensible eDiscovery Preservation Process

By John Patzakis

HealthPlan Servs., Inc. v. Dixit, et al., 2019 WL 6910139 (M.D. Fla. Dec. 19, 2019), is an important eDiscovery case addressing what is required and expected from organizations to comply with electronic evidence collection requirements. In this copyright infringement and breach of contract case, a Federal Magistrate Judge granted the plaintiff’s motion to compel immediate inspection of the laptop of defendant employee Feron Kutsomarkos after the defendants failed to properly preserve and collect evidence from her. The plaintiff’s motion set forth specific improprieties in the defendants’ ESI preservation process. The Court also granted the plaintiff’s motion for fees, sanctions, and a punitive jury instruction.

 

There are several key takeaways from this case. Here are the top 5:

  1. Custodian Self-Collection Is Not Defensible

Ms. Kutsomarkos conducted her own search of the emails rather than having an expert or trained IT or legal staff overseen by her attorney perform the search. The court found this process not to be defensible, as the production “should have come from a professional search of the laptop” instead. This is yet another case disapproving of this faulty practice. For instance, another company found itself on the wrong end of a $3 million sanctions penalty for spoliation of evidence because it improperly relied on custodians to search and collect their own data. See GN Netcom, Inc. v. Plantronics, Inc., No. 12-1318-LPS, 2016 U.S. Dist. LEXIS 93299 (D. Del. July 12, 2016). Even with effective monitoring, severe defensibility concerns plague custodian self-collection, with several courts disapproving of the practice due to poor compliance and inconsistency of results. See Green v. Blitz, 2011 WL 806011 (E.D. Tex. Mar. 1, 2011); Nat’l Day Laborer Org. v. U.S. Immigration and Customs Enforcement Agency, 2012 WL 2878130 (S.D.N.Y. July 13, 2012).

  2. Producing Party Expected to Produce Their Own Data in a Defensible Manner

When responding to a litigation discovery request, the producing party is afforded the opportunity to produce their own data. However, the process must be defensible with a requisite degree of transparency and validation. When an organization does not have a systematic and repeatable process in place, the risks and costs associated with eDiscovery increase exponentially.  Good attorneys and the eDiscovery professionals who work with them will not only ensure their client complies with their own eDiscovery requirements, but will also scrutinize the opponent’s process and gain a critical advantage when the opponent fails to meet their obligations.

And that is what happened here. The corporate defendants had no real process other than telling key custodians to search and collect their own data. The eDiscovery-savvy plaintiff counsel filed motions poking large holes in the defendant’s process and won a likely case-deciding ruling. The stakes are high in such litigation matters and it is incumbent upon counsel to have a high degree of eDiscovery competence for both defensive and offensive purposes.

  3. Forensic Imaging Is the Exception, Not the Rule

The court compelled the forensic imaging of a defendant’s laptop, but only as a punitive measure after determining bad faith non-compliance. Section 8c of The Sedona Principles, Third Edition: Best Practices, Recommendations & Principles for Addressing Electronic Document Production, provides that: “Forensic data collection requires intrusive access to desktop, server, laptop, or other hard drives or media storage devices.”  While noting the practice is acceptable in some limited circumstances, “making a forensic copy of computers is only the first step of an expensive, complex, and difficult process of data analysis . . . it should not be required unless circumstances specifically warrant the additional cost and burden and there is no less burdensome option available.”  The duty to preserve evidence, including ESI, extends only to relevant information. Parties that comply with discovery requirements will avoid burdensome and risk-laden forensic imaging.

  4. Metadata Must Be Preserved

Metadata is required to be produced intact when designated by the requesting party, which is now commonplace. (See Federal Rule of Civil Procedure 34(b)(1)(C).) Metadata is often relevant evidence itself and is also needed for accurate eDiscovery culling, processing and analysis. In her production, counsel for defendant Kutsomarkos provided PDF versions of documents from her laptop. However, the court found that “the pdf files scrubbed the metadata from the documents and that metadata should be available on the hard drives.” There are defensible and very cost-effective ways to collect and preserve metadata. They were not used by the defendants, to their great detriment.
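As one illustration of what preserving metadata can look like in practice, the Python sketch below copies a native file with its file system timestamps intact and records a hash for later verification. It is a simplified example under stated assumptions, not a description of any product or of what the court required: the `collect` function, the audit record fields and the sample file are hypothetical, and a real process would also capture details such as who ran the collection and when.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect(source, dest_dir):
    """Copy a native file with timestamps preserved and return an audit record."""
    before = os.stat(source)
    source_hash = sha256_of(source)
    # shutil.copy2 copies content plus metadata such as modification time
    # where the platform allows it.
    destination = shutil.copy2(source, dest_dir)
    return {
        "source": source,
        "destination": destination,
        "size": before.st_size,
        "modified": before.st_mtime,
        "sha256": source_hash,
        "verified": sha256_of(destination) == source_hash,
    }


# Demonstration with a throwaway file standing in for a custodian document.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "memo.txt")
    with open(src, "w") as handle:
        handle.write("draft agreement")
    os.utime(src, (1_500_000_000, 1_500_000_000))  # give it an older timestamp
    out_dir = os.path.join(work, "collected")
    os.mkdir(out_dir)
    record = collect(src, out_dir)
    mtime_preserved = os.stat(record["destination"]).st_mtime == record["modified"]
    print(record["verified"], mtime_preserved)
```

Because the native file is copied byte for byte, any metadata embedded in the document travels with it, which is exactly what is lost when documents are converted to PDF before production.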

  5. A Defensible But Streamlined Process Is Optimal

HealthPlan Services is yet another court decision underscoring the importance of a well-designed, cost-effective and defensible eDiscovery collection process. Such a capability is only attainable with the right enterprise technology. With X1 Distributed Discovery (X1DD), parties can perform targeted search and collection of the ESI of hundreds of endpoints over the internal network without disrupting operations. The search results are returned in minutes, not weeks, and thus can be highly granular and iterative, based upon multiple keywords, date ranges, file types, or other parameters. This approach typically reduces eDiscovery collection and processing costs by at least one order of magnitude (90%), thereby bringing much needed feasibility to enterprise-wide eDiscovery collection that can save organizations millions while improving compliance by maintaining metadata, generating audit logs and establishing chain of custody.

And in line with concepts outlined in HealthPlan Services, X1DD provides a repeatable, verifiable and documented process for the requisite defensibility. For a demonstration or briefing on X1 Distributed Discovery, please contact us.

Filed under Best Practices, Case Law, eDiscovery, Enterprise eDiscovery, ESI, Uncategorized