Category Archives: Enterprise eDiscovery

Remote ESI Collection and Data Audits in the Time of Social Distancing

By John Patzakis

The vital global effort to contain the COVID-19 pandemic will likely disrupt our lives and workflows for some time. While our personal and business lives will hopefully return to normal soon, the trend of an increasingly remote and distributed workforce is here to stay. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements.

From an eDiscovery perspective, the legacy manual collection workflow involving travel, physical access and one-time mass collection of custodian laptops, file servers and email accounts is a non-starter under current travel ban and social distancing policies, and does not scale for the new era of remote and distributed workforces going forward. In addition to the public health constraints, manual collection efforts are expensive, disruptive and time-consuming as many times an “overkill” method of forensic image collection process is employed, thus substantially driving up eDiscovery costs.

When it comes to technical approaches, endpoint forensic crawling methods are now a non-starter. Network bandwidth constraints, coupled with the requirement to migrate all endpoint data back to the forensic crawling tool, render the approach ineffective, especially with remote workers needing to VPN into a corporate network. Right now, corporate network bandwidth is at a premium, and the last thing a company needs is its network shut down by inefficient remote forensic tools.

For example, with a forensic crawling tool, to search a custodian’s laptop with 10 gigabytes of email and documents, all 10 gigabytes must be copied and transmitted over the network, where it is then searched, a process that takes at least several hours per computer. So most organizations choose to force-collect all 10 gigabytes. The case of U.S. ex rel. McBride v. Halliburton Co., 272 F.R.D. 235 (D.D.C. 2011), illustrates this specific pain point well. In McBride, Magistrate Judge John Facciola’s instructive opinion outlines Halliburton’s eDiscovery struggles to collect and process data from remote locations:
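The arithmetic behind this bottleneck is straightforward. The VPN bandwidth figures in the sketch below are illustrative assumptions, not measurements:

```python
# Back-of-envelope estimate of full-copy collection time over a VPN.
def transfer_hours(gigabytes: float, mbps: float) -> float:
    """Hours to move `gigabytes` of data over a link of `mbps` megabits/sec."""
    bits = gigabytes * 8 * 1000**3      # decimal gigabytes -> bits
    seconds = bits / (mbps * 1000**2)   # megabits/sec -> bits/sec
    return seconds / 3600

# One custodian's 10 GB of email and documents over a 50 Mbps VPN share:
print(round(transfer_hours(10, 50), 1))   # 0.4 hours at full, uncontended speed
# The same custodian on a congested 5 Mbps share:
print(round(transfer_hours(10, 5), 1))    # 4.4 hours
```

And that is the best case for a single custodian at sustained full bandwidth; contention across dozens of simultaneous remote collections stretches these figures into the multi-day ranges the McBride opinion describes.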

“Since the defendants employ persons overseas, this data collection may have to be shipped to the United States, or sent by network connections with finite capacity, which may require several days just to copy and transmit the data from a single custodian . . . (Halliburton) estimates that each custodian averages 15–20 gigabytes of data, and collection can take two to ten days per custodian. The data must then be processed to be rendered searchable by the review tool being used, a process that can overwhelm the computer’s capacity and require that the data be processed by batch, as opposed to all at once.”

Halliburton represented to the court that they spent hundreds of thousands of dollars on eDiscovery for only a few dozen remotely located custodians. The need to force-collect the remote custodians’ entire set of data and then sort it out through the expensive eDiscovery processing phase, instead of culling, filtering and searching the data at the point of collection drove up the costs.

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces. X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across up to thousands of distributed endpoints and data servers from a central location. Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks. The key to X1’s scalability is its unique ability to index and search data in place, thereby enabling a highly detailed and iterative search and analysis, and then collecting only the data responsive to those steps.

X1DD operates on-demand where your data currently resides — on desktops, laptops, servers, or even the cloud — without disruption to business operations and without requiring extensive or complex hardware configurations. After indexing of systems has completed (typically a few hours to a day depending on data volumes), clients and their outside counsel or service provider may then:

  • Conduct Boolean and keyword searches of relevant custodial data sources for ESI, returning search results within minutes by custodian, file type and location.
  • Preview any document in-place, before collection, including any or all documents with search hits.
  • Remotely collect and export responsive ESI from each system directly into a Relativity® or RelativityOne® workspace for processing, analysis and review or any other processing or review platform via standard load file. Export text and metadata only or full native files.
  • Export responsive ESI directly into other analytics engines, e.g. Brainspace®, H5® or any other platform that accepts a standard load file.
  • Conduct iterative “search/analyze/export-into-Relativity” processes as frequently and as many times as desired.
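The iterative search-then-collect workflow above can be sketched in miniature. The data and helper functions below are hypothetical stand-ins for illustration only, not X1’s actual API:

```python
# Hypothetical, simplified stand-in for a search-in-place workflow.
documents = [
    {"custodian": "akim", "path": "c:/docs/q3_forecast.xlsx", "text": "revenue forecast q3"},
    {"custodian": "bliu", "path": "c:/docs/notes.txt", "text": "lunch schedule friday"},
    {"custodian": "akim", "path": "c:/mail/0041.eml", "text": "q3 revenue numbers attached"},
]

def search(index, terms):
    """Return hits from the in-place index without moving any native files."""
    return [d for d in index if all(t in d["text"] for t in terms)]

def collect(hits):
    """Only now export the responsive items (here, just their paths)."""
    return [d["path"] for d in hits]

# Iterate: broad search first, preview the hit count, then narrow and collect.
print(len(search(documents, ["q3"])))                 # 2 hits -- preview before collecting
print(collect(search(documents, ["q3", "revenue"])))  # export only responsive items
```

The point of the sketch is the ordering: searching and previewing happen against the in-place index, and nothing is transferred until the final, narrowed `collect` step.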

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.


Filed under Best Practices, Case Law, Case Study, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, Information Governance, Preservation & Collection, Relativity

Court Compels Forensic Imaging of Custodian Computer, Imposes Sanctions Due to Non-Defensible eDiscovery Preservation Process

By John Patzakis

HealthPlan Servs., Inc. v. Dixit, et al., 2019 WL 6910139 (M.D. Fla. Dec. 19, 2019), is an important eDiscovery case addressing what is required and expected from organizations to comply with electronic evidence discovery collection requirements. In this copyright infringement and breach of contract case, a Federal Magistrate Judge granted the plaintiff’s motion to compel immediate inspection of defendant employee Feron Kutsomarkos’s laptop after the defendants failed to properly preserve and collect evidence from her. The plaintiff’s motion set forth specific improprieties in the defendants’ ESI preservation process. The Court also granted the plaintiff’s motion for fees, sanctions, and a punitive jury instruction.

 

There are several key takeaways from this case. Here are the top 5:

  1. Custodian Self-Collection Is Not Defensible

Ms. Kutsomarkos conducted her own search of the emails rather than having an expert or trained IT or legal staff overseen by her attorney perform the search. The court found this process not to be defensible, as the production “should have come from a professional search of the laptop” instead. This is yet another case disapproving of this faulty practice. For instance, another company found themselves on the wrong end of a $3 million sanctions penalty for spoliation of evidence because they improperly relied on custodians to search and collect their own data. See GN Netcom, Inc. v. Plantronics, Inc., No. 12-1318-LPS, 2016 U.S. Dist. LEXIS 93299 (D. Del. July 12, 2016). Even with effective monitoring, severe defensibility concerns plague custodian self-collection, with several courts disapproving of the practice due to poor compliance and inconsistency of results. See Green v. Blitz, 2011 WL 806011 (E.D. Tex. Mar. 1, 2011); Nat’l Day Laborer Org. v. U.S. Immigration and Customs Enforcement Agency, 2012 WL 2878130 (S.D.N.Y. July 13, 2012).

  2. Producing Party Expected to Produce Their Own Data in a Defensible Manner

When responding to a litigation discovery request, the producing party is afforded the opportunity to produce their own data. However, the process must be defensible with a requisite degree of transparency and validation. When an organization does not have a systematic and repeatable process in place, the risks and costs associated with eDiscovery increase exponentially.  Good attorneys and the eDiscovery professionals who work with them will not only ensure their client complies with their own eDiscovery requirements, but will also scrutinize the opponent’s process and gain a critical advantage when the opponent fails to meet their obligations.

And that is what happened here. The corporate defendants had no real process other than telling key custodians to search and collect their own data. The eDiscovery-savvy plaintiff counsel filed motions poking large holes in the defendant’s process and won a likely case-deciding ruling. The stakes are high in such litigation matters and it is incumbent upon counsel to have a high degree of eDiscovery competence for both defensive and offensive purposes.

  3. Forensic Imaging Is the Exception, Not the Rule

The court compelled the forensic imaging of a defendant’s laptop, but only as a punitive measure after determining bad faith non-compliance. Section 8c of The Sedona Principles, Third Edition: Best Practices, Recommendations & Principles for Addressing Electronic Document Production, provides that: “Forensic data collection requires intrusive access to desktop, server, laptop, or other hard drives or media storage devices.”  While noting the practice is acceptable in some limited circumstances, “making a forensic copy of computers is only the first step of an expensive, complex, and difficult process of data analysis . . . it should not be required unless circumstances specifically warrant the additional cost and burden and there is no less burdensome option available.”  The duty to preserve evidence, including ESI, extends only to relevant information. Parties that comply with discovery requirements will avoid burdensome and risk-laden forensic imaging.

  4. Metadata Must Be Preserved

Metadata is required to be produced intact when designated by the requesting party, which is now commonplace. (See Federal Rule of Civil Procedure 34(b)(1)(C).) Metadata is often relevant evidence itself and is also needed for accurate eDiscovery culling, processing and analysis. In her production, counsel for defendant Kutsomarkos provided PDF versions of documents from her laptop. However, the court found that “the pdf files scrubbed the metadata from the documents and that metadata should be available on the hard drives.” There are defensible and very cost-effective ways to collect and preserve metadata. They were not used by the defendants, to their great detriment.

  5. A Defensible But Streamlined Process Is Optimal

HealthPlan Services is yet another court decision underscoring the importance of a well-designed, cost-effective and defensible eDiscovery collection process. Such a capability is only attainable with the right enterprise technology. With X1 Distributed Discovery (X1DD), parties can perform targeted search and collection of the ESI of hundreds of endpoints over the internal network without disrupting operations. The search results are returned in minutes, not weeks, and thus can be highly granular and iterative, based upon multiple keywords, date ranges, file types, or other parameters. This approach typically reduces eDiscovery collection and processing costs by at least one order of magnitude (90%), thereby bringing much needed feasibility to enterprise-wide eDiscovery collection that can save organizations millions while improving compliance by maintaining metadata, generating audit logs and establishing chain of custody.

And in line with concepts outlined in HealthPlan Services, X1DD provides a repeatable, verifiable and documented process for the requisite defensibility. For a demonstration or briefing on X1 Distributed Discovery, please contact us.


Filed under Best Practices, Case Law, eDiscovery, Enterprise eDiscovery, ESI, Uncategorized

CaCPA Compliance Requires Effective Investigation and eDiscovery Capabilities

By John Patzakis

The California Consumer Privacy Act (CaCPA), which will be in full force on January 1, 2020, promises to profoundly impact major US and global organizations, requiring the overhaul of their data audit, investigation and information governance processes. The CaCPA requires that an organization have absolute knowledge of where all personal data of California residents is stored across the enterprise, and be able to remove it when required. Many organizations with a global reach will be obligated to comply with both the GDPR and CaCPA, providing ample justification to bolster their compliance efforts.


According to data security and privacy attorney Patrick Burke, who was recently a senior New York State financial regulator overseeing cybersecurity compliance before heading up the data privacy law practice at Phillips Nizer, CaCPA compliance effectively requires a robust digital investigation capability. Burke, speaking in a webinar earlier this month, noted that under the “CaCPA, California residents can request that all data an enterprise holds on them be identified and also be removed. Organizations will be required to establish a capability to respond to such requests. Actual demonstrated compliance will require the ability to search across all data sources in the enterprise for data, including distributed unstructured data located on desktops and file servers.” Burke further noted that organizations must be prepared to produce “electronic evidence to the California AG, which must determine whether there was a violation of CaCPA…as well as evidence of non-violation (for private rights of action) and of a ‘cure’ to the violation.”

The CaCPA contains provisions similar to those of the GDPR; both specify processes and capabilities organizations must have in place to ensure the personal data of EU and California residents is secure, accessible, and can be identified upon request. These common requirements, enumerated below, can only be met through an effective enterprise eDiscovery search capability:

  • Data minimization: Under both the CaCPA and the GDPR, enterprises should collect and retain as little personal data on California residents and EU data subjects as possible. As an example, Patrick Burke, who routinely advises his legal clients on these regulations, notes that unauthorized “data stashes” maintained by employees on their distributed unstructured data sources are a key problem, requiring companies to search all endpoints to identify information including European phone numbers, European email address domains and other personally identifiable information.
  • Enforcement of right to be forgotten: An individual’s personal data must be identified and deleted on request.
  • Effective incident response: If there is a compromise of personal data, an organization must have the ability to perform enterprise-wide data searches to determine and report on the extent of such breaches and resulting data compromise within seventy-two (72) hours under the GDPR. There are less stringent, but similar CaCPA requirements.
  • Accountability: Log and provide audit trails for all personal data identification requests and remedial actions.
  • Enterprise-wide data audit: Identify the presence of personal data in all data locations and delete unneeded copies of personal data.
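As a minimal illustration of the kind of enterprise-wide scan these requirements imply, the sketch below uses deliberately simplified regular expressions for a few EU phone prefixes and email domains. Real PII detection requires far more robust patterns and false-positive validation:

```python
import re

# Deliberately simplified patterns for illustration only.
EU_PHONE = re.compile(r"\+(?:49|33|34|39|31)[\s\d\-]{7,}")             # a few EU country codes
EU_EMAIL = re.compile(r"\b[\w.+-]+@[\w.-]+\.(?:de|fr|es|it|nl|eu)\b")  # a few EU TLDs

def scan(text):
    """Flag possible EU phone numbers and email addresses in free text."""
    return {"phones": [p.strip() for p in EU_PHONE.findall(text)],
            "emails": EU_EMAIL.findall(text)}

sample = "Contact Hans at hans.mueller@firma.de or +49 30 1234567 re: invoice."
print(scan(sample))
```

In practice such patterns would run against the in-place index of every endpoint, so that the personal data can be located, reported on, and deleted without first copying the underlying files to a central location.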

Overall, a core requirement of both CaCPA and GDPR compliance is the ability to demonstrate and prove that personal data is being protected, requiring information governance capabilities that allow companies to efficiently produce the documentation and other information necessary to respond to auditors’ requests. Many consultants and other advisors are helping companies establish privacy compliance programs, and are documenting policies and procedures that are being put in place.

However, while policies, procedures and documentation are important, such compliance programs are ultimately hollow without consistent, operational execution and enforcement. CIOs and legal and compliance executives often aspire to implement information governance programs like defensible deletion and data audits to detect risks and remediate non-compliance. However, without an actual and scalable technology platform to effectuate these goals, those aspirations remain just that. For instance, recent IDG research suggests that approximately 70% of information stored by companies is “dark data” that is in the form of unstructured, distributed data that can pose significant legal and operational risks.

To achieve GDPR and CaCPA compliance, organizations must ensure that explicit policies and procedures are in place for handling personal information, and just as important, the ability to prove that those policies and procedures are being followed and operationally enforced. What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, through the ability to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of days or weeks. The need for such an operational capability provided by best practices technology is further heightened by the urgency of CaCPA and GDPR compliance.

A link to the recording of the recent webinar “Effective Incident Response Under GDPR and CaCPA” is available here.

 


Filed under CaCPA, compliance, Data Audit, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, GDPR, Records Management, Uncategorized

How Case Teams Can Streamline Collections with X1 in RelativityOne

Editor’s Note: This article originally appeared on The Relativity Blog. It is reprinted here in full with permission.

by Sam Bock on November 07, 2019

Our September 2019 release for RelativityOne debuted some game-changing functionality in the platform. Collect for RelativityOne enables fast, secure, and defensible collections right within the cloud, allowing RelativityOne users to pull data directly from Microsoft Office 365 without ever leaving the platform or Azure.

One of our developer partners, X1, joined up with us on building this functionality, bringing their patented technology into Collect to help simplify traditionally complex workflows.

To get a better picture of just what Collect and X1 Distributed Discovery are capable of now that they’ve teamed up, we sat down with X1 Executive Chairman and Chief Legal Officer John Patzakis. Check out the most impactful takeaways from our conversation, and sign up for X1’s upcoming webinar to learn more.

Sam: What makes collection challenging for today’s legal teams?

John: Traditional e-discovery collection methods consist of either unsupervised custodian self-collection or manual services, driving up costs while increasing risk and disruption to business operations. On the other end of the spectrum, endpoint forensic imaging is burdensome, expensive, and not legally required for civil litigation discovery. Additionally, these manual and disjointed efforts are not technically integrated with Relativity, thus requiring multiple hand-offs, which increases risk, expense, and cumbersome project management efforts.

How does your team think creatively to tackle those challenges in the interest of conducting faster, more defensible collections for your customers?

We tackle collection from the enterprise and also enable significant scalability. X1 Distributed Discovery enables enterprises and their service providers to search, assess, and analyze electronically stored information (ESI) across hundreds or even thousands of custodians, enterprise-wide, where the data resides and before collection, with direct upload into Relativity. Instead of the expensive and disruptive “image then stage then process then load into review workspace” process, X1 Distributed Discovery allows for access to ESI where it sits within hours.

What sorts of variables exist in today’s collection workflows, and how does your team accommodate for those differences?

One of the biggest challenges with modern enterprise ESI collection comes from remote employees who only log into the network intermittently. Most network-enabled collection tools require custodians to be on the domain in order to work. However, X1 is architected to feature SSL security certificates—creating secure tunnels that enable collection from custodians wherever they are, including on WiFi in a Starbucks or on a plane.

Another key challenge is email collection. Traditional workflows often require collecting an entire PST email container or Exchange email account back to a central location for processing, identification, and preservation of potentially responsive email messages. This approach involves the transferring and processing of large files, which takes a lot of time, before even beginning to identify individually responsive email messages. Our solution eliminates the need to transfer entire email containers by allowing the identification and collection of individual messages in place on a custodian’s computer.

How is Collect for RelativityOne built to manage modern collections more effectively?

Collect integrates the X1 Distributed Discovery architecture to leverage patented search technology that indexes Microsoft Office 365 data directly on the laptop, desktop, or file server, allowing e-discovery, investigatory, or forensic professionals to globally query thousands of individual endpoints simultaneously. Individual emails and files can be identified by keyword, dates, and other metadata content without having to first retrieve the entire PST or ZIP across the network.

Collecting enterprise ESI can be one of the most daunting parts of the e-discovery process, and X1’s technical integration with RelativityOne seeks to make it less intimidating. The software helps streamline the e-discovery workflow by eliminating expensive and cumbersome processing steps and dramatically increasing speed to review. Collect for RelativityOne provides legal teams with a solution that compresses project timeframes; reduces risk by integrating collection with the rest of Relativity’s suite of features for review and analysis; and creates a repeatable process that helps reduce overall efforts and costs that might otherwise be spent outside of the platform. Additionally, the tight integration between X1’s technology and Relativity provides a unified chain of custody for optimal defensibility.

In short, we’re excited to see how this functionality, built into Relativity’s collection tool, can help revolutionize the current e-discovery process by collapsing the many hand-offs involved in the EDRM into a few short steps manageable by one or two people.

What tips and best practices would you share with a team conducting complex collections? How can they set themselves up for success from the start?

When collecting data, plan your collection criteria carefully. Focus on granular search criteria including file types, date ranges, and other key metadata in addition to detailed Boolean search terms to help your team strategically reduce collection volumes.
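A pre-collection filter combining those criteria might look like the following sketch. The field names and criteria here are hypothetical, not a real collection tool’s schema:

```python
from datetime import date

# Illustrative pre-collection filter; field names and criteria are hypothetical.
criteria = {
    "file_types": {".docx", ".xlsx", ".eml"},
    "date_range": (date(2019, 1, 1), date(2019, 12, 31)),
    "terms_all": ["contract", "renewal"],       # Boolean AND of these terms
}

def responsive(item, c):
    """True only if an item meets every criterion, so only it gets collected."""
    lo, hi = c["date_range"]
    return (item["ext"] in c["file_types"]
            and lo <= item["modified"] <= hi
            and all(t in item["text"].lower() for t in c["terms_all"]))

item = {"ext": ".docx", "modified": date(2019, 6, 3),
        "text": "Contract renewal terms for FY2019"}
print(responsive(item, criteria))   # True -- this item would be collected
```

Each additional criterion narrows the responsive set before anything leaves the endpoint, which is precisely how granular planning reduces collection volumes.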

Sam Bock is a member of the marketing team at Relativity, and serves as editor of The Relativity Blog.


Filed under collection, Corporations, eDiscovery, Enterprise eDiscovery, Uncategorized

In-Place Data Analytics For Unstructured Data is No Longer Science Fiction

By John Patzakis

AI-driven analytics supercharges compliance investigations, data security, privacy audits and eDiscovery document review. AI machine learning employs mathematical models to assess enormous datasets and “learn” from feedback and exposure to gain deep insights into key information. This enables the identification of discrete and hidden patterns in millions of emails and other electronic files to categorize and cluster documents by concepts, content, or topic. This process goes beyond keyword searching to identify anomalies, internal threats, or other indicators of relevant behavior. The enormous volume and scope of corporate data being generated have created numerous opportunities for investigators seeking deep information insights in support of internal compliance, civil litigation and regulatory matters.

The most effective uses of AI in investigations couple continuous active learning technology with concept clustering to discover the most relevant data in documents, emails, text and other sources. As AI continues to learn and improve over time, the benefits of an effectively implemented approach will also increase. In-house and outside counsel and compliance teams are now relying on AI technology not only in response to government investigations, but also increasingly to identify risks before they escalate to that stage.
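As a toy illustration of the concept-clustering half of that pairing, the sketch below compares documents by cosine similarity of simple word-count vectors. Production analytics engines use far richer models, but the grouping intuition is the same:

```python
from collections import Counter
from math import sqrt

# Toy concept-clustering sketch: documents whose vectors point in similar
# directions (high cosine similarity) get grouped under the same concept.
def vec(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = lambda v: sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

docs = {
    "d1": "wire transfer to offshore account",
    "d2": "offshore wire transfer approved",
    "d3": "team lunch scheduled friday",
}
v = {k: vec(t) for k, t in docs.items()}
print(round(cosine(v["d1"], v["d2"]), 2))   # 0.67 -- same concept, clusters together
print(round(cosine(v["d1"], v["d3"]), 2))   # 0.0  -- unrelated
```

Active learning then refines these groupings over time: reviewer feedback on which clusters proved relevant feeds back into the model for the next pass.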


However, logistical and cost barriers have traditionally stymied organizations from taking advantage of AI on a systematic and proactive basis, especially regarding unstructured data, which, according to industry studies, constitutes 80 percent or more of all data (and data risk) in the enterprise. Before analytics engines can ingest the text from documents and emails, the extracted text must be “mined” from the native originals, and the natives must first be collected and migrated to a centralized processing appliance. This arduous process is expensive and time consuming, particularly in the case of unstructured data, which must be collected from the “wild” and then migrated to a central location, creating a stand-alone “data lake.”

Due to these limitations, otherwise effective AI capabilities are utilized typically only on very large matters on a reactive basis that limits its benefits to the investigation at hand and the information within the captive data lake.  Thus, ongoing active learning is not generally applied across multiple matters or utilized proactively. And because that captive information consists of migrated copies of the originals, there is a very limited ability to act on data insights as the original data remains in its actual location in the enterprise.

So the ideal architecture for the enterprise would be to move the data analytics “upstream” where all the unstructured data resides, which would not only save up to millions per year in investigation, data audit and eDiscovery costs, but would enable proactive utilization for compliance auditing, security and policy breaches and internal fraud detection.  However, analytics engines require considerable computing resources, with the leading AI solutions typically necessitating tens of thousands of dollars’ worth of high end hardware for a single server instance. So these computing workloads simply cannot be forward deployed to laptops and multiple file servers, where the bulk of unstructured data and associated enterprise risk exists.

But an alternative architecture solves this problem: a process that extracts text from unstructured, distributed data in place and systematically sends that text, at massive scale, to the analytics platform along with the associated metadata and a globally unique identifier for each item. As mentioned, one of the many challenges with traditional workflows is the massive data transfer associated with ongoing migration of electronic files and emails, the latter of which must be sent in whole containers such as PST files. This process alone can take weeks, choke network bandwidth and disrupt operations. However, the payload associated with text and metadata only is less than 1 percent of the full native item, so the possibilities here are very compelling. This architecture enables very scalable and proactive compliance, information security, and information governance use cases. The upload to AI engines would take hours instead of weeks, enabling continual machine learning to improve processes and accuracy over time and enabling immediate action to be taken on identified threats or otherwise relevant information.
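The bandwidth savings described above can be sketched with simple arithmetic. The per-custodian volume below is an illustrative assumption:

```python
# Rough arithmetic comparing shipping full native files versus extracted
# text plus metadata. The per-custodian volume is an illustrative assumption.
native_gb_per_custodian = 15      # assumed average native data per custodian
text_fraction = 0.01              # text/metadata payload: <1% of native size
custodians = 1000

native_total_gb = native_gb_per_custodian * custodians
text_total_gb = native_total_gb * text_fraction
print(native_total_gb, "GB of natives vs", text_total_gb, "GB of text/metadata")
# prints: 15000 GB of natives vs 150.0 GB of text/metadata
```

Under these assumptions, an enterprise-wide feed to the analytics platform shrinks from a multi-week, 15-terabyte migration to a transfer in the low hundreds of gigabytes.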

The only solution that we are aware of that fulfills this vision is X1 Distributed GRC. X1’s unique distributed architecture upends the traditional collection process by indexing at the distributed endpoints, enabling direct pipeline of extracted text to the analytics platform. This innovative technology and workflow results in far faster and more precise collections and a more informed strategy in any matter.

Deployed at each endpoint or centrally in virtualized environments, X1 Enterprise allows practitioners to query many thousands of devices simultaneously, utilize analytics before collecting, and process while collecting directly into myriad review and analytics applications such as RelativityOne and Brainspace. X1 Enterprise empowers corporate eDiscovery, compliance, investigative, cybersecurity and privacy staff with the ability to find, analyze, collect and/or delete virtually any piece of unstructured user data wherever it resides, instantly and iteratively, all in a legally defensible fashion.

X1 displayed these powerful capabilities with ComplianceDS in a recent webinar with a brief but substantive demo of our X1 Distributed GRC solution, emphasizing our innovative support of analytics engines through our game-changing ability to extract text in place with direct feed into AI solutions.

Here is a link to the recording, with a direct link to the five-minute demo portion.


Filed under Best Practices, collection, compliance, Corporations, eDiscovery & Compliance, Enterprise eDiscovery, Enterprise Search, GDPR, Uncategorized