Remote ESI Collection and Data Audits in the Time of Social Distancing

By John Patzakis

The vital global effort to contain the COVID-19 pandemic will likely disrupt our lives and workflows for some time. While our personal and business lives will hopefully return to normal soon, the trend of an increasingly remote and distributed workforce is here to stay. This “new normal” will necessitate relying on the latest technology and updated workflows to comply with legal, privacy, and information governance requirements.

From an eDiscovery perspective, the legacy manual collection workflow involving travel, physical access and one-time mass collection of custodian laptops, file servers and email accounts is a non-starter under current travel ban and social distancing policies, and does not scale for the new era of remote and distributed workforces. In addition to the public health constraints, manual collection efforts are expensive, disruptive and time-consuming because they often rely on full forensic image collection, an “overkill” method that substantially drives up eDiscovery costs.

When it comes to technical approaches, endpoint forensic crawling methods are now a non-starter. Network bandwidth constraints, coupled with the requirement to migrate all endpoint data back to the forensic crawling tool, render the approach ineffective, especially with remote workers needing to VPN into a corporate network. Right now, corporate network bandwidth is at a premium, and the last thing a company needs is its network shut down by inefficient remote forensic tools.

For example, with a forensic crawling tool, to search a custodian’s laptop with 10 gigabytes of email and documents, all 10 gigabytes must be copied and transmitted over the network before they can be searched, all of which takes at least several hours per computer. So, most organizations choose to force-collect all 10 gigabytes. The case of U.S. ex rel. McBride v. Halliburton Co., 272 F.R.D. 235 (2011), illustrates this specific pain point well. In McBride, Magistrate Judge John Facciola’s instructive opinion outlines Halliburton’s eDiscovery struggles to collect and process data from remote locations:

“Since the defendants employ persons overseas, this data collection may have to be shipped to the United States, or sent by network connections with finite capacity, which may require several days just to copy and transmit the data from a single custodian . . . (Halliburton) estimates that each custodian averages 15–20 gigabytes of data, and collection can take two to ten days per custodian. The data must then be processed to be rendered searchable by the review tool being used, a process that can overwhelm the computer’s capacity and require that the data be processed by batch, as opposed to all at once.”

Halliburton represented to the court that they spent hundreds of thousands of dollars on eDiscovery for only a few dozen remotely located custodians. What drove up the costs was the need to force-collect the remote custodians’ entire set of data and then sort it out through the expensive eDiscovery processing phase, instead of culling, filtering and searching the data at the point of collection.
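To put the McBride figures in perspective, here is a rough back-of-the-envelope sketch in Python. It only converts the quoted numbers (15–20 gigabytes per custodian, two to ten days each) into an implied average transfer rate. The function names and the 10% responsive fraction are hypothetical assumptions for illustration, not figures from the opinion or from any product.

```python
def implied_throughput_mbps(gigabytes: float, days: float) -> float:
    """Average rate in megabits per second needed to move `gigabytes` in `days`."""
    if days <= 0:
        raise ValueError("days must be positive")
    megabits = gigabytes * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / (days * 24 * 60 * 60)


def transfer_hours(gigabytes: float, mbps: float) -> float:
    """Hours needed to move `gigabytes` at a sustained rate of `mbps`."""
    if mbps <= 0:
        raise ValueError("mbps must be positive")
    return gigabytes * 8 * 1000 / mbps / 3600


# Figures quoted in McBride: 15-20 GB per custodian, two to ten days each.
fastest = implied_throughput_mbps(20, 2)   # most data in the least time
slowest = implied_throughput_mbps(15, 10)  # least data in the most time

# Hypothetical assumption: only 10% of a 20 GB custodian set is responsive.
RESPONSIVE_FRACTION = 0.10
targeted_hours = transfer_hours(20 * RESPONSIVE_FRACTION, fastest)

print(f"Implied rate: {slowest:.2f} to {fastest:.2f} Mbit/s")
print(f"Full 20 GB at the fastest rate: {transfer_hours(20, fastest):.1f} h")
print(f"Responsive subset only: {targeted_hours:.1f} h")
```

Under these assumptions the implied rate is below 1 Mbit/s even in the best case, and transfer time shrinks in direct proportion to the share of data that actually has to leave the endpoint.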

Solving this collection challenge is X1 Distributed Discovery, which is specially designed to address the challenges presented by remote and distributed workforces. X1 Distributed Discovery (X1DD) enables enterprises to quickly and easily search across up to thousands of distributed endpoints and data servers from a central location. Legal and compliance teams can easily perform unified complex searches across both unstructured content and metadata, obtaining statistical insight into the data in minutes, and full results with completed collection in hours, instead of days or weeks. The key to X1’s scalability is its unique ability to index and search data in place, thereby enabling a highly detailed and iterative search and analysis, and then only collecting data responsive to those steps.

X1DD operates on-demand where your data currently resides — on desktops, laptops, servers, or even the cloud — without disruption to business operations and without requiring extensive or complex hardware configurations. After indexing of systems has completed (typically a few hours to a day depending on data volumes), clients and their outside counsel or service provider may then:

  • Conduct Boolean and keyword searches of relevant custodial data sources for ESI, returning search results within minutes by custodian, file type and location.
  • Preview any document in-place, before collection, including any or all documents with search hits.
  • Remotely collect and export responsive ESI from each system directly into a Relativity® or RelativityOne® workspace for processing, analysis and review or any other processing or review platform via standard load file. Export text and metadata only or full native files.
  • Export responsive ESI directly into other analytics engines, e.g. Brainspace®, H5® or any other platform that accepts a standard load file.
  • Conduct iterative “search/analyze/export-into-Relativity” processes as frequently and as many times as desired.

To learn more about this capability purpose-built for remote eDiscovery collection and data audits, please contact us.

Filed under Best Practices, Case Law, Case Study, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, Information Governance, Preservation & Collection, Relativity

X1 Announces Strategic Product Integration with Relativity

Today we are announcing some exciting news. Our X1 enterprise eDiscovery solution now integrates with Relativity, the industry leading e-discovery platform. X1 Insight & Collection, a component of the X1 Distributed Discovery platform, allows enterprises to search across and collect from up to thousands of custodians in hours, now with direct upload into Relativity, including RelativityOne, utilizing Relativity’s import APIs.

The X1 and Relativity integration addresses several pain points in the existing e-discovery process. For one, there is currently an inability to quickly search across all unstructured data, meaning users have to spend the weeks or even months that are required by other cumbersome solutions. Additionally, using ESI processing methods that involve appliances that are not integrated with the collection significantly increases cost and time delays. And with such an inefficient process there is simply no way for attorneys and legal professionals to gain immediate visibility into data, often leaving them to wait weeks before they have a chance to assess the data, post-collection.

The X1/Relativity integration directly addresses these challenges. Among the substantial benefits of this integration is the dramatic increase in speed to review, flowing directly from the custodian into Relativity on-premise or into the cloud-based RelativityOne platform. And this integration significantly reduces or completely eliminates inefficient ESI processing. X1 will search, cull and de-duplicate data at the point of collection and now integrates with the Relativity ingestion API, rendering inefficient and expensive processing appliances obsolete.

Organizations will be given real time early case assessment within minutes of initial search instead of taking days and weeks for this insight.  All of this is achieved with a truly repeatable end-to-end process for enterprises. The combination of X1 and Relativity provides a full and complete e-discovery platform.

“Collecting enterprise ESI can be one of the most daunting parts of the e-discovery process,” said Drew Deitch, senior manager for strategic partnerships at Relativity. “We’re excited to bring X1 into the App Hub, where it will offer users another great way to access, search, process, and import enterprise data into Relativity.”

Finally, by providing a complete platform for efficient data search, discovery and review across the enterprise, this integration also enables organizations to address numerous information governance use cases effectively, such as GDPR compliance, identifying and removing PII, and conducting IP data audits.

To see X1 in action, we have a 7-minute demonstration video including this integration with Relativity available here.

Filed under Best Practices, ECA, eDiscovery, eDiscovery & Compliance, Information Governance, Preservation & Collection, Uncategorized

Key to Improving Predictive Coding Results: Effective ECA

Predictive Coding, when correctly employed, can significantly reduce legal review costs with generally more accurate results than other traditional legal review processes. However, the benefits associated with predictive coding are often undercut by the over-collection and over-inclusion of Electronically Stored Information (ESI) into the predictive coding process. This is problematic for two reasons.

The first reason is obvious: the more data introduced into the process, the higher the cost and burden. Some practitioners believe it is necessary to over-collect and subsequently over-include ESI to allow the predictive coding process to sort everything out. Many service providers charge by volume, so there can be economic incentives that conflict with what is best for the end-client. In some cases, the significant cost savings realized through predictive coding are erased by eDiscovery costs associated with overly aggressive ESI inclusion on the front end.

The second reason why ESI over-inclusion is detrimental is less obvious, and in fact counterintuitive to many. Some discovery practitioners believe as much data as possible needs to be put through the predictive coding process in order to “better train” the machine learning algorithms. However, the opposite is true. The predictive coding process is much more effective when the initial set of data has a higher richness (also referred to as “prevalence”) ratio. In other words, the higher the rate of responsive data in the initial data set, the better. It has always been understood that document culling is very important to successful, economical document review, and that includes predictive coding.
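A small worked example may make the richness point concrete. The Python sketch below uses entirely hypothetical document counts (they are not from the study discussed in this post) to show how culling before predictive coding raises the richness ratio, and how that changes the number of responsive documents a reviewer can expect to see in a random training sample.

```python
def richness(responsive: int, total: int) -> float:
    """Richness (prevalence): the share of documents that are responsive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return responsive / total


def expected_responsive(sample_size: int, rate: float) -> float:
    """Expected number of responsive documents in a random sample."""
    return sample_size * rate


# Hypothetical over-inclusive collection: 1,000,000 documents, 10,000 responsive.
before = richness(10_000, 1_000_000)

# Hypothetical culled set: 100,000 documents that still hold 9,000 of them.
after = richness(9_000, 100_000)

# Share of the responsive documents that survived culling.
retained = 9_000 / 10_000

SAMPLE = 500  # hypothetical size of a random training sample
print(f"Richness before culling: {before:.1%}")
print(f"Richness after culling:  {after:.1%}")
print(f"Responsive documents retained: {retained:.0%}")
print(f"Expected responsive in {SAMPLE} samples: "
      f"{expected_responsive(SAMPLE, before):.0f} before, "
      f"{expected_responsive(SAMPLE, after):.0f} after")
```

In this illustration the culled set is one tenth the size yet nine times richer, so the same review effort on a random sample surfaces far more responsive examples to learn from. The trade-off is the small share of responsive documents lost in culling, which is why the culling criteria themselves need to be defensible.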

Robert Keeling, a senior partner at Sidley Austin and the co-chair of the firm’s eDiscovery Task Force, is a widely recognized legal expert in the areas of predictive coding and technology assisted review.  At Legal Tech New York earlier this year, he presented at an Emerging Technology Session: “Predictive Coding: Deconstructing the Secret Sauce,” where he and his colleagues reported on a comprehensive study of various technical parameters that affect the outcome of a predictive coding effort.  According to Robert, the study revealed many important findings, one of them being that a data set with a relatively high richness ratio prior to being ingested into the predictive coding process was an important success factor.

To be sure, the volume of ESI is growing exponentially and will only continue to do so. The costs associated with collecting, processing, reviewing, and producing documents in litigation are the source of considerable pain for litigants. The only way to reduce that pain to its minimum is to use all tools available in all appropriate circumstances within the bounds of reasonableness and proportionality to control the volumes of data that enter the discovery pipeline, including predictive coding.

Ideally, an effective early case assessment (ECA) capability can enable counsel to set reasonable discovery limits and ultimately process, host, review and produce less ESI. Counsel can further use ECA to gather key information, develop a litigation budget, and better manage litigation deadlines. ECA also can foster cooperation and proportionality in discovery by informing the parties early in the process about where relevant ESI is located and what ESI is significant to the case. And with such benefits also comes a much improved predictive coding process.

X1 Distributed Discovery (X1DD) uniquely fulfills this requirement with its ability to perform pre-collection early case assessment, instead of ECA after the costly, time consuming and disruptive collection phase, thereby providing a game-changing new approach to the traditional eDiscovery model.  X1DD enables enterprises to quickly and easily search across thousands of distributed endpoints from a central location.  This allows organizations to easily perform unified complex searches across content, metadata, or both and obtain full results in minutes, enabling true pre-collection ECA with live keyword analysis and distributed processing and collection in parallel at the custodian level. To be sure, this dramatically shortens the identification/collection process by weeks if not months, curtails processing and review costs from not over-collecting data, and provides confidence to the legal team with a highly transparent, consistent and systemized process. And now we know of another key benefit of an effective ECA process: much more accurate predictive coding.

Filed under ECA, eDiscovery