Category Archives: GDPR

De-NISTing in eDiscovery: A Costly Provision That Shouldn’t Be in Model Orders in the First Place

By John Patzakis

A model eDiscovery order I recently came across, issued by a respected judge in a federal district court, included a provision requiring parties to de-NIST their files in the course of eDiscovery production. On its face, this may seem like a reasonable technical requirement. But the provision reflects a fundamental misunderstanding of how proportional, targeted eDiscovery collection should work, and it points to a broader problem in our industry that deserves attention.

For those unfamiliar with the term, de-NISTing refers to the process of filtering out known, irrelevant system files from a forensic collection using the National Institute of Standards and Technology’s reference database of known file signatures. The NIST database catalogs hundreds of thousands of known operating system files, executables, DLL files, and other system-generated data that have no evidentiary value whatsoever. De-NISTing removes these files from a collection so that reviewers are not burdened with wading through mountains of irrelevant system data. You need to de-NIST in the first place only because you collected a full-disk image, capturing everything on the drive, relevant or not.
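As a rough illustration of the filtering step just described, the sketch below hashes each collected file and sets aside any file whose hash appears in a set of known-file hashes. The function names (`sha1_of_file`, `de_nist`), the use of SHA-1, and the in-memory `known_hashes` set are assumptions made for this example; in practice the hashes would be loaded from the NIST reference data and the step would typically run inside a processing tool.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the uppercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def de_nist(files, known_hashes):
    """Split files into (kept, filtered) by membership in known_hashes.

    known_hashes is a hypothetical set of uppercase hex digests standing in
    for the known-file reference list.
    """
    kept, filtered = [], []
    for path in files:
        if sha1_of_file(Path(path)) in known_hashes:
            filtered.append(path)  # known system file: excluded from review
        else:
            kept.append(path)      # unknown file: stays in the review set
    return kept, filtered
```

Note that every file on the image still has to be read and hashed before it can be excluded, which is the processing overhead at issue in this post.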

And that is precisely the problem with requiring de-NISTing in a model eDiscovery order. As I have written extensively, including in our recent white paper on proportionality in eDiscovery, courts have consistently held that full-disk imaging is not the appropriate default for civil litigation collections. Going all the way back to Diepenhorst v. City of Battle Creek in 2006, courts have warned that imaging a hard drive results in the production of massive amounts of irrelevant, and potentially privileged, information. More recently, in Motorola Solutions v. Hytera Communications Corp., the court emphasized that forensic examination of a party’s computers “is no routine matter” and that courts must use caution to avoid unduly impinging on privacy interests. A model order that presupposes full-disk imaging by requiring de-NISTing is, at minimum, inconsistent with this well-established body of case law.

The 2015 amendments to Federal Rule of Civil Procedure 26(b)(1) established a clear six-pronged proportionality framework for eDiscovery, requiring parties and courts to weigh factors including the importance of the issues at stake, the amount in controversy, the parties’ resources, and whether the burden or expense of proposed discovery outweighs its likely benefits. Courts have taken these amendments seriously and have consistently limited overbroad discovery requests on proportionality grounds. A blanket model order requirement to de-NIST implicitly endorses a collect-everything methodology that runs counter to the proportionality principles embedded in Rule 26(b)(1) and the extensive case law that has developed around it.

So how does a provision like this end up in a model court order? The answer, I believe, lies in the undue influence that certain eDiscovery service providers have had on collection practices and, ultimately, on the drafting of court orders and guidelines. Some service providers have a clear financial incentive to collect as much data as possible, since their fees are calculated on a per-gigabyte basis — meaning the more data collected, processed, and hosted, the higher the bill. This volume-based business model has shaped industry “best practices” in ways that favor over-collection, and that mindset has quietly seeped into the thinking of some federal judges and the model orders they issue. What gets dressed up as technical diligence is, in many cases, simply an artifact of a business model that profits from excess.

If you are conducting a properly scoped, targeted eDiscovery collection that is consistent with the principles of proportionality — as the Federal Rules and overwhelming case law require — there is simply no reason to de-NIST. A targeted collection does not reach system files, executables, DLLs, or other non-user-generated data in the first place. You are collecting potentially relevant ESI from identified custodians, scoped by search terms, date ranges, file types, and data sources. You never touch the data that de-NISTing is designed to filter out, which means the entire de-NISTing step — and its associated cost and processing time — is unnecessary overhead born entirely of an overbroad collection methodology.
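To make the scoping criteria above concrete, here is a minimal sketch of a filter that accepts an item only when it matches an identified custodian, an allowed file type, the date range, and at least one search term. The `Item` and `Scope` types and the `in_scope` function are hypothetical names invented for this illustration and do not describe how any particular product implements targeted collection.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath


@dataclass(frozen=True)
class Item:
    custodian: str
    filename: str
    modified: date
    text: str


@dataclass(frozen=True)
class Scope:
    custodians: frozenset
    extensions: frozenset  # lowercase, without the leading dot
    start: date
    end: date
    terms: tuple


def in_scope(item: Item, scope: Scope) -> bool:
    """Return True only if the item satisfies every scoping criterion."""
    extension = PurePath(item.filename).suffix.lower().lstrip(".")
    body = item.text.lower()
    return (
        item.custodian in scope.custodians
        and extension in scope.extensions
        and scope.start <= item.modified <= scope.end
        and any(term.lower() in body for term in scope.terms)
    )
```

Under a scope limited to document types such as `docx` and `pdf`, a file like `kernel32.dll` fails the file-type test and is never collected, so there is nothing left for a de-NIST pass to remove.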

This is precisely the approach built into X1 Enterprise, which enables legal and IT teams to conduct targeted, remote collections across large numbers of custodians without ever capturing the system-level data that necessitates de-NISTing. X1 Enterprise collects only the user-generated, potentially relevant ESI within defined parameters, preserving full metadata integrity and maintaining a documented chain of custody — satisfying every requirement for forensic soundness without the bloat, expense, and proportionality concerns of full-disk imaging. In an era where courts are increasingly scrutinizing eDiscovery costs and demanding proportionality, practitioners and judges alike should be asking not how to manage the mess created by over-collection, but how to avoid creating that mess in the first place.


Filed under Best Practices, Case Law, Cloud Data, Cybersecurity, Data Audit, Data Governance, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, GDPR, Information Governance, Information Management

Navigating Legal and Compliance Risks When Corporations Expose Sensitive Data to AI

By Kelly Twigger and John Patzakis

Implementing AI within a corporate environment is no longer a matter of “if” but “how.” We recently addressed these challenges in our webinar, “Navigating Legal and Compliance Risks in AI,” where our panel of experts discussed the strategic transition required to build a robust risk mitigation framework. While the efficiency gains of AI—such as automating workflows and surfacing deep insights—are compelling, introducing sensitive enterprise data into these models without a tactical plan can lead to unintended consequences. These risks range from the dilution of trade secrets to complex eDiscovery obligations and substantial regulatory exposure under the GDPR.

To leverage AI safely, counsel should focus on the following grounded strategies for risk management.

Protect Trade Secrets
Under federal law, trade secret status is contingent upon the owner taking “reasonable measures” to maintain secrecy. This is a rigorous standard; if proprietary information—such as source code or high-value technical data—is fed into an unsecured AI model without strict access controls, a company risks losing its legal protections entirely.

  • Review the Judicial Standard: In Snyder v. Beam Technologies, Inc., the 10th Circuit affirmed that failing to use confidentiality protections or allowing information to reside on unsecured devices can defeat trade secret status.
  • Maintain Active Safeguards: Courts emphasize that consistent and active safeguards are required to maintain secrecy. Lax internal controls during AI interactions can be cited as evidence that “reasonable measures” were not maintained.
  • Implement No-Prompt Zones: Establish “No-Prompt Zones” for your organization’s most sensitive intellectual property. By isolating core IP from third-party cloud models, you maintain a defensible record of “reasonable measures” that can withstand scrutiny in litigation.

Manage the eDiscovery Paper Trail
AI interactions—both the prompts submitted by employees and the responses generated by the tools—are considered discoverable Electronically Stored Information (ESI). These records are part of the corporate record and are subject to subpoena and legal holds.

  • Understand the Technical Reality: Microsoft has confirmed that Microsoft 365 Copilot interactions are logged through the Purview unified audit log, making them searchable, preservable, and producible via eDiscovery tools.
  • Assess Scope of Exposure: Because these chats are treated no differently than emails, they may inadvertently expose privileged or damaging material if not managed properly.
  • Map Information Logs: Update your legal hold workflows to specifically include AI conversation logs and audit trails. Mapping where these logs live before litigation arises ensures a more controlled and cost-effective discovery process.

Navigate GDPR and Data Privacy
Processing customer or employee data through AI models requires strict adherence to the GDPR principles of data minimization, purpose limitation, and lawfulness. Feeding sensitive data into AI models without a clearly articulated lawful basis—such as consent or legitimate interest—can result in significant administrative fines.

  • Meet Compliance Requirements: European authorities require organizations to demonstrate compliance by documenting purposes, limiting data inputs, and ensuring appropriate safeguards are in place.
  • Identify Special Categories: The GDPR is particularly restrictive regarding health information or data revealing racial or ethnic origin, requiring specific exemptions for processing.
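  • Conduct Privacy Impact Assessments: Perform a privacy impact assessment, which the GDPR calls a Data Protection Impact Assessment (DPIA), for any AI tool that touches personal data; Article 35 makes a DPIA mandatory where the processing is likely to result in a high risk to individuals. Documenting the purpose and necessity of the processing is critical for maintaining regulatory standing during an audit.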

Leverage In-Place AI Functionality
A critical strategy for reducing risk is shifting where the AI processing occurs. Rather than routing data through external, third-party cloud-hosted AI services, organizations should consider prioritizing workflows where AI is applied in-place within the corporate network or controlled enterprise environment.

  • Secure the Data Perimeter: By keeping data and AI processing behind the organization’s own security firewall, you materially reduce the risk of trade secret leakage and data exfiltration.
  • Minimize Third-Party Footprint: Applying AI in-place narrows the scope of discoverable third-party records, as the interactions remain within your internal infrastructure rather than residing on a vendor’s servers.
  • Establish Full Governance Control: This model provides counsel with direct control over privacy, retention, and audit obligations—essentially giving you the “kill switch” for data that you simply do not have with external cloud vendors.

Tactical Governance and Ethical Oversight
Counsel must navigate the professional and technical nuances of AI deployment to ensure long-term stability.

  • Ensure Professional Competence: The ethical duty of technological competence requires attorneys to understand the limitations of the tools they use. AI should be treated as a “junior associate”—capable of great speed but requiring diligent human verification of all output.
  • Apply Risk-Based Tiering: Not all AI use cases carry the same weight. We recommend a tiered approach:
    o Tier 1 (Administrative): Low-risk tasks involving non-sensitive data.
    o Tier 2 (Internal/Marketing): Standard communications requiring routine oversight.
    o Tier 3 (High-Value/Restricted): High-stakes processing involving PII, health data, or proprietary IP, requiring senior legal sign-off and strict data handling protocols.
  • Execute Proactive Vendor Vetting: Move from consumer-grade tools to enterprise solutions that offer SOC 2 Type 2 attestations. Ensure contracts explicitly prohibit the vendor from using your data to train their global models.
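The tiered approach above could be encoded as a simple policy check, for example in an intake form or a usage gateway. The sketch below is one possible reading of the three tiers; the `UseCase` fields, the `classify` function, and the rule that any communication defaults to Tier 2 are assumptions made for illustration, not a prescribed implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    ADMINISTRATIVE = 1  # low-risk tasks involving non-sensitive data
    INTERNAL = 2        # standard communications, routine oversight
    RESTRICTED = 3      # PII, health data, or proprietary IP


@dataclass(frozen=True)
class UseCase:
    name: str
    contains_pii: bool = False
    contains_health_data: bool = False
    contains_proprietary_ip: bool = False
    is_communication: bool = False


def classify(use_case: UseCase) -> Tier:
    """Assign the highest tier triggered by the use case's data."""
    if (
        use_case.contains_pii
        or use_case.contains_health_data
        or use_case.contains_proprietary_ip
    ):
        return Tier.RESTRICTED
    if use_case.is_communication:
        return Tier.INTERNAL
    return Tier.ADMINISTRATIVE


def requires_senior_legal_signoff(use_case: UseCase) -> bool:
    """Tier 3 use cases require senior legal sign-off before proceeding."""
    return classify(use_case) is Tier.RESTRICTED
```

Checking the sensitive-data flags first means a marketing email that happens to include health data is still treated as Tier 3 rather than Tier 2.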

In light of these risks, corporate counsel should take a proactive, structured approach to AI governance. This includes implementing data classification and usage controls to prevent sensitive trade secrets from being exposed to AI systems without safeguards; establishing clear policies governing AI prompts, outputs, retention, and eDiscovery treatment; and conducting privacy impact assessments to ensure personal data processing complies with GDPR and similar regulations. In addition, counsel should carefully evaluate AI deployment models and consider workflows in which AI models are deployed in-place within the corporate network or controlled enterprise environment, rather than routed through third-party cloud-hosted AI services. Keeping data and AI processing inside the organization’s security perimeter can materially reduce trade secret leakage risk, narrow the scope of discoverable third-party records, and provide greater control over privacy, retention, and audit obligations—while still allowing the enterprise to realize the benefits of advanced AI capabilities.

For a deeper dive into these strategies and more case studies, you can watch the full session here.


Filed under Best Practices, compliance, Corporations, Cybersecurity, Data Governance, ECA, eDiscovery & Compliance, Enterprise AI, Enterprise eDiscovery, ESI, GDPR, Information Governance, Records Management

Why Most eDiscovery Tools and Online Archiving Offerings Are Terrible for Information Governance

By John Patzakis and Chas Meier

Many organizations assume that information governance initiatives—such as data privacy audits, purging ROT (Redundant, Obsolete, or Trivial) data, merger and acquisition-driven data separation, or data breach impact assessments—can be effectively addressed using eDiscovery tools or online archiving platforms. After all, eDiscovery solutions excel at identifying and searching through large volumes of unstructured data in high-stakes, reactive legal scenarios.

However, there is a critical distinction between eDiscovery and information governance workflows that organizations must understand when selecting the right solution. eDiscovery typically involves copying large volumes of data at multiple stages and continually moving that data upstream, eventually into third-party cloud platforms for processing and hosting. In contrast, duplicating and moving massive data sets is often the last thing you want to do in information governance projects, which are typically large-scale, enterprise-wide initiatives.

In fact, here are five major reasons why most eDiscovery tools and online archiving solutions are terrible for information governance. These tools:

  1. Dramatically Increase Risk
    Consider a scenario where an organization suffers a data breach and must assess 100 terabytes of data to identify compromised PII and determine reporting obligations. Most eDiscovery tools require a full copy of this data to be made and uploaded into a third-party environment, doubling the volume of sensitive material and compounding the risk. Instead of helping, this kind of mass data duplication exacerbates the compliance and privacy risks that governance initiatives aim to reduce. In fact, such inefficient data duplication directly conflicts with GDPR principles, which require data minimization and proportionality.
  2. Are Exorbitantly Expensive
    Information governance is not a small, tactical effort—it is a broad, enterprise-wide initiative. At X1, we rarely see governance projects involving less than 50 terabytes of data. Using traditional eDiscovery pricing models, even with volume-based discounts, these projects can quickly rack up tens of millions of dollars in costs due to unnecessary processing, storage, and hosting workflows designed for litigation—not governance.
  3. Can’t Meet Time Constraints
    Copying, transferring, uploading, and indexing 100 terabytes of data into a third-party cloud platform can easily take six months or more, even in an ideal scenario. That timeline is incompatible with the urgent nature of most information governance use cases, such as data breach impact assessments or M&A-related audits. Worse yet, by the time the data has been copied and indexed, it will likely already be stale—undermining the integrity of the project from the outset.
  4. Create Remediation Roadblocks
    Suppose you incur the costs and risk to copy and upload a full data set into an external review platform and successfully identify sensitive or outdated data for remediation. Now what? You are merely working with copies of the data. The originals remain distributed across Microsoft 365, file servers, laptops, and other locations. Trying to trace back and manually remediate live data sources is costly, disruptive, and error-prone, defeating the very efficiency goals of the governance project.
  5. Do Not Support Microsoft 365 Effectively
    Many so-called “governance” tools are simply rebranded email archiving systems that rely on bulk copying data out of Microsoft 365. Not only is this approach expensive and inefficient, but it also creates serious technical and compliance risks. Microsoft 365 does not support mass data exports at scale without significant friction, and errors are common—as illustrated in FTC v. Match Group, No. 3:19-CV-2281-K, 2025 WL 46024 (N.D. Tex. Jan. 7, 2025). In that case, Microsoft Purview exports into an archival system failed, resulting in court-imposed discovery sanctions. If a solution does not support index-in-place capabilities—allowing analysis directly upon the native data—it is simply not viable for modern information governance needs.

A Different Approach is Required
Information governance requires agility, precision, and a fundamentally different approach than traditional eDiscovery processes. Organizations must be wary of legacy eDiscovery tools and outdated archiving platforms masquerading as governance solutions.

X1 Enterprise was purpose-built to address the challenges and inefficiencies that plague traditional eDiscovery tools and archiving platforms when applied to information governance. At the core of the X1 Enterprise Platform is its patented micro-indexing architecture, which enables organizations to search, analyze, and act on data in place, without needing to first copy, move, or centralize it.

This index-in-place capability means X1 can connect directly to endpoints, file shares, Microsoft 365, and other enterprise data sources to perform fast, scalable, and highly targeted data sweeps and analysis—without duplicating the data or exposing it to unnecessary risk. Whether you are performing a data privacy audit, a breach impact assessment, or an M&A data separation project, you can run real-time searches across tens of terabytes and thousands of custodians—with results returned in minutes, not months, and the data remediation performed in-place.

By eliminating the need for data movement, X1 avoids the five major pitfalls of legacy tools:
  • Risk: No mass duplication of data, reducing exposure and aligning with GDPR and other regulatory requirements.
  • Cost: No massive ingestion or hosting fees; X1 dramatically lowers total project costs by working directly with live data.
  • Time: Deploy and execute governance initiatives in a fraction of the time required by traditional methods.
  • Remediation: Act directly on live data (flag it, move it, delete it, or apply tags) in the original source locations.
  • Microsoft 365 Compatibility: X1 integrates natively with Microsoft 365 and other systems without requiring cumbersome exports or expensive additional licensing and services, enabling robust, reliable governance at enterprise scale. Simply put, we believe X1 provides the best available support for M365 data sources.

In short, X1 Enterprise offers a faster, safer, and far more cost-effective way to execute complex information governance projects—turning what used to be massive, reactive, months-long efforts into streamlined, proactive, and strategic workflows.

Learn more about how X1 Enterprise can streamline your next information governance project. Schedule a demo today at sales@x1.com or visit www.x1.com/solutions/x1-enterprise-platform.


Filed under Best Practices, CaCPA, Cloud Data, Corporations, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, GDPR, Information Governance, law firm, m365, Preservation & Collection, Records Management

X1 Achieves Record Growth as Numerous Fortune 500 Companies Standardize on X1 Enterprise

By Larry Gill

X1 Discovery is having a record-breaking year, with dozens of Fortune 500 companies and leading law firms adopting the X1 Enterprise Platform to transform how they approach eDiscovery collection, early case assessment, and information governance. In an era when overcollection and skyrocketing legal costs strain corporate budgets, these organizations are choosing X1 to gain immediate insight into their data, dramatically reduce costs, and ensure defensible, repeatable processes—all while maintaining complete control over their information. This surge in adoption reflects X1’s position as the industry’s trusted solution for modern, efficient, and targeted enterprise eDiscovery.

The X1 Enterprise Platform is an industry-leading eDiscovery and information governance solution that empowers organizations to search, identify, analyze, and act on their data in-place, wherever it resides. X1 uniquely addresses Microsoft 365—including robust Teams support—laptops, file servers, and other cloud and on-premises sources, giving legal and compliance teams unparalleled reach and control. Dozens of major enterprises and AM Law 100 firms have now standardized on X1, recognizing it as the most effective solution for managing M365 content—often outperforming even Purview Premium—while also covering on-premises data sources seamlessly. By enabling a highly targeted, efficient, index-in-place approach, X1 provides immediate, pre-collection visibility, streamlining search, analysis, remediation, and collection workflows like never before.

Here are the top three reasons why leading organizations are adopting X1 Enterprise in record numbers:

  1. Significant Return on Investment
    Corporate legal departments that implement X1 consistently realize up to 90% in “hard” cost savings. X1’s powerful in-place search and pre-collection filtering enable teams to collect only what is needed, achieve true proportionality, and eliminate massive outsourced processing and project management fees. Many organizations are even scaling back or eliminating costly Purview Premium licenses altogether, all while mitigating risk with a defensible and repeatable collection process.
  2. Unmatched Speed and Scalability
    X1 delivers speed and scalability that no other solution can match. It can search across thousands of laptops and multiple terabytes of M365 or file share data within minutes, quickly pinpointing responsive data for precise collection or remediation. All indexed data stays securely behind the corporate firewall or in a private cloud. Unlike legacy tools that overpromise and underdeliver, X1 is proven to work and scale as advertised, backed by real-world case studies and customer success stories.
  3. Multiple Use Cases Beyond eDiscovery
    Beyond eDiscovery, corporate legal and compliance teams leverage X1 to locate and remediate sensitive personal information (PII), defensibly purge redundant or non-compliant data, support due diligence and data separation during M&A transactions, and handle GDPR Data Subject Access Requests (DSARs) and other data privacy obligations—making X1 a true multipurpose platform for enterprise information governance.

In today’s data-driven world, X1 Enterprise is more than a solution—it’s a strategic advantage. For organizations serious about controlling eDiscovery costs, reducing risk, and gaining immediate insight into their data, X1 is the clear choice.

Interested in learning more about how to dramatically reduce your costs and compliance risks? Schedule a briefing today at sales@x1.com or visit www.x1.com/solutions/x1-enterprise-platform.


Filed under Authentication, Cloud Data, Corporations, Cybersecurity, Data Audit, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, GDPR, Information Access, Information Governance, m365, MS Teams, OneDrive, Preservation & Collection, SharePoint

Courts Favor Targeted eDiscovery Collections, but It Is Up to In-House Teams to Enable Such Cost-Saving Proportional Efforts

By John Patzakis

In-House Legal Teams Enable Cost Savings

Corporate legal departments face ever-increasing costs and risk related to eDiscovery, driven largely by excessive and indiscriminate data collection. Many organizations default to an overbroad “collect everything” approach out of an abundance of caution or due to inefficient workflows imposed by third-party service providers or even outside counsel. Overcollection results in far higher costs upstream, critical delays, and increased risk. For this reason, courts consistently endorse proportional and targeted discovery practices that balance the needs of litigation with cost-effectiveness and reasonableness. But to best realize the benefits of proportionality, organizations should establish an in-house eDiscovery capability supported by best-practices technology.

Courts Support Proportional and Targeted ESI Collection
The Federal Rules of Civil Procedure (FRCP) emphasize proportionality and reasonableness in discovery. Specifically, Rule 26(b)(1) limits discovery to information that is relevant to any party’s claim or defense and proportional to the needs of the case.

Courts have routinely upheld this principle, encouraging parties to avoid overbroad collections:

  1. The Sedona Conference Principles
    While not binding, courts frequently rely on The Sedona Principles, which advocate for “reasonable and good faith efforts” to identify relevant ESI. (See The Sedona Principles, Third Edition, 19 Sedona Conf. J. 1 (2018)). Courts cite these principles to support reasonable limits on preservation and collection.
  2. In re Bard IVC Filters Prods. Liab. Litig., 317 F.R.D. 562 (D. Ariz. 2016)
    Here, the court recognized the proportionality limits of Rule 26(b)(1) and ruled that the defendant’s proposed targeted discovery approach—using custodians, date ranges, and agreed-upon search terms—satisfied its obligations.
  3. Oxbow Carbon & Minerals LLC v. Union Pacific Railroad Co., 322 F.R.D. 1 (D.D.C. 2017)
    The court rejected broad discovery requests that lacked proportionality, holding that the producing party could limit its search for ESI to agreed-upon custodians and relevant date ranges. The court emphasized that broad, burdensome demands are contrary to Rule 26(b)(1).
  4. Hernandez v. City of Houston, No. 4:16-CV-3577, 2020 WL 2542625 (S.D. Tex. May 19, 2020)
    Here, the court denied a motion to compel additional production of ESI beyond agreed search terms, explaining that the requested expansion was disproportionate given the marginal relevance and substantial burden of additional collection.

These and other decisions (further analysis available here) demonstrate that targeted, proportional collection efforts are not only defensible but expected by the courts. Overcollection is not mandated by the courts and, in fact, can increase risk by preserving irrelevant or privileged information unnecessarily.

So, the problem is not the law. The challenge is that many eDiscovery service providers favor full-disk imaging or other forms of massive over-collection for two reasons: 1) because they are not integrated into a company’s IT data architecture with an established and repeatable process, they revert to a reactive, one-off effort to collect everything that could possibly be relevant; and 2) they are financially incentivized to collect as much data as possible.

Advantages of In-House eDiscovery Capabilities for Targeted Collections
To align with the principles of proportionality, legal departments should move away from the outsourced collection model that favors bulk extraction. Instead, maintaining an in-house eDiscovery capability provides the following key advantages:

  1. Integrated, Precise Search and Collection
    Solutions like X1 Enterprise are designed to index data in place, allowing corporate legal and IT teams to search, cull, and collect only what is relevant—without moving massive volumes of unnecessary data. This reduces costs and minimizes data exposure.
  2. Iterative, Defensible Process
    With in-house capabilities, legal teams can collaborate directly with IT to conduct collections iteratively. They can refine search criteria and custodians in real-time, in response to case developments or meet-and-confer negotiations, ensuring defensibility and responsiveness.
  3. Faster Response Times and Lower Costs
    Deeply integrated technology removes reliance on expensive, reactive third-party vendors who often require full data exports up front. By indexing data where it resides, in-house teams can respond quickly to litigation holds and discovery deadlines.
  4. Enhanced Compliance and Risk Management
    By avoiding massive data dumps, corporations reduce the risk of producing irrelevant, privileged, or sensitive data unnecessarily. Proportionality helps mitigate privacy risks and comply with data minimization principles under privacy laws like the GDPR and CCPA.
  5. Control and Repeatability Across Multiple Use Cases
    In-house solutions preserve institutional knowledge and workflows. Future cases can reuse workflows and search parameters, creating repeatable, consistent, and auditable processes. Further, the same process can be readily leveraged for various information governance and other compliance use cases.

Conclusion
Courts expect discovery to be proportional, targeted, and reasonable—not excessive or indiscriminate. Establishing an in-house eDiscovery capability with proven integrated technology like X1 Enterprise allows your organization to operationalize this legal standard. By doing so, you will reduce costs, minimize risks, and demonstrate good faith compliance with discovery obligations.


Filed under Best Practices, CaCPA, Cloud Data, Corporations, ECA, eDiscovery, eDiscovery & Compliance, Enterprise eDiscovery, ESI, GDPR, m365, Preservation & Collection, proportionality