Category Archives: Corporations

CCPA and GDPR UPDATE: Unstructured Enterprise Data in Scope of Compliance Requirements

An earlier version of this article appeared on Legaltech News

By John Patzakis

A core requirement of both the GDPR and the similar California Consumer Privacy Act (CCPA), which becomes enforceable on July 1, is the ability to demonstrate and prove that personal data is being protected. This requires information governance capabilities that allow companies to efficiently identify and remediate personal data of EU and California residents. For instance, the UK Information Commissioner’s Office (ICO) provides that “The GDPR places a high expectation on you to provide information in response to a SAR (Subject Access Request). Whilst it may be challenging, you should make extensive efforts to find and retrieve the requested information.”CCPA GDPR

However, recent Gartner research notes that approximately 80% of information stored by companies is “dark data” that is in the form of unstructured, distributed data that can pose significant legal and operational risks. With much of the global workforce now working remotely, this is of special concern and nearly all the company data maintained and utilized by remote employees is in the form of unstructured data. Unstructured enterprise data generally refers to searchable data such as emails, spreadsheets and documents on laptops, file servers, and social media.

The GDPR

An organization’s GDPR compliance efforts need to address any personal data contained within unstructured electronic data throughout the enterprise, as well as the structured data found in CRM, ERP and various centralized records management systems. Personal data is defined in the GDPR as: “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

Under the GDPR, there is no distinction between structured versus unstructured electronic data in terms of the regulation’s scope. There is a separate guidance regarding “structured” paper records (more on that below). The key consideration is whether a data controller or processor has control over personal data, regardless of where it is located in the organization. Nonetheless, there is some confusion about the scope of the GDPR’s coverage across structured as well as unstructured electronic data systems.

The UK ICO is a key government regulator that interprets and enforces the GDPR, and has recently issued important draft guidance on the scope of GDPR data subject access rights, including as it relates to unstructured electronic information. Notably, the ICO notes that large data sets, including data analytics outputs and unstructured data volumes, “could make it more difficult for you to meet your obligations under the right of access. However, these are not classed as exemptions, and are not excuses for you to disregard those obligations.”

Additionally the ICO guidance advises that “emails stored on your computer are a form of electronic record to which the general principles (under the GDPR) apply.” In fact, the ICO notes that home computers and personal email accounts of employees are subject to GDPR if they contain personal data originating from the employers networks or processing activities. This is especially notable under the new normal of social distancing, where much of a company’s data (and associated personal information) is being stored on remote employee laptops.

The ICO also provides guidance on several related subjects that shed light on its stance regarding unstructured data:

Archived Data: According to the ICO, data stored in electronic archives is generally subject to the GDPR, noting that there is no “technology exemption” from the right of access. Enterprises “should have procedures in place to find and retrieve personal data that has been electronically archived or backed up.” Further, enterprises “should use the same effort to find information to respond to a SAR as you would to find archived or backed-up data for your own purposes.”

Deleted Data: The ICO’s view on deleted data is that it is generally within the scope of GDPR compliance, provided that there is no intent to, or a systematic ability to readily recover that data. The ICO says it “will not seek to take enforcement action against an organisation that has failed to use extreme measures to recreate previously ‘deleted’ personal data held in electronic form. We do not require organisations to use time and effort reconstituting information that they have deleted as part of their general records management.”

However, under this guidance organizations that invest in and deploy re-purposed computer forensic tools that feature automated un-delete capabilities may be held to a higher standard. Deploying such systems can reflect intent to as well as having the systematic technical ability to recover deleted data.

Paper Records: Paper records that are part of a “structured filing system” are subject to the GDPR. Specifically, if an enterprise holds “information about the requester in non-electronic form (e.g. in paper files or on microfiche records)” then such hard-copy records are considered personal data accessible via the right of access,” if such records are “held in a ‘filing system.” This segment of the guidance reflects that references to “unstructured data” in European parlance usually pertains to paper records. The ICO notes in separate guidance that “the manual processing of unstructured personal data, such as unfiled handwritten notes on paper” are outside the scope of GDPR.

GDPR Article 4 defines a “filing system” as meaning “any structured set of personal data which are accessible according to specific criteria, whether centralized, decentralized or dispersed on a functional or geographical basis.” The only form of “unstructured data” that would not be subject to GDPR would be unfiled paper records like handwritten notes or legacy microfiche.

The CCPA  

The California Attorney General (AG) released a second and presumably final round of draft regulations under the California Consumer Privacy Act (CCPA) that reflect how unstructured electronic data will be treated under the Act. The proposed rules outline how the California AG is interpreting and will be enforcing the CCPA. Under § 999.313(d)(2), data from archived or backup systems are—unlike the GDPR—exempt from the CCPA’s scope, unless those archives are restored and become active. Additional guidance from the Attorney General states: “Allowing businesses to delete the consumer’s personal information on archived or backup systems at the time that they are accessed or used balances the interests of consumers with the potentially burdensome costs of deleting information from backup systems that may never be utilized.”

What is very notable is that the only technical exception to the CCPA is unrestored archived and back-up data. Like the GDPR, there is no distinction between unstructured and structured electronic data. In the first round of public comments, an insurance industry lobbying group argued that unstructured data be exempted from the CCPA. As reflected by revised guidance, that suggestion was rejected by the California AG.

For the GDPR, the UK ICO correctly advises that enterprises “should ensure that your information management systems are well-designed and maintained, so you can efficiently locate and extract information requested by the data subjects whose personal data you process and redact third party data where it is deemed necessary.” This is why Forrester Research notes that “Data Discovery and Classification are the foundation for GDPR compliance.”

Establish and Enforce Data Privacy Policies

So to achieve GDPR and CCPA compliance, organizations must first ensure that explicit policies and procedures are in place for handling personal information. Once established, it is important to demonstrate to regulators that such policies and procedures are being followed and operationally enforced. A key first step is to establish a data map of where and how personal data is stored in the enterprise. This exercise is actually required under the GDPR Article 30 documentation provisions.

An operational data audit and discovery capability across unstructured data sources allows enterprises to efficiently map, identify, and remediate personal information in order to respond to regulators and data subject access requests from EU and California citizens. This capability must be able to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of weeks or months as is the case with traditional crawling tools. This includes laptops of employees working from home.

These processes and capabilities are not only required for data privacy compliance but are also needed for broader information governance and security requirements, anti-fraud compliance, and e-discovery.

Implementing these measures proactively, with routine and consistent enforcement using solutions such as X1 Distributed GRC, will go a long way to mitigate risk, respond efficiently to data subject access requests, and improve overall operational effectiveness through such overall information governance improvements.

Leave a comment

Filed under CaCPA, compliance, Corporations, Cyber security, Cybersecurity, Data Audit, GDPR, Information Governance, Information Management, Uncategorized

Remote Collection: The Apple Pay of eDiscovery in a COVID-19 World

By: Craig Carpenter

I often continue doing things just because that’s the way I’ve always done them.  There is a level of comfort that comes from familiarity, and to be honest as I age I realize I can get more set in my ways (as my children often tell me), eschewing new ways of doing things – even if they are quicker or more efficient.  Sometimes it takes a major disruption to force change, as the eDiscovery market saw with accelerated adoption of Predictive Coding in the wake of the Great Recession.  This is true in many industries, including consumer products: witness the accelerated adoption of “contactless payment” like Apple Pay during the COVID-19 pandemic.  It has been available for years, but adopted mainly by younger generations while us old folks clung to credit cards and, in some cases, cash (gasp!).  But COVID-19 has changed this dynamic for many, myself included, as the prospect of touching a credit card machine is now unacceptable.  Whereas using Apple Pay was a ‘nice-to-have’ before COVID-19, it has become a ‘must-have’ now.  This type of resistance to change is arguably even more commonplace in the legal world, where convention and comfort often reign supreme.  How we have been conducting eDiscovery collection for years is a perfect example of clinging to outdated methods – but with the advent of COVID-19, this too is about to change for good.

Collection of digital evidence in legal proceedings was an implicit requirement under the Federal Rules of Civil Procedure (FRCP) long before it was codified explicitly in the 2006 amendments with the addition of Electronically Stored Information (ESI) under amended Rule 34(a)) as a “new” category.  I distinctly remember conducting discovery in 1998 and 1999 as a 3rd year law student and then 1st year associate for a Bay Area law firm: it was the proverbial “banker box” process, with all discovery in paper form.  In those days, even email messages and Word Perfect documents were simply printed out to be Bates stamped and reviewed in hard copy by hand.  Document review has always been tedious, but at least back then the volumes were significantly lower than they are these days.

During this timeframe, however, email and the dissemination of ever-greater volumes of electronic information it facilitated was exploding.  This, of course, meant that evidence (in the forensic context) and relevant information for eDiscovery was increasingly digital in nature.  So when discovery practitioners went looking for tools to help them preserve and collect digital information, where did they turn?  To the forensic world, of course, as the more stringent requirements and processes of criminal proceedings and evidence necessitated the development of such tools earlier than had been needed in civil discovery.  And if a tool was good enough for criminal proceedings, it should be plenty good enough for those in the civil world.  Thus, forensic tools like Guidance’s Encase® and AccessData’s FTK® which were built for law enforcement crossed over into the civil world.

However, the needs of the data collection process for civil discovery were and remain quite different from those of the criminal world:

  • On average civil discovery involves far more “custodians” (owners or stewards of information) than criminal proceedings, e.g. 5-15 custodians in civil matters vs. 1, maybe 2, in criminal
  • Whereas a typical criminal proceeding focuses on the communication media of one or occasionally a few alleged perpetrators (i.e. their cell phone, laptop, social media), civil discovery is typically significantly broader given the greater number of corporation applications and data repositories, including corporate email, file shares, ‘loose files’ (e.g. Word or Excel documents only stored locally), cloud storage repositories like Dropbox or Google Vault
  • Due to the larger number of custodians and typically broader data types to be searched, the volume of information in civil discovery is usually significantly greater than in a criminal proceeding
  • In handling criminal evidence there is a presumption that the alleged perpetrator may have tried to hide, alter or destroy evidence; absent very unusual circumstances, no such presumption exists in civil discovery
  • While confiscation of devices (laptops, desktops, cell phones, records) is the standard in criminal proceedings, the opposite is true in civil discovery. Custodians need their devices so they can do their jobs
  • Collection of evidence in criminal proceedings is handled by law enforcement (e.g. upon arrest or as part of a ‘dawn raid’ type of event), while the parties themselves conduct civil discovery (as a business process typically handled by legal or outsourced to service providers)

These differences were insignificant when data volumes were small and the data was relatively easy to get to, as was the case for many years.  And as the first technology on the market, forensic tools and vendors did a great job of building and defending their incumbency, through certifications, “court-cited workflows” and knowledge bases widely advertising their deep expertise in forensic collection as practiced by a cadre of forensic examiners leveraging their technical abilities into lucrative careers – thereby creating a significant barrier to entry for non-forensic eDiscovery collection tools and practitioners.

In spite of this strong incumbency, almost all corporate legal departments have long wanted a better approach to collection than forensic tools offered; many of their outside counsel have felt similarly.  They have long felt collection using forensic tools and workflows were and remain deeply flawed for eDiscovery in a number of ways:

  • Chronic overcollection: as forensic tools were built to capture all information, including things like slack space which can be important in criminal proceedings but are almost never even in scope in civil matters, the volume of data collected is far greater than needed. While service providers charging hourly professional services time and monthly per-GB hosting fees may not mind, for clients paying to collect/filter/host/review/produce knowingly unnecessary data this makes no sense and adds significant cost to the entire process, each and every time
  • Weeks or months-long process: because forensic tools must process data on a server before searching or culling it, they require physical access to a device (e.g. via a USB port). There is an option to copy entire drives with GBs of data through a VPN connection, but this approach has never worked well, if at all.  Given the coordination needed to gain physical access to devices which may be located in myriad different cities or countries, as well as the need to complete collection before paring down or even searching of data can begin, what should take hours or days instead takes weeks if not months
  • Highly disruptive: as each forensic image is being taken of each laptop or desktop, the user of each such machine must stop whatever they are doing and surrender their machine to the forensic staff for a day or more. Even if there is a spare laptop available, it will often have none of their ‘stuff’ on it.  Needless to say, this highly intrusive process makes each such worker far less productive and is very disruptive
  • “Recreating the wheel” every time: when the next matter arrives, can forensic examiners simply use the data from the last collection? Unfortunately, no, as each custodian has presumably created and received new data, necessitating the whole process from before be repeated.  Forensic collection quite literally recreates the wheel with every collection

By contrast, remote collection is designed specifically for civil eDiscovery.  It is built for a distributed workforce and requires no physical access to any devices.  A small software agent is installed on each device which creates its own local index; legal staff can then simply search this index for whatever ESI they want to find.  This distributed architecture facilitates ‘Pre-Case Assessment’, where search terms are sampled on data in-place, before any ESI is collected.  This turns the forensic collection workflow on its head, as analysis can be done from the very beginning of the preservation/collection process, allowing lawyers to gain insight far earlier in any proceeding and supporting a surgical collection process, leading to far lower data volumes (and therefore much lower eDiscovery costs).  And because remote collection can be an entirely cloud-based process, no hardware or specialized staff is required – in fact, collections can be done without IT ever being involved.

Why hasn’t the industry adopted remote collection before now?  Because everyone involved in the process except the client was benefited from it: forensic experts, service providers and forensic technology providers.  They had a strong incentive to keep things as they had always been, to the client’s detriment.  In a COVID-19 world, however, even these groups must change their workflows as physical access to devices has not only fallen out of favor – it is now impossible and perhaps even dangerous.  What remote employee would want a stranger to come to their home and take their laptop for hours?  That scenario is simply no longer an option.  Similarly to how touching a point-of-sale machine went from a minor inconvenience to a wildly irresponsible and even dangerous activity when Apple Pay is a far better approach, forensic collection in eDiscovery is in the process of giving way to remote collection.  Clients will be much better off for it.

Leave a comment

Filed under Best Practices, collection, Corporations, ESI, Information Access, Preservation & Collection, Uncategorized

How Case Teams Can Streamline Collections with X1 in RelativityOne

Editor’s’ Note: This article originally appeared on The Relativity Blog. It is reprinted here in full with permission. 

by Sam Bock on November 07, 2019

Our September 2019 release for RelativityOne debuted some game-changing functionality in the platform. Collect for RelativityOne enables fast, secure, and defensible collections right within the cloud, allowing RelativityOne users to pull data directly from Microsoft Office 365 without ever leaving the platform or Azure.

One of our developer partners—X1joined up with us on building this functionality, bringing their patented technology into Collect to help simplify traditionally complex workflows.

To get a better picture of just what Collect and X1 Distributed Discovery are capable of now that they’ve teamed up, we sat down with X1 Executive Chairman and Chief Legal Officer John Patzakis. Check out the most impactful takeaways from our conversation, and sign up for X1’s upcoming webinar to learn more.

Sam: What makes collection challenging for today’s legal teams?

John: Traditional e-discovery collection methods consist of either unsupervised custodian self-collection or manual services, driving up costs while increasing risk and disruption to business operations. On the other end of the spectrum, endpoint forensic imaging is burdensome, expensive, and not legally required for civil litigation discovery. Additionally, these manual and disjointed efforts are not technically integrated with Relativity, thus requiring multiple hand-offs, which increases risk, expense, and cumbersome project management efforts.

How does your team think creatively to tackle those challenges in the interest of conducting faster, more defensible collections for your customers?

We tackle collection from the enterprise and also enable significant scalability. X1 Distributed Discovery enables enterprises and their service providers to search, assess, and analyze electronically stored information (ESI) across hundreds or even thousands of custodians, enterprise-wide, where the data resides and before collection, with direct upload into Relativity. Instead of the expensive and disruptive “image then stage then process then load into review workspace” process, X1 Distributed Discovery allows for access to ESI where it sits within hours.

What sorts of variables exist in today’s collection workflows, and how does your team accommodate for those differences?

One of the biggest challenges with modern enterprise ESI collection comes from remote employees who only log into the network intermittently. Most network-enabled collection tools require custodians to be on the domain in order to work. However, X1 is architected to feature SSL security certificates—creating secure tunnels that enable collection from custodians wherever they are, including on WiFi in a Starbucks or on a plane.

Another key challenge is email collection. Traditional workflows often require collecting an entire PST email container or Exchange email account back to a central location for processing, identification, and preservation of potentially responsive email messages. This approach involves the transferring and processing of large files, which takes a lot of time, before even beginning to identify individually responsive email messages. Our solution eliminates the need to transfer entire email containers by allowing the identification and collection of individual messages in place on a custodian’s computer.

How is Collect for RelativityOne built to manage modern collections more effectively?

Collect integrates the X1 Distributed Discovery architecture to leverage patented search technology that indexes Microsoft Office 365 data directly on the laptop, desktop, or file server, allowing e-discovery, investigatory, or forensic professionals to globally query thousands of individual endpoints simultaneously. Individual emails and files can be identified by keyword, dates, and other metadata content without having to first retrieve the entire PST or ZIP across the network.

Collecting enterprise ESI can be one of the most daunting parts of the e-discovery process, and X1’s technical integration with RelativityOne seeks to make it less intimidating. The software helps streamline the e-discovery workflow by eliminating expensive and cumbersome processing steps and dramatically increasing speed to review. Collect for RelativityOne provides legal teams with a solution that compresses project timeframes; reduces risk by integrating collection with the rest of Relativity’s suite of features for review and analysis; and creating a repeatable process that helps reduce overall efforts and costs that might otherwise be spent outside of the platform. Additionally, the tight integration between X1’s technology and Relativity provides a unified chain of custody for optimal defensibility.

In short, we’re excited to see how this functionality, built into Relativity’s collection tool, can help revolutionize the current e-discovery process by collapsing the many hand-offs involved in the EDRM into a few short steps manageable by one or two people.

What tips and best practices would you share with a team conducting complex collections? How can they set themselves up for success from the start?

When collecting data, plan your collection criteria carefully. Focus on granular search criteria including file types, data ranges, and other key metadata in addition to detailed Boolean search terms to help your team strategically reduce collection volumes.

Sam Bock is a member of the marketing team at Relativity, and serves as editor of The Relativity Blog.

Leave a comment

Filed under collection, Corporations, eDiscovery, Enterprise eDiscovery, Uncategorized

Incident Reporting Requirements Under GDPR and CCPA Require Effective Incident Response

By John Patzakis

The European General Data Protection Regulation (GDPR) is now in effect, but many organizations have not fully implemented compliance programs. For many organizations, one of the top challenges is complying with the GDPR’s tight 72-hour data breach notification window. Under GDPR article 33, breach notification is mandatory where a data breach is likely to “result in a risk for the rights and freedoms of individuals.” This must be done within 72 hours of first having become aware of the breach.  Data processors will also be required to notify their customers, the controllers, “without undue delay” after first becoming aware of a data breach.GDPR-stamp

In order to comply, organizations must accelerate their incident response times to quickly detect and identify a breach within their networks, systems, or applications, and must also improve their overall privacy and security processes. Being able to follow the GDPR’s mandate for data breach reporting is equally important as being able to act quickly when the breach hits. Proper incident response planning and practice are essential for any privacy and security team, but the GDPR’s harsh penalties amplify the need to be prepared.

It is important, however, to note that the GDPR does not mandate reporting for every network security breach. It only requires reporting for breaches impacting the “personal data” of EU subjects. And Article 33 specifically notes that reporting is not required where “the personal data breach is unlikely to result in a risk to the rights and freedoms of natural persons.”

The California Consumer Privacy Act contains similar provisions. Notification is only required if a California resident’s data is actually compromised.

So after a network breach is identified, determining whether the personal data of an EU or California citizen was actually compromised is critical not only to comply where a breach actually occurred, but also limit unnecessary or over reporting where an effective response analysis can rule out an actual personal data breach.

These breaches are perpetrated by outside hackers, as well as insiders. An insider is any individual who has authorized access to corporate networks, systems or data.  This may include employees, contractors, or others with permission to access an organizations’ systems. With the increased volume of data and the increased sophistication and determination of attackers looking to exploit unwitting insiders or recruit malicious insiders, businesses are more susceptible to insider threats than ever before.

Much of the evidence of the scope of computer security incidents and whether subject personal data was actually compromised are not found in firewall logs and typically cannot be flagged or blocked by intrusion detection or intrusion prevention systems. Instead, much of that information is found in the emails and locally stored documents of end users spread throughout the enterprise on file servers and laptops. To detect, identify and effectively report on data breaches, organizations need to be able to search across this data in an effective and scalable manner. Additionally, proactive search efforts can identify potential security violations such as misplaced sensitive IP, or personal customer data or even password “cheat sheets” stored in local documents.

To date, organizations have employed limited technical approaches to try and identify unstructured distributed data stored across the enterprise, enduring many struggles. For instance, forensic software agent-based crawling methods are commonly attempted but cause repeated high computer resource utilization for each search initiated and network bandwidth limitations are being pushed to the limits rendering this approach ineffective, and preventing any compliance within tight reporting deadlines. So being able to search and audit across at least several hundred distributed end points in a repeatable and expedient fashion is effectively impossible under this approach.

What has always been needed is gaining immediate visibility into unstructured distributed data across the enterprise, through the ability to search and report across several thousand endpoints and other unstructured data sources, and return results within minutes instead of days or weeks. None of the traditional approaches come close to meeting this requirement. This requirement, however, can be met by the latest innovations in enterprise eDiscovery software.

X1 Distributed GRC  represents a unique approach, by enabling enterprises to quickly and easily search across multiple distributed endpoints from a central location.  Legal, cybersecurity, and compliance teams can easily perform unified complex searches across both unstructured content and metadata, and obtain statistical insight into the data in minutes, instead of days or weeks. With X1 Distributed GRC, organizations can proactively or reactively search for confidential data leakage and also keyword signatures of personal data breach attacks, such as customized spear phishing attacks. X1 is the first product to offer true and massively scalable distributed searching that is executed in its entirety on the end-node computers for data audits across an organization. This game-changing capability vastly reduces costs and quickens response times while greatly mitigating risk and disruption to operations.

Leave a comment

Filed under compliance, Corporations, Cyber security, Cybersecurity, Data Audit, GDPR, Information Governance

USDOJ Expects Companies to Proactively Employ Data Analytics to Detect Fraud

By John Patzakis and Craig Carpenter

In corporate fraud enforcement actions, The US Department of Justice considers the effectiveness of a company’s compliance program as a key factor when deciding whether to bring charges and the severity of any resulting penalties. Recently, prosecutors increased their emphasis on this policy with new evaluation guidelines about what prosecutors expect from companies under investigation.DOJ

The USDOJ manual features a dedicated section on assessing the effectiveness of corporate compliance programs in corporate fraud prosecutions, including FCPA matters. This section is a must read for any corporate compliance professional, as it provides detailed guidance on what the USDOJ looks for in assessing whether a corporation is committed to good-faith self-policing or is merely making hollow pronouncements and going through the motions.

The USDOJ manual advises prosecutors to determine if the corporate compliance program “is adequately designed for maximum effectiveness in preventing and detecting wrongdoing by employees and whether corporate management is enforcing the program or is tacitly encouraging or pressuring employees to engage in misconduct to achieve business objectives,” and that “[p]rosecutors should therefore attempt to determine whether a corporation’s compliance program is merely a ‘paper program’ or whether it was designed, implemented, reviewed, and revised, as appropriate, in an effective manner.”

Recently, Deputy Assistant Attorney General Matthew Miner provided important additional guidance through official public comments establishing that the USDOJ will be assessing whether compliance officers proactively employ data analytics technology in their reviews of companies that are under investigation.

Miner noted that the Justice Department has had success in spotting corporate fraud by relying on data analytics, and said that prosecutors expect compliance officers to do the same: “This use of data analytics has allowed for greater efficiency in identifying investigation targets, which expedites case development, saves resources, makes the overall program of enforcement more targeted and effective.” Miner further noted that he “believes the same data can tell companies where to look for potential misconduct.” Ultimately, the federal government wants “companies to invest in robust and effective compliance programs in advance of misconduct, as well as in a prompt remedial response to any misconduct that is discovered.”

Finally, “if misconduct does occur, our prosecutors are going to inquire about what the company has done to analyze or track its own data resources—both at the time of the misconduct, as well as at the time we are considering a potential resolution,” Miner said. In other words, companies must demonstrate a sincere commitment to identifying and investigating internal fraud with proper resources employing cutting edge technologies, instead of going through the motions with empty “check the box” processes.

With these mandates from government regulators for actual and effective monitoring and enforcement through internal investigations, organizations need effective and operational mechanisms for doing so. In particular, any anti-fraud and internal compliance program must have the ability to search and analyze unstructured electronic data, which is where much of the evidence of fraud and other policy violations can be best detected.

But to utilize data analytics platforms in a proactive instead of a much more limited reactive manner, the process needs to be moved “upstream” where unstructured data resides. This capability is best enabled by a process that extracts text from unstructured, distributed data in place, and systematically sends that data at a massive scale to an analytics platform, with the associated metadata and global unique identifiers for each item.  One of the many challenges with traditional workflows is the massive data transfer associated with ongoing data migration of electronic files and emails, the latter of which must be sent in whole containers such as PST files. This process alone can take weeks, choke network bandwidth and is highly disruptive to operations. However, the load associated with text/metadata only is less than 1 percent of the full native item. So the possibilities here are very compelling. This architecture enables very scalable and proactive solutions to compliance, information security, and information governance use cases. The upload to AI engines would take hours instead of weeks, enabling continual machine learning to improve processes and accuracy over time and enable immediate action to be taken on identified threats or otherwise relevant information.

The only solution that we are aware of that fulfills this vision is X1 Enterprise Distributed GRC. X1’s unique distributed architecture upends the traditional collection process by indexing at the distributed endpoints, enabling a direct pipeline of extracted text to the analytics platform. This innovative technology and workflow results in far faster and more precise collections and a more informed strategy in any matter.

Deployed at each end point or centrally in virtualized environments, X1 Enterprise allows practitioners to query many thousands of devices simultaneously, utilize analytics before collecting and process while collecting directly into myriad different review and analytics applications like RelativityOne and Brainspace. X1 Enterprise empowers corporate eDiscovery, compliance, investigative, cybersecurity and privacy staff with the ability to find, analyze, collect and/or delete virtually any piece of unstructured user data wherever it resides instantly and iteratively, all in a legally defensible fashion.

X1 displayed these powerful capabilities with Compliance DS in a recent webinar with a brief but substantive demo of our X1 Distributed GRC solution, emphasizing our innovative support of analytics engines through our game-changing ability to extract text in place with a direct feed into AI solutions.

Here is a link to the recording with a direct link to the 5 minute demo portion.

In addition to saving time and money, these capabilities are important to demonstrate a sincere organizational commitment to compliance versus maintaining a mere “paper program” – which the USDOJ has just said can provide critical mitigation in the event of an investigation or prosecution.

Leave a comment

Filed under Best Practices, compliance, Corporations, Data Audit, eDiscovery & Compliance, Information Governance