Tag Archives: eDiscovery

The Global De-Centralized Enterprise: An Un-Met eDiscovery Challenge

Enterprises with data situated within a multitude of segmented networks across North America and the rest of the world face unique challenges for eDiscovery and compliance-related investigation requirements. In particular, the wide area networks of large project engineering, oil & gas, and systems integration firms typically contain terabytes of geographically disparate information assets in often harsh operating environments with very limited network bandwidth. Information management and eDiscovery tools that require data centralization or run on expensive and inflexible hardware appliances cannot, by their very nature, address critical project information in places like Saudi Arabia, China, or the Alaskan North Slope.

Despite vendor marketing hype, network bandwidth constraints coupled with the requirement to migrate data to a single repository render traditional information management and eDiscovery tools ineffective at addressing de-centralized global enterprise data. As such, the global de-centralized enterprise represents a major gap for in-house eDiscovery processes, resulting in significant expense and inefficiencies. The case of U.S. ex rel. McBride v. Halliburton Co. [1] illustrates this pain point well. In McBride, Magistrate Judge John Facciola’s instructive opinion outlines Halliburton’s eDiscovery struggles to collect and process data from remote locations:

Since the defendants employ persons overseas, this data collection may have to be shipped to the United States, or sent by network connections with finite capacity, which may require several days just to copy and transmit the data from a single custodian . . . (Halliburton) estimates that each custodian averages 15–20 gigabytes of data, and collection can take two to ten days per custodian. The data must then be processed to be rendered searchable by the review tool being used, a process that can overwhelm the computer’s capacity and require that the data be processed by batch, as opposed to all at once. [2]

Halliburton represented to the court that they spent hundreds of thousands of dollars on eDiscovery for only a few dozen remotely located custodians. The need to force-collect the remote custodians’ entire set of data and then sort it out through the expensive eDiscovery processing phase instead of culling, filtering and searching the data at the point of collection drove up the costs.
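For a rough sense of scale, the figures quoted in McBride can be turned into an implied end-to-end rate. The sketch below is back-of-the-envelope arithmetic only. It assumes decimal gigabytes and treats the whole two-to-ten-day window as transfer time, which overstates time on the wire because that window also covers copying and, in some cases, physical shipping. The function name is ours, not anything from the opinion.

```python
SECONDS_PER_DAY = 86_400

def implied_mbps(gigabytes: float, days: float) -> float:
    """Average megabits per second needed to move `gigabytes` in `days`."""
    megabits = gigabytes * 1000 * 8  # decimal GB -> megabits
    return megabits / (days * SECONDS_PER_DAY)

# Bounds taken from the opinion: 15-20 GB per custodian, 2-10 days each.
slowest = implied_mbps(15, 10)
fastest = implied_mbps(20, 2)
print(f"{slowest:.2f} to {fastest:.2f} Mbps per custodian")
```

Under those assumptions the implied rate stays below 1 Mbps per custodian, which is why moving whole custodian data sets off site is so slow compared with filtering them where they sit.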

Despite the burdens associated with the electronic discovery of distributed data across the four corners of the earth, such data is considered accessible under the Federal Rules of Civil Procedure and thus must be preserved and collected if relevant to a legal matter. However, the good news is that the preservation and collection efforts can and should be targeted to only potentially relevant information limited to only custodians and sources with a demonstrated potential connection to the litigation matter in question.

This is important as the biggest expense associated with eDiscovery is the cost of overly inclusive preservation and collection. Properly targeted preservation initiatives are permitted by the courts and can be enabled by adroit software that is able to quickly and effectively access and search these data sources throughout the enterprise. The value of targeted preservation is recognized in the Committee Notes to the FRCP amendments, which urge the parties to reach agreement on the preservation of data and the key words, date ranges and other metadata to identify responsive materials. [3] And in In re Genetically Modified Rice Litigation, the court noted that “[p]reservation efforts can become unduly burdensome and unreasonably costly unless those efforts are targeted to those documents reasonably likely to be relevant or lead to the discovery of relevant evidence.” [4]

However, such targeted collection and early case assessment (ECA) in place are not feasible in the de-centralized global enterprise with current eDiscovery and information management tools. What is needed to address these challenges is a field-deployable search and eDiscovery solution that operates on-demand in distributed and virtualized environments at the global locations where the data resides. In order to meet such a challenge, the eDiscovery and search solution must immediately and rapidly install, execute and efficiently operate in a localized virtualized environment, including public or private cloud deployments, where the site data is located, without rigid hardware requirements or on-site physical access.

This is impossible if the solution is fused to hardware appliances or otherwise requires a complex on-site installation process. After installation, the solution must be able to index the documents and other data locally and serve up those documents for remote but secure access, search and review through a web browser. As the “heavy lifting” (indexing, search, and document filtering) is all performed locally, this solution can effectively operate in some of the harshest local environments with limited network bandwidth. The data is not only collected and culled within the local area network, but is also served up for full early case assessment and first pass review on site, so that only a much smaller data set of potentially relevant data is ultimately transmitted to a central location.
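The idea of culling at the point of collection can be sketched in a few lines. This is a minimal illustration, not how X1 Rapid Discovery or any other product works internally: it uses plain substring matching instead of a real index, and the `Doc` type, the `cull` function, and the sample documents are all hypothetical. The keyword and date-range criteria mirror the kind of agreed filters discussed above.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Doc:
    path: str
    modified: date
    size_bytes: int
    text: str

def cull(docs, keywords, start, end):
    """Keep documents inside the date range that mention any keyword."""
    terms = [k.lower() for k in keywords]
    kept = []
    for d in docs:
        if not (start <= d.modified <= end):
            continue  # outside the agreed date range
        body = d.text.lower()
        if any(t in body for t in terms):
            kept.append(d)
    return kept

# Hypothetical site data standing in for a remote file share.
site_docs = [
    Doc("a.txt", date(2011, 3, 1), 4_000_000, "Pipeline inspection report"),
    Doc("b.txt", date(2011, 6, 9), 9_000_000, "Lunch menu for Friday"),
    Doc("c.txt", date(2009, 1, 5), 2_000_000, "Old pipeline drawings"),
    Doc("d.txt", date(2011, 8, 2), 1_000_000, "Invoice for valve repair"),
]

responsive = cull(site_docs, ["pipeline", "valve"],
                  date(2011, 1, 1), date(2011, 12, 31))
total = sum(d.size_bytes for d in site_docs)
sent = sum(d.size_bytes for d in responsive)
print(f"transmit {sent} of {total} bytes ({sent / total:.0%})")
```

Only the documents that survive the local filter need to cross the constrained link; everything else stays on site.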

This groundbreaking capability is what X1 Rapid Discovery provides. Its unique ability to deploy and operate in the IaaS cloud also means that the solution can install anywhere within the wide-area network, remotely and on-demand. This enables globally de-centralized enterprises to finally address their overseas data in an efficient, expedient, defensible and highly cost-effective manner.

If you have any thoughts or experiences with the unique eDiscovery challenges of the de-centralized global enterprise, feel free to email me. I welcome the collaboration.

___________________________________________

[1] 272 F.R.D. 235 (2011)

[2] Id. at 240.

[3] Citing the Manual for Complex Litigation (MCL) (4th) §40.25(2).

[4] 2007 WL 1655757 (E.D. Mo. June 5, 2007)


Filed under eDiscovery & Compliance, Enterprise eDiscovery

Case Study: The Importance of Integrated Social Media and Website Crawling Collection

One of the benefits of the very strong market adoption of our X1 Social Discovery software is that we receive a significant amount of invaluable and excellent customer feedback from very seasoned eDiscovery and law enforcement professionals. Many of these experts report that a good number of their social media investigation and collection cases also require general website collection. For instance, a person on Facebook promoting infringing technology may also be posting relevant information to industry web bulletin boards or maintaining their own website. It is thus important that a social media eDiscovery and investigation process feature integrated web collection and social media support.

For an effective process, website data should be collected, searched and reviewed alongside social media collections in the same interface. The collected website data should not be a mere image capture or PDF, but a full HTML (native file) collection, to ensure preservation of all metadata and other source information as well as to enable instant and full search and effective evidentiary authentication. All of the evidence should be searched in one pass, reviewed, tagged and, if needed, exported to an attorney review platform from a single workflow.

To illustrate what this looks like in the field, we recorded an 8-minute demonstration based in part upon a real-life example reported to us by one of our customers. This case study, performed by our CTO Brent Botta, involves the collection of social media data as well as message board posts on the web. Importantly, this evidence is consolidated into a unified workflow to be searched in one single pass.

The investigation features X1 Social Discovery as the platform, which now includes automated and integrated web crawling capabilities in addition to its renowned functionality for the collection and analysis of Facebook and Twitter content. We believe this is the only solution of its kind to collect website evidence either through a one-off capture or through full crawling, including on a scheduled basis, and have that information instantly reviewable in native file format through a federated search that includes multiple pieces of social media and website evidence in a single case. Up to millions of web captures and social media items are searched instantly with the patented X1 search, tagged and exported from a single interface.

Like social media content, web pages bring their own unique but important challenges for evidentiary authentication. In the next week, we will be posting on best practices for the collection and authentication of web pages as evidence, so stay tuned!


Filed under Best Practices, Preservation & Collection

Judge Peck: Cloud For Enterprises Not Cost-Effective Without Efficient eDiscovery Process

Hon. Andrew J. Peck
United States Magistrate Judge

Federal Magistrate Judge Andrew Peck of the Southern District of New York is known for several important decisions affecting the eDiscovery field, including the ongoing Monique Da Silva Moore v. Publicis Groupe SA, et al. case, where he issued a landmark order authorizing the use of predictive coding, otherwise known as technology-assisted review. His Da Silva Moore ruling is clearly an important development, but also very noteworthy are Judge Peck’s recent public comments on eDiscovery in the cloud.

eDiscovery attorney Patrick Burke, a friend and former colleague at Guidance Software, reports on his blog some interesting comments made at the May 22 judges’ panel session at the 2012 CEIC conference. UK eDiscovery expert Chris Dale also blogged about the session, where Judge Peck noted that data stored in the cloud is considered accessible data under the Federal Rules of Civil Procedure (see FRCP Rule 26(b)(2)(B)) and is thus treated no differently by the courts in terms of eDiscovery preservation and production requirements than data stored within a traditional network. This brought the following cautionary tale about the costs associated with not having a systematic process for eDiscovery:

Judge Peck told the story of a Chief Information Security Officer who had authority over e-discovery within his multi-billion dollar company who, when told that the company could enjoy significant savings by moving to “the cloud”, questioned whether the cloud provider could accommodate their needs to adapt cloud storage with the organization’s e-discovery preservation requirements. The cloud provider said it could but at such an increased cost that the company would enjoy no savings at all if it migrated to the cloud.

In previous posts on this blog, we outlined how the significant cost benefits associated with cloud migration can be negated when eDiscovery search and retrieval of that data is required. If an organization maintains two terabytes of documents in the Amazon or other IaaS cloud deployments, how does it quickly access, search, triage and collect that data in its existing cloud environment if a critical eDiscovery or compliance search requirement suddenly arises? This is precisely the reason why we developed X1 Rapid Discovery, version 4. X1RD is a proven and now truly cloud-deployable eDiscovery and enterprise search solution enabling our customers to quickly identify, search, and collect distributed data wherever it resides in the Infrastructure as a Service (IaaS) cloud or within the enterprise. While it is now trendy for eDiscovery software providers to re-brand their software as cloud solutions, X1RD is now uniquely deployable anywhere, anytime in the IaaS cloud within minutes. X1RD also features the ability to leverage the parallel processing power of the cloud to scale up and scale down as needed. In fact, X1RD is the first pure eDiscovery solution (not including a hosted email archive tool) to meet the technical requirements and be accepted into the Amazon AWS ISV program.

As for the major cloud providers, the ones who choose to solve this eDiscovery challenge (along with effective enterprise search) with best-practices technology will not only drive significant managed services revenue but will also enjoy a substantial competitive advantage over other cloud services providers.


Filed under Best Practices, Case Law, Cloud Data, Enterprise eDiscovery, IaaS, Preservation & Collection

Social Media Case Law Update: Volume of Cases Accelerating

Recently our survey of published case law from 2010 and 2011 identified 689 cases involving social media evidence for that time period.  While these results exceeded our expectations, that pace is actually rapidly accelerating in 2012. For this past April alone, a quick tally identifies 61 cases where social media evidence played a key role. We will have a mid-year report in a few months, but it appears that the volume of cases has about doubled year over year. Keep in mind that the survey group only involves published cases on Westlaw. With less than one percent of total cases resulting in published opinions, and considering this data set does not take into account internal or compliance investigations or non-filed criminal cases, we can safely assume that there were tens of thousands more legal matters involving social media evidence that were adjudicated or otherwise resolved in April 2012.
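The "about doubled" estimate follows from simple arithmetic on the two figures above. The sketch below assumes the 689 cases were spread evenly over the 24 months of 2010 and 2011, and it compares that average against a single month, so it is a rough indicator, not a statistical claim.

```python
cases_2010_2011 = 689   # published cases over the 24 months of 2010-2011
cases_april_2012 = 61   # published cases tallied for April 2012

baseline_per_month = cases_2010_2011 / 24
ratio = cases_april_2012 / baseline_per_month
print(f"baseline {baseline_per_month:.1f}/month, April 2012 is {ratio:.2f}x")
```

On those assumptions April 2012 runs at a little more than twice the 2010-2011 monthly average.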

The following are brief synopses of three of the more notable social media cases from April:

Bland v. Roberts, 2012 WL 1428198 (E.D. Va. Apr. 24, 2012)

This case is notable in that it extensively litigated the implications of “liking” specific items on Facebook. In this situation the Hampton, Virginia Sheriff’s Office employed Bland and his co-workers, under Sheriff B.J. Roberts. Roberts faced a contested election, and Bland and his cohorts backed the challenger Jim Adams, going so far as to “like” Adams’s Facebook page. As it turned out, the plaintiffs “liked” the wrong horse. Roberts won the election, and he subsequently fired Bland and the other Adams-backers. The Sheriff justified the terminations on cost-cutting grounds, but plaintiffs argued that their termination violated their First Amendment rights, as Roberts was aware that the plaintiffs had “liked” Adams’s Facebook page, which plaintiffs asserted to be protected speech. The court ultimately determined that “merely ‘liking’ a Facebook page is insufficient speech to merit constitutional protection” and thus that the termination was lawful.

From our perspective, the ultimate outcome of Bland v. Roberts is less the point than the fact that plaintiffs’ subtle activity on Facebook constituted substantive facts of the case. The act of liking a Facebook entry can be an important piece of evidence in a wide variety of litigation and investigation scenarios. Just to identify a few possible examples, it can constitute evidence toward a party’s knowledge of a particular fact, or the extent of trademark infringement or publication of defamatory material, or identify relevant witnesses in a case. This case illustrates why it is important to collect and preserve all available information on Facebook and other social media sites in a thorough manner with best-practices technology specifically designed for litigation purposes.

People v. Harris, 2012 WL 1381238 (N.Y. Crim. Ct. Apr. 20, 2012)

In this case, the defendant faced charges of disorderly conduct after marching onto the Brooklyn Bridge as a participant in the Occupy Wall Street protests. The New York District Attorney’s Office subpoenaed Twitter, Inc., seeking user information and Tweets from a particular time period for the Twitter account @destructuremal, the account allegedly used by the defendant. The defendant filed a motion to quash the subpoena.

In denying the defendant’s motion, the court relied heavily on the public nature of Twitter and its terms of service, which establish that users have no expectation of privacy and no proprietary interest in their Tweets. The court noted that the terms of service state that by submitting a post or displaying content, a user has granted Twitter “a worldwide, non-exclusive, royalty-free license to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).” Thus, the court reasoned, “defendant’s inability to preclude Twitter’s use of his Tweets demonstrates a lack of proprietary interest” in them. In assessing the defendant’s privacy rights, the court again relied on Twitter’s Terms of Service, which clearly inform users that their information will be viewable by others and which specifically state that “[w]hat you say on Twitter may be viewed all around the world instantly … [t]his license is you authorizing us to make your Tweets available to the rest of the world and to let others do the same.”

Loporcaro v. City of New York and Perfetto Contracting Company, 35 Misc.3d 1209(A) (N.Y. Sup. Ct. Apr. 9, 2012)

This is yet another serious personal injury claim where the claimant’s public Facebook postings contradicted their assertions of serious injury. Plaintiff claimed permanent disability from two knee injuries while on the job as a firefighter, seeking redress against Perfetto Contracting Company, Inc., alleging defective road conditions caused his injury. However, his public Facebook postings suggested that he continued to maintain an active lifestyle. This prompted the court to grant the defense’s motion to compel production of the Plaintiff’s full Facebook account, ruling as follows:

“When a person creates a Facebook account, he or she may be found to have consented to the possibility that personal information might be shared with others, notwithstanding his or her privacy settings, as there is no guarantee that the pictures and information posted thereon, whether personal or not, will not be further broadcast and made available to other members of the public. Clearly, our present discovery statutes do not allow that the contents of such accounts should be treated differently from the rules applied to any other discovery material, and it is impossible to determine at this juncture whether any such disclosures may prove relevant to rebut plaintiffs’ claims regarding, e.g., the permanent effects of the subject injury. Since it appears that plaintiff has voluntarily posted at least some information about himself on Facebook which may contradict the claims made by him in the present action, he cannot claim that these postings are now somehow privileged or immune from discovery.”

Earlier this year we covered the case of Tompkins v. Detroit Metropolitan Airport, which also highlighted the importance of a systematic search of public Facebook content as standard procedure for nearly every type of criminal and civil litigation investigation.

We will have an update in about four weeks for the social case law published in May, so stay tuned.


Filed under Best Practices, Case Law

Defining Truly Cloud-Capable eDiscovery Software

Last week we discussed the challenges of searching and collecting data in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud or Rackspace) for eDiscovery purposes. Today we discuss what is needed for eDiscovery and enterprise search vendors to provide a truly cloud-capable solution and provide a decoder ring of sorts to cut through the hype. For there is a lot of hype with the cloud becoming the latest eDiscovery hot button, with vendor marketing claims far surpassing actual capabilities.

In fact, many eDiscovery and enterprise software vendors claim to support the cloud, but are simply re-branding their long-existing SaaS offerings, which really have nothing to do with supporting IaaS. Barry Murphy of the eDiscovery Journal aptly identified this marketing practice as “cloud washing.” Data hosting, especially where the vendor’s manual labor is routinely required to upload and process data, does not meet defined cloud standards. Neither does a process that primarily exports data through APIs or other means out of its resident cloud environment to slowly migrate the cloud data to the vendor tools, instead of deploying the tools (and their processing power) to the data where it resides in the cloud. In order to truly support IaaS cloud deployments, eDiscovery and enterprise search software must meet the following three core requirements:

1. Automated installation and virtualization: The eDiscovery and search solution must immediately and rapidly install, execute and efficiently operate in a virtualized environment without rigid hardware requirements or on-site physical access. This is impossible if the solution is fused to hardware appliances or otherwise requires a complex on-site installation process. Because hardware appliance solutions are by definition not cloud-deployable, and because enterprise search installations often require many months of man-hours to install and configure, it is a significant question whether many of these vendors will be able to support robust IaaS cloud deployments in the reasonably foreseeable future.

2. On-demand self-service: In its definition of cloud computing, the National Institute of Standards and Technology (NIST) identifies on-demand self-service as an essential characteristic of the cloud where a “consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.”

Many hosted eDiscovery services require shipping of data to the provider or extensive behind the scenes manual labor to load and configure the systems for data ingestion. Conversely, solutions that truly support cloud IaaS will spin up, ingest data and fully operate in an automated fashion without the need for manual on-premise labor for configuration or data import.

3. Rapid elasticity: NIST describes this characteristic as capabilities that “scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.” This important benefit of cloud computing is accomplished by a parallelized software architecture designed to dynamically scale out over potentially several dozen virtualized servers to enable rapid ingestion, processing and analysis of data sets in that cloud environment. This capability would allow several terabytes of data to be indexed and processed within 2 to 4 hours on a highly automated basis at far less cost than non-cloud eDiscovery efforts.

However, many characteristics of leading eDiscovery solutions fundamentally prevent them from supporting this core cloud requirement. Most eDiscovery early case assessment solutions are developed and configured toward a monolithic processing schema designed to operate on a single expensive hardware apparatus. While recently spawning some bold marketing claims of high speeds and feeds, such architecture is very ill-suited to the cloud, which is powered by highly distributed processing across multitudes of servers. Additionally, many of the leading eDiscovery and enterprise search solutions are tightly integrated with third-party databases and other OEM technology that cannot be easily decoupled (and also present possible licensing constraints), making such elasticity physically and even legally impossible.
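The contrast between a monolithic and a parallelized architecture can be sketched as sharded indexing with a merge step. This is a toy illustration under stated assumptions, not any vendor's design: threads in one process stand in for separate virtualized servers (so it shows the structure, not real speedup), the tokenizer is a bare whitespace split, and all function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def index_shard(shard):
    """Build an inverted index (term -> set of doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id, text in shard:
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def merge(indexes):
    """Combine per-shard indexes into one."""
    merged = defaultdict(set)
    for index in indexes:
        for term, ids in index.items():
            merged[term] |= ids
    return merged

def build_index(docs, workers):
    """Split docs into `workers` shards and index them concurrently."""
    shards = [docs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge(pool.map(index_shard, shards))

docs = [
    (1, "valve repair invoice"),
    (2, "pipeline inspection report"),
    (3, "pipeline valve schedule"),
]
# Scaling out or in is just a change to `workers`; the result is the same.
assert build_index(docs, 1) == build_index(docs, 3)
print(sorted(build_index(docs, 3)["pipeline"]))
```

Because each shard is indexed independently and only the merge is shared, the worker count can grow or shrink with demand, which is the property the elasticity requirement describes.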

So is there eDiscovery software that will truly support the IaaS cloud based upon these requirements, and address up to terabytes of data? Stay tuned….


Filed under Cloud Data, Enterprise eDiscovery, IaaS