Tag Archives: IaaS

Cloud Search Is Important, But Only A Piece Of The Enterprise Search Puzzle

by Barry Murphy

In an earlier post, I described the importance of being able to quickly search information stored in the Cloud.  The post pointed out that Cloud search is more complicated than it appears at first glance, because search speed depends on how close the index lives to the actual data in the Cloud infrastructure.  One comment I received was that Cloud search can be fast and simple if the Cloud vendor promises a certain service level for query times and results.  That can address part of the issue (although IaaS providers, which are what we are truly talking about when we say “Cloud,” are typically not interested in guaranteeing SLAs for things like search, because they let customers provision their own infrastructure to enable fast search with products like X1 Rapid Discovery).  Even if a Cloud vendor were to guarantee phenomenal search SLAs, the issue of unified enterprise search across all information would remain.

The reality is that enterprises and government agencies store information in “hybrid” environments that encompass on-premise systems within corporate data centers, virtualized systems that companies operate, and Cloud-based repositories.  Research firm Gartner predicts that by 2017, half of mainstream enterprises will have a hybrid cloud.  Research from NetApp likewise shows that organizations will be managing data across multiple cloud environments rather than with a single provider.

These are exciting developments.  As organizations embrace more modern infrastructures, there are many benefits to be had.  What we need to remember, however, is that business professionals still need to quickly find and take action on their information assets to do their jobs.  As that information gets further scattered, enterprise search will take on increased importance.  Workers don’t care if their data is stored on-premise or in the Cloud as long as they can quickly find it in an easy-to-use interface.

The challenge for today’s organizations is that information now lives in multiple infrastructures – on-premise, virtual, Cloud, or most frequently, a hybrid of all of these.  Current approaches to including Cloud-based data in enterprise search and eDiscovery require downloading a copy of the data to search so that it resides alongside other local content.  Unfortunately, that defeats the purpose of storing the data in the Cloud in the first place.

This takes me back to my original point:  Cloud search is very important.  But Cloud search cannot simply exist in a vacuum.  An effective enterprise search solution will pair on-premise search capabilities with search that runs in the Cloud, so that all data can be searched without first downloading the Cloud-based information.
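To make the idea of searching across locations without moving the data more concrete, here is a minimal sketch in Python. It is not how X1 Rapid Discovery or any other product is implemented; the `LocalIndex` and `federated_search` names, the toy scoring, and the sample documents are all assumptions made up for illustration. Each location keeps its own index next to its data, and only small hit records travel back to be merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    location: str   # where the document lives (hypothetical label)
    doc_id: str
    score: float    # fraction of query terms found in the document


class LocalIndex:
    """Toy inverted index that stays next to the data it describes."""

    def __init__(self, location, documents):
        self.location = location
        self._postings = {}
        for doc_id, text in documents.items():
            for term in set(text.lower().split()):
                self._postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        terms = query.lower().split()
        counts = {}
        for term in terms:
            for doc_id in self._postings.get(term, ()):
                counts[doc_id] = counts.get(doc_id, 0) + 1
        return [Hit(self.location, d, n / len(terms)) for d, n in counts.items()]


def federated_search(indexes, query):
    """Send the query to every index; only hit records come back, never documents."""
    hits = [hit for index in indexes for hit in index.search(query)]
    return sorted(hits, key=lambda h: (-h.score, h.location, h.doc_id))


# Hypothetical sample content in two separate environments.
on_prem = LocalIndex("on-premise", {"a1": "quarterly merger memo", "a2": "lunch menu"})
cloud = LocalIndex("cloud", {"c1": "merger agreement draft", "c2": "merger memo final"})

results = federated_search([on_prem, cloud], "merger memo")
for hit in results:
    print(hit.location, hit.doc_id, hit.score)
```

The design point is that the query goes to the data rather than the data coming to the query: the worker sees one ranked list, while the documents themselves stay where they are stored.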

Filed under Cloud Data, Enterprise Search

“Act Reasonably” — Two Court-Issued Checklists Outlining Defensible, Targeted ESI Collection

Recently two separate and prominent courts, the federal court for the Northern District of California and the Delaware Court of Chancery (the primary court of equity for Delaware-registered corporations), issued eDiscovery preservation guidelines. This is not unprecedented, as other courts have issued similar written guidance in the form of general guidance or even more enforceable local rules of court specifically addressing eDiscovery protocols. What I found particularly interesting, however, is that both courts provided fairly specific guidance on the scope of collection and preservation. In the case of the California court, which notes that its “guidelines are designed to establish best practices for evidence preservation in the digital age,” the Court offers a checklist for Rule 26(f) “meet and confer” conferences with good detail on suggested ESI preservation protocols. The Delaware Court of Chancery also issued a detailed checklist or “sample collection outline.” ESI preservation checklists are useful practice guides, and these are sanctioned by two separate influential courts.

This is important as the largest expense directly associated with eDiscovery is the cost of overly inclusive preservation and collection, which leads to increased volume charges and attorney review costs. To the surprise of many, properly targeted preservation initiatives are permitted by the courts and can be enabled by adroit software that is able to quickly and effectively access and search these data sources throughout the enterprise.

The value of targeted preservation is recognized in the Committee Notes to the FRCP amendments, which urge the parties to reach agreement on the preservation of data and the keywords used to identify responsive materials. (Citing the Manual for Complex Litigation (MCL) (4th) §40.25 (2)).  And in In re Genetically Modified Rice Litigation, 2007 WL 1655757 (June 5, 2007 E.D.Mo.), the court noted that “[p]reservation efforts can become unduly burdensome and unreasonably costly unless those efforts are targeted to those documents reasonably likely to be relevant or lead to the discovery of relevant evidence.”

The checklist from the California Northern District and the guidelines issued by the Delaware court are consistent with these principles, as they call for the specification of date ranges, custodian names and search terms for any ESI to be preserved. The Northern District checklist, for instance, provides for the identification of specific custodians and job titles of custodians whose ESI is to be preserved, and also of specific search phrases and search terms “that will be used to identify discoverable ESI and filter out ESI that is not subject to discovery.”
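As a rough illustration of how the three criteria the checklists call for (date ranges, custodian names and search terms) combine into a single preservation filter, consider the following Python sketch. The `Document` and `PreservationScope` types, the custodian names and the sample records are hypothetical, and real matters would use far richer search syntax than simple substring matching.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    doc_id: str
    custodian: str
    created: date
    text: str


@dataclass(frozen=True)
class PreservationScope:
    custodians: frozenset   # lower-cased custodian names agreed by the parties
    start: date             # inclusive date range
    end: date
    search_terms: tuple     # any one term is enough to bring a document in scope

    def matches(self, doc):
        in_custodians = doc.custodian.lower() in self.custodians
        in_range = self.start <= doc.created <= self.end
        body = doc.text.lower()
        has_term = any(term.lower() in body for term in self.search_terms)
        return in_custodians and in_range and has_term


# Hypothetical scope and corpus, for illustration only.
scope = PreservationScope(
    custodians=frozenset({"jane doe", "john smith"}),
    start=date(2012, 1, 1),
    end=date(2012, 6, 30),
    search_terms=("merger", "acquisition"),
)

corpus = [
    Document("d1", "Jane Doe", date(2012, 3, 15), "Draft merger agreement attached"),
    Document("d2", "Jane Doe", date(2012, 3, 20), "Lunch on Friday?"),
    Document("d3", "Sam Roe", date(2012, 3, 18), "Merger timeline"),
    Document("d4", "Jane Doe", date(2011, 11, 2), "Early merger talks"),
    Document("d5", "John Smith", date(2012, 5, 1), "Notes on the ACQUISITION"),
]

preserved = [d.doc_id for d in corpus if scope.matches(d)]
print(preserved)
```

Because the scope is written down as explicit criteria, the same filter can be documented, repeated and defended later, which is the practical point of a targeted process.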

However, many lawyers shy away from a targeted collection strategy over misplaced defensibility concerns, opting instead for full disk imaging and other broad collection efforts that exponentially escalate litigation costs. The fear of some is that there may always be that one document that could be missed. However, in my experience of following eDiscovery case law over the past decade, the situations where litigants face exposure on the preservation front typically involve an absence of a defensible process. When courts sanction parties, it is usually because there is not a reasonable legal hold procedure in place, where the process is ad hoc and made up on the fly and/or not effectively executed. I am personally unaware of a published decision involving a fact pattern where a company featured a reasonable collection and preservation process involving targeted collection executed pursuant to standard operating procedures, yet was sanctioned because one or two relevant documents slipped through the cracks.

This is because the duty to preserve requires reasonable efforts, not infallible means, to collect potentially relevant information. As succinctly stated by the Delaware court: “Parties are not required to preserve every shred of information. Act reasonably.”

Another barrier standing in the way of defensible and targeted collection is that searching and performing early case assessment at the point of collection is not feasible in the decentralized global enterprise with traditional eDiscovery and information management tools. What is needed to address these challenges is a field-deployable search and eDiscovery solution that operates on demand in distributed and virtualized environments, in the global locations where the data resides. To meet such a challenge, the eDiscovery and search solution must rapidly install, execute and efficiently operate locally, including in a virtual environment, where the site data is located, without rigid hardware requirements or on-site physical access.

This groundbreaking capability is what X1 Rapid Discovery provides. Its unique ability to deploy and operate in the IaaS cloud also means that the solution can install anywhere within the wide-area network, remotely and on-demand. Importantly, the search index is created virtually, in close proximity to the data subject to collection. This enables even globally decentralized enterprises to perform targeted search and collection efforts in an efficient, defensible and highly cost-effective manner. Or, in the words of the Delaware court, it provides the ability to act reasonably.

Filed under Case Law, Cloud Data, Enterprise eDiscovery, IaaS

Judge Peck: Cloud For Enterprises Not Cost-Effective Without Efficient eDiscovery Process

Hon. Andrew J. Peck
United States Magistrate Judge

Federal Magistrate Judge Andrew Peck of the Southern District of New York is known for several important decisions affecting the eDiscovery field, including the ongoing Monique da Silva Moore v. Publicis Groupe SA, et al. case, where he issued a landmark order authorizing the use of predictive coding, otherwise known as technology assisted review. His Da Silva Moore ruling is clearly an important development, but also very noteworthy are Judge Peck’s recent public comments on eDiscovery in the cloud.

eDiscovery attorney Patrick Burke, a friend and former colleague at Guidance Software, reports on his blog some interesting comments made during the May 22 judges’ panel session at the 2012 CEIC conference. UK eDiscovery expert Chris Dale also blogged about the session, where Judge Peck noted that data stored in the cloud is considered accessible data under the Federal Rules of Civil Procedure (see, FRCP Rule 26(b)(2)(B)) and is thus treated no differently by the courts, in terms of eDiscovery preservation and production requirements, than data stored within a traditional network. This brought the following cautionary tale about the costs associated with not having a systematic process for eDiscovery:

Judge Peck told the story of a Chief Information Security Officer who had authority over e-discovery within his multi-billion dollar company who, when told that the company could enjoy significant savings by moving to “the cloud”, questioned whether the cloud provider could accommodate their needs to adapt cloud storage with the organization’s e-discovery preservation requirements. The cloud provider said it could but at such an increased cost that the company would enjoy no savings at all if it migrated to the cloud.

In previous posts on this blog, we outlined how significant cost-benefits associated with cloud migration can be negated when eDiscovery search and retrieval of that data is required.  If an organization maintains two terabytes of documents in the Amazon or other IaaS cloud deployments, how do they quickly access, search, triage and collect that data in its existing cloud environment if a critical eDiscovery or compliance search requirement suddenly arises?  This is precisely the reason why we developed X1 Rapid Discovery, version 4. X1RD is a proven and now truly cloud-deployable eDiscovery and enterprise search solution enabling our customers to quickly identify, search, and collect distributed data wherever it resides in the Infrastructure as a Service (IaaS) cloud or within the enterprise. While it is now trendy for eDiscovery software providers to re-brand their software as cloud solutions, X1RD is now uniquely deployable anywhere, anytime in the IaaS cloud within minutes. X1RD also features the ability to leverage the parallel processing power of the cloud to scale up and scale down as needed. In fact, X1RD is the first pure eDiscovery solution (not including a hosted email archive tool) to meet the technical requirements and be accepted into the Amazon AWS ISV program.

As for the major cloud providers, those that choose to solve this eDiscovery challenge (along with effective enterprise search) with best-practices technology will not only drive significant managed services revenue but will also enjoy a substantial competitive advantage over other cloud services providers.

Filed under Best Practices, Case Law, Cloud Data, Enterprise eDiscovery, IaaS, Preservation & Collection

X1 Rapid Discovery: First Enterprise eDiscovery Solution Supporting IaaS Cloud

Today I am pleased to announce our launch of X1 Rapid Discovery, version 4. X1RD is a proven and now truly cloud-deployable eDiscovery and enterprise search solution enabling our customers to quickly identify, search, and collect distributed data wherever it resides in the Infrastructure as a Service (IaaS) cloud or within the enterprise. X1RD is a sister product to our acclaimed X1 Social Discovery, which we launched last year. X1 Rapid Discovery was already a proven early case assessment and enterprise search application; with version 4 it is now IaaS cloud deployable and features a new interface.

I know what you may be thinking — another eDiscovery CEO re-branding the company’s software as cloud. But hear me out on this. Sure, X1RD can serve as a hosted SaaS solution like many other tools (SaaS hosting has been around for over a decade), but the big news here is that X1RD is now deployable anywhere, anytime in the IaaS cloud within minutes. X1RD also features the ability to leverage the parallel processing power of the cloud to scale up and scale down as needed. In fact, X1RD is the first pure eDiscovery solution (not including a hosted email archive tool) to meet the technical requirements and be accepted into the Amazon AWS ISV program.

So what does this mean? Allow me to illustrate these ground-breaking capabilities through the following two increasingly common scenarios faced by organizations today:

Scenario 1: A Fortune 1000 company maintains 2 terabytes of data in the Amazon EC2 or S3 (storage) cloud and suddenly must find the comparatively small amount of relevant data within those 2TB as quickly as possible to respond to a critical investigation requirement. There is no time to spend several weeks downloading the entire 2TB out of the cloud through a thin pipe, or to wait for Amazon personnel to copy the entire data set to hard drives and ship it back. What is urgently needed is the ability to quickly install eDiscovery software to index, search and review that data in the very IaaS cloud environment where it exists. That way only the small set of relevant data (say 10 gigabytes) is identified and then exported. That is what X1 Rapid Discovery delivers.
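The arithmetic behind Scenario 1 can be checked with a few lines of Python. The link speed and efficiency factor below are assumptions chosen for illustration, not figures from any provider; the 2 TB and 10 GB sizes come from the scenario itself.

```python
def transfer_days(size_gb, link_mbps, efficiency=0.8):
    """Days needed to move size_gb over a link rated at link_mbps megabits per second.

    efficiency is an assumed fraction of the rated speed actually sustained.
    """
    size_megabits = size_gb * 1000 * 8      # decimal units: 1 GB = 8,000 megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 86_400                 # seconds per day


LINK_MBPS = 10  # assumed sustained connection speed, purely illustrative

full_set_days = transfer_days(2_000, LINK_MBPS)   # the entire 2 TB
relevant_days = transfer_days(10, LINK_MBPS)      # only the 10 GB relevant subset

print(f"Entire 2 TB: {full_set_days:.1f} days")
print(f"Relevant 10 GB: {relevant_days * 24:.1f} hours")
```

Under these assumptions the full download takes a little over three weeks, while the relevant subset moves in under three hours. The ratio is fixed at 200 to 1 regardless of the link speed, which is why searching in place and exporting only the results matters.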

Scenario 2: The same investigation sends the company’s eDiscovery consultant overseas to collect data at a subsidiary site. Upon the collection of the first 200 gigabytes, the attorneys insist that the data be quickly indexed for detailed, iterative searching in order to better inform the remaining on-site collection effort. However, the collection team left the large ECA appliance they normally use back at home, as it doesn’t travel well and would not pass foreign customs. In this case there are several options with X1RD. If an eDiscovery software solution is truly cloud-capable, it can quickly install anywhere, including the IaaS cloud or available on-site hardware. So the team can either locate available hardware resources running Windows or upload the data to a private or public IaaS cloud environment and operate a virtual eDiscovery lab with X1RD.

X1RD can just as easily be installed behind the firewall as in the cloud, but right now all of our demos and proofs of concept are being performed in the IaaS cloud. But don’t just take our word for it: we would be happy to demonstrate this for you by remotely installing in your public or private IaaS cloud environment and collecting, indexing and searching your data. We are up for the challenge!

> Register for our live webinar on May 2 to see a demo of X1 Rapid Discovery and to hear from eDiscovery expert, Barry Murphy, on his view of the current eDiscovery market, with respect to the cloud.

Filed under Cloud Data, eDiscovery & Compliance, Enterprise eDiscovery, IaaS

Defining Truly Cloud-Capable eDiscovery Software

Last week we discussed the challenges of searching and collecting data in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud or Rackspace) for eDiscovery purposes.  Today we discuss what is needed for eDiscovery and enterprise search vendors to provide a truly cloud-capable solution and provide a decoder ring of sorts to cut through the hype.  For there is a lot of hype with the cloud becoming the latest eDiscovery hot button, with vendor marketing claims far surpassing actual capabilities.

In fact, many eDiscovery and enterprise software vendors claim to support the cloud, but are simply re-branding their long-existing SaaS offerings, which really have nothing to do with supporting IaaS. Barry Murphy of the eDiscovery Journal aptly identified this marketing practice as “cloud washing.” Data hosting, especially where the vendor’s manual labor is routinely required to upload and process data, does not meet defined cloud standards. Neither does a process that primarily exports data through APIs or other means out of its resident cloud environment to slowly migrate the cloud data to the vendor tools, instead of deploying the tools (and their processing power) to the data where it resides in the cloud. In order to truly support IaaS cloud deployments, eDiscovery and enterprise search software must meet the following three core requirements:

1. Automated installation and virtualization:  The eDiscovery and search solution must rapidly install, execute and efficiently operate in a virtualized environment without rigid hardware requirements or on-site physical access. This is impossible if the solution is fused to hardware appliances or otherwise requires a complex on-site installation process. Because hardware appliance solutions are by definition not cloud deployable, and enterprise search installations often require many months of labor to install and configure, it is a significant question whether many of these vendors will be able to support robust IaaS cloud deployments in the reasonably foreseeable future.

2. On-demand self-service: In its definition of cloud computing, the National Institute of Standards and Technology (NIST) identifies on-demand self-service as an essential characteristic of the cloud, where a “consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.”

Many hosted eDiscovery services require shipping of data to the provider or extensive behind the scenes manual labor to load and configure the systems for data ingestion. Conversely, solutions that truly support cloud IaaS will spin up, ingest data and fully operate in an automated fashion without the need for manual on-premise labor for configuration or data import.

3. Rapid elasticity: NIST describes this characteristic as capabilities that “scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.” This important benefit of cloud computing is accomplished by a parallelized software architecture designed to dynamically scale out over potentially several dozen virtualized servers to enable rapid ingestion, processing and analysis of data sets in that cloud environment. This capability would allow several terabytes of data to be indexed and processed within 2 to 4 hours on a highly automated basis at far less cost than non-cloud eDiscovery efforts.

However, many characteristics of leading eDiscovery solutions fundamentally prevent them from supporting this core cloud requirement. Most eDiscovery early case assessment solutions are developed and configured toward a monolithic processing schema designed to operate on a single expensive hardware apparatus. While recently spawning some bold marketing claims of high speeds and feeds, such architecture is very ill-suited to the cloud, which is powered by highly distributed processing across multitudes of servers. Additionally, many of the leading eDiscovery and enterprise search solutions are tightly integrated with third-party databases and other OEM technology that cannot be easily decoupled (and that also present possible licensing constraints), making such elasticity physically and even legally impossible.
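To see why scale-out matters for the rapid elasticity requirement, the following Python sketch spreads a data set across a varying number of virtual servers and estimates elapsed indexing time. Everything numeric here is an assumption for illustration: the 3 TB total, the 50 GB shard size and the per-server indexing rate are not measurements of any product, and the greedy balancing shown is just one simple way to divide work.

```python
import heapq


def assign_shards(shard_sizes_gb, workers):
    """Greedy balancing: hand each shard, largest first, to the least-loaded worker."""
    heap = [(0.0, w) for w in range(workers)]   # (assigned GB, worker id)
    heapq.heapify(heap)
    loads = [0.0] * workers
    for size in sorted(shard_sizes_gb, reverse=True):
        load, worker = heapq.heappop(heap)
        loads[worker] = load + size
        heapq.heappush(heap, (loads[worker], worker))
    return loads


def wall_clock_hours(shard_sizes_gb, workers, gb_per_hour_per_worker):
    """Workers run in parallel, so elapsed time is set by the busiest one."""
    return max(assign_shards(shard_sizes_gb, workers)) / gb_per_hour_per_worker


shards = [50.0] * 60   # assumed: 3 TB split into 60 shards of 50 GB each
RATE = 30.0            # assumed: GB indexed per hour by one virtual server

for n in (1, 12, 30, 60):
    hours = wall_clock_hours(shards, n, RATE)
    print(f"{n:>2} servers: {hours:.1f} hours")
```

With these assumed numbers a single server needs about 100 hours, while 30 servers finish in a little over 3 hours, which falls in the 2 to 4 hour range described above. A monolithic architecture bound to one machine cannot move along this curve, whatever its single-node speed.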

So is there eDiscovery software that will truly support the IaaS cloud based upon these requirements, and address up to terabytes of data?  Stay tuned….

Filed under Cloud Data, Enterprise eDiscovery, IaaS