Tag Archives: cloud-based data

Cloud Search Is Important, But Only A Piece Of The Enterprise Search Puzzle

by Barry Murphy

In an earlier post, I described the importance of being able to quickly search for information stored in the Cloud. The post pointed out that Cloud search is more complicated than it first appears, because search speed depends on how close the index lives to the actual data in the Cloud infrastructure. One comment I received was that Cloud search can be fast and simple if the Cloud vendor promises a certain service level for query times and results. That can address part of the search issue (although IaaS providers – what we are truly talking about when we say “Cloud” – are typically not interested in guaranteeing SLAs for things like search, because they let customers provision their own infrastructure to enable fast search with products like X1 Rapid Discovery). Even if a Cloud vendor were to guarantee phenomenal search SLAs, the issue of unified enterprise search across all information would still remain.

The reality is that enterprises and government agencies store information in “hybrid” environments that encompass on-premise systems within corporate data centers, virtualized systems that companies operate, and Cloud-based repositories. Research firm Gartner predicts that by 2017, half of mainstream enterprises will have a hybrid cloud. And research from NetApp shows that organizations will be managing data across multiple cloud environments, not just a single provider's.

These are exciting developments.  As organizations embrace more modern infrastructures, there are many benefits to be had.  What we need to remember, however, is that business professionals still need to quickly find and take action on their information assets to do their jobs.  As that information gets further scattered, enterprise search will take on increased importance.  Workers don’t care if their data is stored on-premise or in the Cloud as long as they can quickly find it in an easy-to-use interface.

The challenge for today’s organizations is that information now lives in multiple infrastructures – on-premise, virtual, Cloud, or most frequently, a hybrid of all of these.  Current approaches to including Cloud-based data in enterprise search and eDiscovery require downloading a copy of the data to search so that it resides alongside other local content.  Unfortunately, that defeats the purpose of storing the data in the Cloud in the first place.

This takes me back to my original point: Cloud search is very important. But Cloud search cannot simply exist in a vacuum. An effective enterprise search solution will combine on-premise search capabilities with search in the Cloud, so that users can search across all data without first downloading the cloud-based information.
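To make the idea concrete, here is a minimal sketch of searching in place rather than downloading first. It assumes each environment keeps a small index next to its own data and returns only hits to the caller. The names `SiteIndex` and `federated_search` are hypothetical, invented for illustration, and do not describe any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class SiteIndex:
    """Hypothetical index that lives alongside the data it covers."""
    name: str
    docs: dict[str, str]  # doc_id -> document text, stored at this site
    inverted: dict[str, set[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Build a simple inverted index: term -> ids of documents containing it.
        for doc_id, text in self.docs.items():
            for term in set(text.lower().split()):
                self.inverted.setdefault(term, set()).add(doc_id)

    def search(self, query: str) -> list[tuple[str, str]]:
        """Return (site, doc_id) pairs for documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.inverted.get(t, set()) for t in terms))
        return [(self.name, doc_id) for doc_id in sorted(hits)]


def federated_search(sites: list[SiteIndex], query: str) -> list[tuple[str, str]]:
    """Send the query to every site and merge the hits.

    Only hit identifiers travel back to the caller; the documents
    themselves stay where they are stored.
    """
    results: list[tuple[str, str]] = []
    for site in sites:
        results.extend(site.search(query))
    return results


if __name__ == "__main__":
    on_prem = SiteIndex("on-premise", {"a.docx": "quarterly contract review",
                                       "b.txt": "lunch menu"})
    cloud = SiteIndex("cloud", {"c.pdf": "signed contract for review"})
    for site_name, doc_id in federated_search([on_prem, cloud], "contract review"):
        print(site_name, doc_id)
```

The design point is that the query moves to the data, not the other way around: each site answers from its local index, and the merged result list is all that crosses the network.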

Filed under Cloud Data, Enterprise Search

eDiscovery Search and Collection in the Cloud

After several dozen posts on social media eDiscovery, we are going to focus the next few weeks on the related issue of eDiscovery in the cloud. As we see it, despite the enormous cost benefits of the cloud, concerns about the feasibility of eDiscovery and general search across an organization’s critical cloud-resident data have to some degree prevented broader adoption.

The cloud means many things to many people, but I believe the real eDiscovery action (and pain point) is in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud, Rackspace, or pure enterprise cloud providers such as Fujitsu). According to a recent PwC report, Cloud IaaS will account for 30% of IT expenditures by 2014. IaaS currently provides the means for organizations to aggressively store and virtualize their enterprise data and software, thus potentially spawning the same large data volumes, and the same critical search and eDiscovery requirements, as traditional enterprise environments. In our discussions with Amazon Web Services, the leading IaaS cloud provider, they report extensive customer eDiscovery requirements that are currently addressed by inefficient, manual means. So for purposes of this discussion, we will focus on IaaS, which is essentially cloud for the enterprise and where there is a significant current eDiscovery challenge.

So if an organization maintains two terabytes of documents in the Amazon or Rackspace cloud, how does it quickly access, search, triage and collect that data in its existing cloud environment if a critical eDiscovery or compliance search requirement suddenly arises? This scenario is a significant current pain point for IaaS cloud. In such situations, the organization typically resorts to one of two agonizingly inefficient processes. The first option involves shipping hard drives to the provider so that its IT staff can copy the data in bulk, and then having the drives shipped back. According to Rackspace’s guidelines, a transfer of 2 terabytes of bulk files would cost over $10,000 in fees and take about four to six weeks. And then all the company gets is a full 2 terabyte duplicate of its data that still must be searched, processed and reviewed.

The other alternative is to slowly download the data through a secure file transfer protocol connection. However, even with a robust T2 line, it would take three to six weeks to transfer the two terabytes, depending on how much bandwidth IT is willing to dedicate to the exercise.
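The download estimate can be sanity-checked with simple arithmetic. The sketch below assumes a T2 line's nominal rate of 6.312 Mbit/s, takes 2 terabytes to mean 2 × 10^12 bytes, and ignores protocol overhead. The function name `transfer_weeks` and the `utilization` parameter (the share of the line given over to the transfer) are hypothetical, introduced only for illustration.

```python
T2_MBPS = 6.312  # nominal T2 (T-carrier) line rate, in megabits per second
SECONDS_PER_WEEK = 7 * 24 * 3600


def transfer_weeks(size_bytes: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Estimate the weeks needed to move size_bytes over a link.

    utilization is the fraction of the link dedicated to the transfer
    (an assumed knob; real throughput also depends on protocol overhead).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000 * utilization)
    return seconds / SECONDS_PER_WEEK


if __name__ == "__main__":
    two_tb = 2 * 10**12  # decimal terabytes, an assumption
    for share in (1.0, 0.7):
        weeks = transfer_weeks(two_tb, T2_MBPS, share)
        print(f"{share:.0%} of the line: {weeks:.1f} weeks")
```

Under these assumptions, the full line needs a little over four weeks, and dedicating about 70% of it stretches the transfer to roughly six weeks, which falls within the range quoted above.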

So what is needed is robust eDiscovery software that can truly support the IaaS cloud where the data resides, without first requiring mass data export. We will discuss what that entails and the requirements of truly cloud-capable eDiscovery software in our next post, so please stay tuned!

Filed under Cloud Data, IaaS

The Future for eDiscovery: Social Media and the Cloud

Greetings and welcome to all. This is the inaugural post of Next Generation eDiscovery, a blog that will focus on legal, technical and compliance issues related to the collection, preservation and early case assessment of social media and other cloud-based data. To provide some context, the team here at X1 Discovery is experienced in developing and supporting technology for collecting electronic evidence in the enterprise to meet eDiscovery and investigation requirements. Many of us hail from Guidance Software, the developer of EnCase, which is the leading eDiscovery and investigative solution for collecting from hard drives, both standalone and within the enterprise. And now we turn our focus to current trends and the future.

And the future for eDiscovery is about social media and the cloud. In fact, it seems that only this year has social media become a compelling issue in eDiscovery, and it is now reaching critical mass given the rising level of discourse. With over 700 million Facebook users and 200 million people with Twitter accounts, evidence from social media sites can be relevant to just about every litigation dispute and investigation matter. Social media evidence is widely discoverable and generally not subject to privacy constraints when established to be relevant to a case, particularly when that data is held by a party to litigation or even a key witness.

It seems like in recent months there has been much talk in the eDiscovery and digital investigation fields about social media, mostly outlining the scope of the problem and the need to put corporate policies and procedures in place concerning social media.  That discussion is an important first step, but it’s time for actual solutions in terms of technical, legal and investigation techniques. This blog will seek to identify and foster discussion points, educate, and even pontificate at times but also learn from our readers, customers and non-customers alike. We look forward to the dialogue.

Filed under Preservation & Collection