Tag Archives: cloud-based data

Amazon Re:Invent – With the Cloud, Avoid Mistakes of the Past

Last week, I had the opportunity to attend the Amazon Re:Invent conference in Las Vegas. Over 13,000 people took over the Palazzo for deep dive technical sessions to learn how to harness the power of Amazon Web Services (AWS). This show had a much different energy than other enterprise software conferences, such as VMworld. Whereas most conferences feature a great deal of selling and marketing by the host, Amazon Re:Invent was truly more of a training show. Cloud architects spent a lot of time in technical bootcamps learning how AWS works and getting certified as administrators.

That is not to say that there was no selling or marketing going on; the exhibition hall featured myriad vendors that augment or assist with AWS deployments and solutions. The focus on the deep technical details, though, does point out the fact that we are still in the very early days of the cloud. Most of the focus of the keynotes was about getting compute workloads to the cloud – there was not a lot of mention of moving actual data to the cloud, even though that is certainly beginning to happen.  But, that is how the evolution goes. IT departments need to be comfortable moving workloads to the cloud as they begin to leverage the cloud. Building this foundation is also important to Amazon – the goal would be for many companies to completely outsource the IT data center.

It is important, however, to proactively plan for information management as more workloads and, importantly, data move to the cloud. As the internet first emerged, companies dove into new technologies like email and network file shares only to create eDiscovery nightmares and make it virtually impossible to find information within digital landfills. It is key to learn from those mistakes rather than to repeat them when leveraging cloud-based technologies: organizations must ensure both that end-users are happy with search experiences on data in the cloud and that Legal can do what it needs to do from an eDiscovery standpoint. This means providing business workers with unified access to email, files, and SharePoint information regardless of where the data lives. It also means giving Legal teams fast search queries and collections. But Cloud search is slow, because indexes live far from the information. This results in frustrated workers and Legal teams afraid that eDiscovery cannot be completed in time.

If a customer wanted to speed up search, it would essentially have to attach an appliance to a hot-air balloon and send it up to the Cloud provider so that the customer’s index could live on that appliance (or farm of appliances) in the Cloud provider’s data center, physically near the data. There are many reasons, however, that a Cloud provider would not allow a customer to do that:

  • Long install process
  • Challenging prerequisites
  • 3rd party installation concerns
  • Physical access
  • Specific hardware requirements
  • Appliances only scale vertically

The solution to a faster search is a cloud-deployable search application, such as X1 Rapid Discovery. This creates a win-win for Cloud providers and customers alike. As enterprises move more and more information to the Cloud, it will be important to think about workers’ experiences with Cloud systems – and search is one of those user experiences that, if it is a bad one, can really negatively affect a project and cause user revolt. eDiscovery is also a major concern – I’ve worked with organizations that moved data to the cloud before planning how they would handle eDiscovery. That left Legal teams to clean up messes, or more realistically, just deal with the messes. By thinking about these issues before moving data to the cloud, it is possible to avoid these painful occurrences and leverage the cloud without headaches. At X1, we look forward to working closely with Amazon to help customers have the search and eDiscovery solutions they need as more and more data goes to AWS.


Filed under Cloud Data, eDiscovery & Compliance, Enterprise eDiscovery, Enterprise Search, Hybrid Search, Information Access, Information Governance, Information Management

Cloud Search Is Important, But Only A Piece Of The Enterprise Search Puzzle

by Barry Murphy

In an earlier post, I described the importance of having the ability to quickly search for information stored in the Cloud. The post pointed out that Cloud search is somewhat more complicated than one might think at first glance because the speed of search is affected by how close the index lives to the actual data in the Cloud infrastructure. One comment I received was that Cloud search can be fast and simple if the Cloud vendor promises a certain service level for query times and results. That can address part of the issue around search (although IaaS providers – what we are truly talking about when we say “Cloud” – are typically not interested in guaranteeing SLAs for things like search because they allow customers to provision their own infrastructure to enable fast search with products like X1 Rapid Discovery). Even if a Cloud vendor were to guarantee phenomenal search SLAs, the issue of unified enterprise search of all information still remains.

The reality is that enterprises and government agencies store information in “hybrid” environments that encompass on-premise systems within corporate data centers, virtualized systems that companies operate, and Cloud-based repositories.  Research firm Gartner predicts that by 2017, half of mainstream enterprises will have a hybrid cloud.  And, research from NetApp shows that organizations will be managing data across multiple cloud environments, not just a single provider, per se.


These are exciting developments.  As organizations embrace more modern infrastructures, there are many benefits to be had.  What we need to remember, however, is that business professionals still need to quickly find and take action on their information assets to do their jobs.  As that information gets further scattered, enterprise search will take on increased importance.  Workers don’t care if their data is stored on-premise or in the Cloud as long as they can quickly find it in an easy-to-use interface.

The challenge for today’s organizations is that information now lives in multiple infrastructures – on-premise, virtual, Cloud, or most frequently, a hybrid of all of these.  Current approaches to including Cloud-based data in enterprise search and eDiscovery require downloading a copy of the data to search so that it resides alongside other local content.  Unfortunately, that defeats the purpose of storing the data in the Cloud in the first place.

This takes me back to my original point: Cloud search is very important. But Cloud search cannot simply exist in a vacuum. An effective enterprise search solution will combine on-premise search capabilities with search in the Cloud, so that all data can be searched without first downloading the cloud-based information.
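To make the idea concrete, here is a minimal sketch of a search that spans on-premise and Cloud locations without copying the data. It is an illustration only, not how X1 Rapid Discovery or any other product works: the `LocalIndex` class, the `federated_search` function, the sample documents, and the term-count scoring are all hypothetical. The point it shows is that the query travels to each index and only the hits travel back.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hit:
    location: str  # which index answered, e.g. "on-premise" or "cloud"
    doc_id: str
    score: int     # number of query terms found in the document


class LocalIndex:
    """Hypothetical toy index standing in for an index that lives next to its data."""

    def __init__(self, location: str, documents: Dict[str, str]) -> None:
        self.location = location
        self._documents = documents

    def search(self, query: str) -> List[Hit]:
        # Score each document by how many query terms it contains.
        terms = query.lower().split()
        hits = []
        for doc_id, text in self._documents.items():
            words = set(text.lower().split())
            score = sum(1 for term in terms if term in words)
            if score > 0:
                hits.append(Hit(self.location, doc_id, score))
        return hits


def federated_search(query: str, indexes: List[LocalIndex]) -> List[Hit]:
    """Send the query to every index and merge the hits; no documents are copied."""
    merged: List[Hit] = []
    for index in indexes:
        merged.extend(index.search(query))
    # Best score first; ties broken by location and document id for stable output.
    return sorted(merged, key=lambda hit: (-hit.score, hit.location, hit.doc_id))


# Made-up sample data for illustration.
on_prem = LocalIndex("on-premise", {"memo-1": "quarterly contract review", "memo-2": "lunch menu"})
cloud = LocalIndex("cloud", {"mail-7": "contract dispute with vendor", "mail-9": "contract review schedule"})

results = federated_search("contract review", [on_prem, cloud])
for hit in results:
    print(hit.location, hit.doc_id, hit.score)
```

In a real deployment each index would be a remote service rather than an in-memory object, but the shape is the same: one query fans out, and a single ranked list comes back.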


Filed under Cloud Data, Enterprise Search

eDiscovery Search and Collection in the Cloud

After several dozen posts on social media eDiscovery, we are going to focus the next few weeks on the related issue of eDiscovery in the cloud. As we see it, despite the enormous cost benefits of the cloud, concerns about the feasibility of eDiscovery and general search across an organization’s critical cloud-resident data have to some degree prevented broader adoption.

The cloud means many things to many people, but I believe the real eDiscovery action (and pain point) is in Infrastructure as a Service (IaaS) cloud deployments (such as the Amazon cloud, Rackspace, or pure enterprise cloud providers such as Fujitsu). According to a recent PwC report, Cloud IaaS will account for 30% of IT expenditures by 2014. IaaS currently provides the means for organizations to aggressively store and virtualize their enterprise data and software, thus potentially spawning the same large data volumes and the same critical search and eDiscovery requirements as traditional enterprise environments. Amazon Web Services, the leading IaaS cloud provider, has told us in our discussions that its customers have extensive eDiscovery requirements that are currently addressed by inefficient and manual means. So for purposes of this discussion, we will focus on IaaS, which is essentially cloud for the enterprise and where there is a significant eDiscovery challenge today.

So if an organization maintains two terabytes of documents in the Amazon or Rackspace cloud, how do they quickly access, search, triage and collect that data in its existing cloud environment if a critical eDiscovery or compliance search requirement suddenly arises? This scenario is a current significant pain point for IaaS cloud.  In such situations, the organization is typically resorting to one of two agonizingly inefficient processes. The first option involves shipping the provider hard drives for their IT staff to copy the data in bulk for download and having that data shipped back. Rackspace’s guidelines provide that a transfer of 2 terabytes of bulk files would cost over $10,000 in fees and require about four to six weeks. And then all the company gets is a full 2 terabyte duplicate of its data that still must be searched, processed and reviewed.

The other alternative is to slowly download the data through a secure file transfer protocol connection. However, even with a robust T2 line, it would take three to six weeks to transfer the two terabytes, depending on how much bandwidth IT would be willing to dedicate to the exercise.
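The download estimate can be sanity-checked with simple arithmetic. The sketch below assumes the nominal T2 rate of 6.312 Mbit/s, decimal terabytes, and no protocol overhead; the `transfer_weeks` function and the 70% utilization figure are illustrative assumptions, not numbers from any provider. Under those assumptions the transfer takes roughly four weeks at the full line rate and about six weeks at a 70% share of the line.

```python
T2_LINE_BITS_PER_SECOND = 6_312_000  # nominal T2 rate of 6.312 Mbit/s
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def transfer_weeks(size_bytes: float, link_bits_per_second: float, utilization: float = 1.0) -> float:
    """Weeks needed to move size_bytes over a link, given the share of the link in use."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_WEEK


TWO_TERABYTES = 2 * 10**12  # decimal terabytes (an assumption)

# 100% of the line versus a hypothetical 70% share granted by IT.
for share in (1.0, 0.7):
    weeks = transfer_weeks(TWO_TERABYTES, T2_LINE_BITS_PER_SECOND, share)
    print(f"{share:.0%} of the line: {weeks:.1f} weeks")
```

Real transfers would also pay for encryption and protocol overhead, so these figures are best read as lower bounds.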

So what is needed is robust eDiscovery software that can truly support the IaaS cloud where the data resides without first requiring mass data export. We will discuss what that entails and the requirements of truly cloud capable eDiscovery software in our next post, so please stay tuned!


Filed under Cloud Data, IaaS

The Future for eDiscovery: Social Media and the Cloud

Greetings and welcome to all. This is the inaugural post of Next Generation eDiscovery, a blog that will focus on legal, technical and compliance issues related to the collection, preservation and early case assessment of social media and other cloud-based data. To provide some context, the team here at X1 Discovery is experienced in developing and supporting technology for collecting electronic evidence in the enterprise to meet eDiscovery and investigation requirements. Many of us hail from Guidance Software, the developer of EnCase, which is the leading eDiscovery and investigative solution for collecting from hard drives, both standalone and within the enterprise. And now we turn our focus to current trends and the future.

And the future for eDiscovery is about social media and the cloud. In fact, it seems that only this year did social media become a compelling issue in eDiscovery, and it is now reaching critical mass given the rising level of discourse. With over 700 million Facebook users and 200 million people with Twitter accounts, evidence from social media sites can be relevant to just about every litigation dispute and investigation matter. Social media evidence is widely discoverable and generally not subject to privacy constraints when established to be relevant to a case, particularly when that data is held by a party to litigation or even a key witness.

It seems like in recent months there has been much talk in the eDiscovery and digital investigation fields about social media, mostly outlining the scope of the problem and the need to put corporate policies and procedures in place concerning social media.  That discussion is an important first step, but it’s time for actual solutions in terms of technical, legal and investigation techniques. This blog will seek to identify and foster discussion points, educate, and even pontificate at times but also learn from our readers, customers and non-customers alike. We look forward to the dialogue.


Filed under Preservation & Collection