Avoiding the pitfalls of cloud storage failure

With so much choice in the storage industry, it’s easy to be confused by the pros and cons of various implementations, whether the choice is between physical or virtual infrastructures, hybrid cloud or fully remote, or even hard disks versus solid state. What is key is that any storage solution implementation carries an appropriate balance of cost, growth and risk. By Gavin McLaughlin, VP of Worldwide Marketing at X-IO.


This is important for many businesses, but is especially critical for cloud or managed service providers who provide a platform for multiple organisations. In these cases, an outage or performance issue can have devastating results, not only for a provider’s reputation but also for its bottom line.


This reality has hit home for customers of Dimension Data in Australia (see http://forums.theregister.co.uk/forum/1/2014/07/04/dimension_data_in_cloud_outage/ ). It’s hard to know exactly what went on, but Dimension Data has been quite open in admitting it suffered an outage on its EMC storage implementation. The result was no service to customers for more than 24 hours – OUCH!


Sadly this is an all too common occurrence for storage architectures being implemented in “cloud” data centres, but it doesn’t have to be this way. The two most common causes of storage failures in enterprise data centres are:
1. Human error (e.g. knocked cable, wrong controller rebooted, wrong drive pulled);
2. Drive failure, either a RAID rebuild or multiple failures causing outage.


Both of these scenarios are entirely avoidable through the realisation of true zero-touch storage. The storage industry has done a fantastic job of conditioning storage buyers and administrators into believing hard disk failure and subsequent replacement is entirely acceptable and poses no risk. This couldn’t be further from the truth.
I myself have worked (many, many years ago) as a storage field engineer and I’ve seen, heard and (I have to confess) been involved in horror stories involving either human error or multiple drive failures resulting in outage and/or data loss. It doesn’t have to be this way.


A trend is emerging of all-flash array vendors arguing these issues can easily be solved by moving to an all-SSD/flash architecture, but they forget to mention the same problem can easily occur again. Don’t believe the hype that says “there’s no moving parts so there’s nothing to fail”. The truth is all drives have the ability to fail, spinning or non-spinning. The only way to avoid drive failure is to have the ability to repair drives in-situ with no impact on the workload.


The ideal storage for cloud providers is something that is:
1. Truly zero-touch. Many cloud and managed service providers have remote or even third-party data centres – wouldn’t it be nice if they never had to go near a storage array?
2. Consistent. Storage should give consistent performance and reliability regardless of its utilisation. It should give the same performance at 99% capacity utilisation as it does at 1%.
3. Scalable. You shouldn’t have to buy a 500-disk monster array upfront to get predictable performance, nor should you have to suffer when you add an extra shelf of disks.
4. Commercially viable. At the end of the day, it’s key that predictability and reliability don’t come at a cost that breaks the business model of the service provider.
The good news for cloud providers is that it’s already here. It’s true that not all storage is created equal but that doesn’t mean the right storage for cloud providers doesn’t exist. All they have to do is take the time to look for it.
 
