It goes without saying that 2020 was an unforgettable year. It was unforgettable in a different way for the major cloud service providers, all of which experienced an impressive surge in demand. Market leader AWS closed out 2020 with revenues of $45.3 billion, up nearly 30% year-over-year, and more than $13.5 billion in annual operating profit, which is 63% of Amazon’s total operating profit for the year.
Roughly 50 percent of all corporate data is stored in the cloud, according to Statista. Storing data in a cloud service eliminates the need to purchase and maintain data storage infrastructure, since the infrastructure resides within the data centers of the cloud IaaS provider and is owned and managed by the provider. Beyond cost savings, cloud storage provides valuable flexibility for data management. IT organizations are increasing data storage investments in the cloud to support backups and data replication, data tiering and archiving, and data lakes for artificial intelligence (AI) and business intelligence (BI) projects, and to reduce their physical data center footprint.
Just as with on-premises storage, in the cloud you can purchase different levels of storage based on whether the data is hot (accessed frequently) or cold (accessed rarely). This way, you are not overpaying to store data that is needed only for archives or for very occasional access. You can use data management solutions to set up policies and automatically move data to the right cloud storage class based on parameters such as age, owner and cost.
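A policy of this kind can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the tier names (`hot`, `cool`, `archive`), the 30- and 180-day thresholds, the `FileRecord` type and the `pinned_owners` parameter are all hypothetical and would need to be mapped to your provider's actual storage classes and your own access patterns.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    """Hypothetical metadata for one file under management."""
    path: str
    owner: str
    size_gb: float
    last_accessed: date


# Hypothetical (max age in days, tier name) rules, checked in order.
# Real storage classes and sensible cut-offs vary by provider and workload.
TIER_RULES = [
    (30, "hot"),    # accessed within the last 30 days
    (180, "cool"),  # accessed within the last 180 days
]
DEFAULT_TIER = "archive"  # anything older than the last rule


def choose_tier(record, today, pinned_owners=frozenset()):
    """Pick a tier from the file's owner and the days since its last access."""
    # Owner-based override: some teams may need their data kept hot regardless of age.
    if record.owner in pinned_owners:
        return "hot"
    age_days = (today - record.last_accessed).days
    for max_age, tier in TIER_RULES:
        if age_days <= max_age:
            return tier
    return DEFAULT_TIER
```

For example, `choose_tier(FileRecord("scan.bam", "genomics", 120.0, date(2020, 1, 1)), date(2021, 6, 1))` falls through both rules and lands in the archive tier. Cost could be added as a further parameter in the same way, for instance by skipping a move when the transfer would cost more than the storage it saves.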
The leading use case for cloud storage today is handling the petabytes of unstructured data that enterprises are amassing: file data from many different applications such as genomic sequencing, electric cars, bodycam videos, Internet of Things (IoT), seismic analysis and collaboration tools. Migrating file data to the cloud is hard because it can take a long time and entails unique requirements regarding access controls and security. Depending on the type and volume of data you wish to move to the cloud, you will need to adjust your strategies appropriately.
Here are some considerations to keep in mind as you evolve your cloud data management strategy, so that you avoid getting burned on cost and performance:
· Secondary storage tier gotchas. Enterprise IT organizations are increasingly seeing the value of the cloud as a secondary or tertiary storage tier because it frees up space on expensive on-premises storage and allows you to leverage the cloud for AI and analytics. However, it's easy to get burned when a storage vendor writes data to the cloud in a proprietary format. Data in a non-native format must be read through the vendor’s application, which makes it difficult for other applications to use. In some cases, the data must also be rehydrated to the source and then moved again before it can be used. Make sure that you understand the limitations of moving your data to the cloud and whether the data will be stored in a format that your common use cases can work with.
· Managing shadow IT. It’s true: shadow IT is no longer a dirty word. But opening up the cloud to your workforce without guardrails can get messy quickly. By creating a well-defined strategy and data governance process for the cloud, you can minimize the negative effects of shadow IT while still allowing employees to experiment safely with approved apps and services.
· A worsening problem of data islands. The cloud, for all its merits, has added data silos, and the multi-cloud movement has scattered them even further. Clouds have different storage classes and tiers for file and object storage, all of which need to be leveraged for a cost-effective file data management strategy. The result is more silos to manage. Regardless, hybrid IT is here to stay for most midsize to large enterprises, which means that IT leaders need to determine how to get a central view and management plane for data and assets. This doesn’t mean that you need to store all the data in one place, but you will need visibility to move data and workloads around as needed based on cost, performance and business requirements.
· Hidden costs. The challenges of cloud sprawl and VM sprawl have been known for quite some time. Moving to the cloud requires constant oversight to ensure that you aren’t wasting money on unused or poorly used resources. Another issue, however, is making sure that file data is managed and tiered appropriately; don’t manage cold and hot data the same way or you will be hit with nasty egress fees and unnecessary API costs. A large government agency was recently in the news for spending millions of dollars on egress fees because the data it moved to the cloud was in fact accessed frequently. Ouch. Understand your data, and all the areas where the cloud can bite you. Be sure to talk to your IT vendors about these risks and how to avoid them.
· Skills, skills! Yes, the talent gap remains large in technology, so IT leaders must always factor this into the equation when making dramatic changes in strategy. A recent CompTIA survey found that 74% of large firms will be hiring for IT and technology roles in 2021, with a particular focus on advanced infrastructure, AI and data science, and people skills for remote collaboration.
· Unrealistic expectations for savings. Over the long haul, an organization can easily save on cloud storage versus maintaining a lot of technology inside the corporate data center. But this requires a well-defined data strategy. It’s better to think about the benefits of moving from a CapEx model to a more predictable OpEx spending model, without the hidden, intangible expenses that come with traditional IT. As you optimize cloud infrastructure, you won’t have to worry about expensive hardware sitting in your data center, cooling costs, regular fire drills and the hassle of maintaining and securing everything.
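To make the hidden-costs point above concrete, here is a rough back-of-the-envelope sketch of how access frequency can erase the savings of a cheaper tier. Every number in it is a made-up placeholder rather than any provider's price list, and the `TierPrice` type and `monthly_cost` function are hypothetical names introduced for illustration; real bills also include API request charges and minimum storage durations that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class TierPrice:
    """Hypothetical per-GB prices for one storage tier (USD)."""
    storage_per_gb_month: float
    retrieval_per_gb: float
    egress_per_gb: float


def monthly_cost(size_gb, full_reads_per_month, price):
    """Estimate one month's bill: storage plus retrieval and egress for each full read."""
    storage = size_gb * price.storage_per_gb_month
    per_read = size_gb * (price.retrieval_per_gb + price.egress_per_gb)
    return storage + full_reads_per_month * per_read


# Placeholder prices, NOT real provider rates.
HOT = TierPrice(storage_per_gb_month=0.02, retrieval_per_gb=0.0, egress_per_gb=0.09)
ARCHIVE = TierPrice(storage_per_gb_month=0.004, retrieval_per_gb=0.02, egress_per_gb=0.09)

if __name__ == "__main__":
    size_gb = 10_000
    for reads in (0, 1, 4):
        hot = monthly_cost(size_gb, reads, HOT)
        archive = monthly_cost(size_gb, reads, ARCHIVE)
        print(f"{reads} full reads/month: hot ${hot:,.2f} vs archive ${archive:,.2f}")
```

With these placeholder prices, the archive tier is far cheaper while the data sits untouched, but it becomes the more expensive option as soon as the full data set is pulled out once a month. That is the pattern to check for before tiering data you assume is cold.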
Thinking for the long term
There is untold value in the massive amounts of unstructured data that organizations are storing; some estimates suggest that only 1 to 2% of this data is actually being used. Have the necessary conversations with your vendors, consultants and in-house stakeholders to clearly understand all of your data assets: where they reside, who’s using them and how often, and their strategic value to the organization. By gathering this information, you will be able to make informed decisions about your data and where it should live. These decisions will evolve with business needs, so ensure that you have the means to continually analyze your assets and adjust your strategy as needed.
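As a starting point for that kind of analysis, the sketch below walks a file share and totals the bytes stored by how long ago each file was last accessed. It is a simplified illustration using only the Python standard library; the function name `bytes_by_access_age` and the 30/180/365-day buckets are assumptions made for this example. It also relies on file access times, which some systems update lazily or not at all, so treat the output as a rough indication.

```python
import os
import time
from collections import Counter


def bytes_by_access_age(root, now=None, buckets=(30, 180, 365)):
    """Walk `root` and sum file sizes into buckets by days since last access."""
    now = time.time() if now is None else now
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that disappear or cannot be read
            age_days = (now - info.st_atime) / 86400
            # Use the first bucket the file fits in, or a catch-all for older files.
            label = next(
                (f"<= {limit} days" for limit in buckets if age_days <= limit),
                f"> {buckets[-1]} days",
            )
            totals[label] += info.st_size
    return dict(totals)
```

Calling `bytes_by_access_age("/path/to/share")` returns a dictionary mapping each age bucket to a byte count. The same walk could be extended to group by owner or file type to answer the "who's using it" question.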