Conducted by DDN for the fifth consecutive year, the survey drew input from more than 100 global end users across a wide range of data-intensive industries. Respondents included individuals responsible for high performance computing as well as networking and storage systems at financial services, government, higher education, life sciences, manufacturing, national lab, and oil and gas organizations. As expected, the amount of data under management in these organizations continues to grow. Of organizations surveyed:
• 85 percent manage or use more than one petabyte of data storage (up 12 percentage points from last year).
Survey respondents continue to take a nuanced approach to cloud adoption. The share of respondents planning to use cloud-based storage (encompassing both private and public clouds) for at least part of their data in 2017 jumped to 48 percent, an 11-percentage-point increase over the 2016 survey results. Despite this more positive disposition toward cloud storage, only 5 percent of respondents anticipated more than 30 percent of their data residing in the cloud. Perhaps because of this limited use, as well as the ever-improving economics of public cloud services, a full 40 percent of respondents anticipated using public cloud in some way in the coming year, even if only for a limited amount of data. By comparison, only 20 percent of respondents last year said they anticipated using public cloud storage options.
Basic adoption of flash storage in HPC data centers remains relatively flat, with approximately 90 percent of respondents using flash at some level within their data centers today. The main shift is in how much data is being retained in flash. The vast majority of respondents (76 percent) store less than 20 percent of their data on flash media, but many anticipate an increase in 2018, with a quarter of respondents expecting 20-to-30 percent of their data to be flash-based, and another 10 percent expecting 20-to-40 percent of their storage to be on a flash tier.
How customers are applying flash to their workflows is also of particular interest. A majority of survey respondents (54 percent) are primarily using flash to accelerate file system metadata. There is growing interest in using flash for application-specific data as well, with 45 percent of respondents indicating that they are using at least some of their flash storage this way. At the other end of the flash usage spectrum, few customers are using flash for user data, which is logical given the current cost difference between flash and spinning disk storage.
“Once again, DDN’s annual HPC Trends Survey reflects the developments we see in the wider HPC community. I/O performance is a huge bottleneck to unlocking the power of HPC applications in use today, and customers are beginning to realize that simply adding flash to the storage infrastructure isn’t delivering the anticipated application level improvements,” said Kurt Kuckein, director of marketing for DDN. “Suppliers are starting to offer architectures that include flash tiers optimized specifically for NVM and customers are actively pursuing implementations utilizing these technologies. Technologies like DDN’s IME are specifically targeted to have the most impact on accelerating I/O all the way to the application.”
I/O bottlenecks continue to be the main concern for HPC storage administrators, especially in I/O-intensive workflows like analytics: 76 percent of customers running analytics workloads consider I/O their top challenge. Given this, it is not surprising that only 19 percent of survey participants consider existing storage technologies sufficient to scale to exascale requirements.
A majority of respondents (68 percent) view flash-native caches as the most likely technology to resolve the I/O challenge and to push HPC storage to the next level, an eight-percentage-point increase over last year’s survey. HPC storage administrators have already evaluated, or are beginning to evaluate, flash-native cache technologies at a greater rate than before, with more than 60 percent of responses indicating that they have implemented, are evaluating now, or plan to evaluate flash-native cache solutions such as NVM. Evidence of the impact of these technologies can be seen in the recent io500.org results, where JCAHPC used a flash-based cache to achieve stellar performance and the top spot in the first annual storage I/O benchmark ranking.
As an increasing number of HPC sites move to real-world implementation of multi-site HPC collaboration, concerns about security remain at the forefront. Perhaps somewhat surprisingly, the second-highest barrier to multi-site collaboration has nothing to do with technology or security. Organizational bureaucracy was identified by 43 percent of respondents as a major impediment to data sharing. This means that even though data sharing has become technically possible as well as cost-effective, limiting perceptions still stand in the way of wider collaboration.