This is a guest blog by Eric Burgener, Research Director, Storage at IDC.
The build-out of 3rd Platform computing has driven the emergence of new storage architectures. These architectures, including all-flash arrays (AFAs) and hyperconverged infrastructure (HCI), were needed to address a number of issues with legacy storage infrastructure. Performance, data growth (and the associated ease of expansion requirements), administrative productivity, reliability and efficiency (in terms of energy and floor space consumption) are among the top concerns. For many enterprises, legacy applications like relational databases, messaging and collaboration platforms, and file shares must continue to be supported even as next generation applications (NGAs) are hosted on the same consolidated virtual infrastructure. The markets driven by these requirements are already quite large, and by 2019 IDC expects AFAs and HCI to generate revenues of about $5.5 billion and $4 billion respectively. This growth has all occurred in the span of five to six years since the products in these markets were first introduced.
For the next decade, IDC expects 3rd Platform computing to dominate IT infrastructure decisions. NGAs in the areas of mobile computing, social media, big data/analytics and cloud are opening up significant growth opportunities for forward-thinking businesses as they pursue new customers and new markets with new services that could not have existed in the recent past. One of the defining features of NGAs is scale: these applications easily require millions of IOPS and work with extremely large data sets, driving the need for massive bandwidth and capacities in the petabyte (PB) range and beyond. Many NGAs must accommodate massive data ingest on a worldwide scale with latencies lower than what AFAs can deliver today. Many new customer and market opportunities are based on real-time analytics that can quickly turn this data into market intelligence that drives differentiating value. This value is worth a premium.
Big data/analytics promises to bring new and unexpected insights to innovators, developers and marketers, and will change the way businesses market and sell their products. Businesses must be able to collect and handle data sets that are orders of magnitude larger than what they have dealt with in the past. Speed is of the essence in analyzing and leveraging opportunities that may be transient or simply not available with conventional analytics. Within just a few short years, businesses that are not heavily leveraging real-time analytics, or that lack the IT infrastructure flexibility to rapidly respond to the opportunities such analytics technology uncovers, will be at a significant competitive disadvantage. Those businesses that understand how the shift to 3rd Platform computing has driven the AFA and HCI markets should clearly understand how the future of big data/analytics is already driving another storage infrastructure shift.
To handle the real-time analytics requirements in evolving big data repositories, organizations have tried to use AFAs. These systems, however, were designed to use much smaller data sets and have a limited ability to accommodate the increasingly massive data sets of this emerging era. In particular, AFAs have difficulty ingesting new data while at the same time performing the real-time analytics these customers are looking for. As a result, AFAs used in these types of environments require significant manual labor: the workloads must be partitioned and spread across multiple systems, often with multiple copies of these partitioned data sets. Multiple copies are needed to provide the performance necessary to meet the SLAs of different applications, but this leads to an inefficient use of storage capacity. AFAs also lack the bandwidth to deal with the extract, transform and load (ETL) and decision support requirements of these “data at scale” environments. Consequently, analysts and administrators spend a lot of time trying to tune systems that fundamentally lack the ability to handle this kind of scale.
The emerging storage architectures designed to deal with these requirements will likely offer several key technology differentiators. First, to deliver the performance (in terms of both latencies and throughput) at scale, the host connection between the servers and the arrays must evolve to support consistent latencies in the sub-100-microsecond range. An obvious solution to this problem is to extend internal server buses to accommodate shared storage. Second, the system should be built around the use of memory-based storage media without any requirement to maintain compatibility with spinning disk. Emerging memory technologies offer huge opportunities to improve reliability, lower energy consumption, and improve storage density when not fettered by legacy compatibility issues, all characteristics that are particularly important at scale. Third, the platform needs to accommodate multiple data types (structured, unstructured and semi-structured) simultaneously and natively without inefficiencies. To maximize their ability to leverage data to identify opportunities, businesses will need to effectively use all types of data without any preference between them. And fourth, we need to move away from today’s relatively heavyweight I/O stacks to ones that are specifically developed for use with this new system architecture. Many of the data analytics applications are custom-written, and the existence of an API to take advantage of this leaner (and much lower latency) I/O stack would offer significant advantages to developers looking to optimize the performance, reliability and efficiency of the storage system.
IDC is starting to see evidence of the emergence of next generation storage architectures designed specifically to deal with the problems of data analytics at scale. Given the size of the big data/analytics market in the coming years, the related storage infrastructure spend is likely to be much larger over time than what we have seen thus far from the emerging storage architectures. 2016 will prove to be an interesting year as these big data-oriented storage solutions start to be unveiled and made generally available.