The dust of VMworld 2014 hasn’t even settled yet, and while following the event via Twitter and blogs, I realized how glad I am to work in the field of virtualization, and especially in the area of storage. Why, you may ask? Because I think the storage market is one of the most interesting areas in IT to work in. There are so many different solutions and approaches out there, all trying to solve the problems and challenges that come with storing your most critical business asset: your data. More and more startups are coming out of stealth to challenge the big players like EMC, while those are not standing still either.
It’s quite a challenge to keep pace with the evolving market and to stay up to date with all those vendors and solutions, but it’s a fun one! That’s why I thought it might be a good idea to give those of you who are not following the market as closely a quick walkthrough of today’s major approaches and solutions.
Let’s start with the most basic solution, traditional “Block Storage”, which still makes up a big portion of the market. It comes in many different varieties, ranging from rather simple block storage like Dell’s MD3, HP’s MSA or Nexsan’s E-Series up to very well-known solutions like NetApp’s FAS series or EMC’s VNX. The latter include far more intelligence and their own layer of abstraction to offer features like Thin-Provisioning, Storage Tiering, De-Duplication, etc. Over time they have built up a rich ecosystem, especially around data protection, which still makes them very attractive for many customers. These arrays usually scale up by adding more disks or by replacing the existing dual controllers with faster ones.
More about storage architectures can be found on Chad’s blog.
However, the range in this category is pretty big, from entry-level systems up to really advanced enterprise-grade solutions. I’ve put them into a single category just to keep the overview simple.
No doubt customers will still go for them as long as the other solutions I’m about to cover are still in the process of scaling down to SMB customers.
Combined with a “Storage Hypervisor” like DataCore’s SANsymphony-V or FalconStor, which abstracts the provided capacity and performance, they can be even more attractive to SMB customers. These software solutions act as a hypervisor for your storage, which enables customers to turn all sorts of storage units into a smart solution with features like Thin-Provisioning, Storage Tiering, etc. The software layer abstracts the provided capacity and pools it together. This allows you to spread your data across multiple physical storage arrays and thereby improve performance. They even compete with solutions like EMC VPLEX, because this software layer allows you to create stretched clusters across sites. And because those solutions are not tied to any hardware, they can also run as virtual machines to create your own virtual SAN/hyper-converged solution. But this usually goes hand in hand with additional complexity due to the multiple layers that are required to build such a solution.
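To make the pooling and Thin-Provisioning idea a bit more tangible, here is a minimal Python sketch. It is purely illustrative and not how DataCore, FalconStor or any other vendor implements it; the class names, the extent size and the array names are all made up for this example. The point is simply that a volume can advertise more capacity than exists, and that physical extents are only taken from the pool (and spread across the backing arrays) on first write.

```python
class StoragePool:
    """Pools capacity from several backend arrays into fixed-size extents.

    Hypothetical model: 'backends' maps an array name to its raw capacity in MB.
    """

    def __init__(self, backends, extent_mb=1024):
        self.extent_mb = extent_mb
        # Number of free extents left on each backend array.
        self.free = {name: cap // extent_mb for name, cap in backends.items()}

    def allocate_extent(self):
        # Pick the backend with the most free extents so data spreads out.
        name = max(self.free, key=self.free.get)
        if self.free[name] == 0:
            raise RuntimeError("pool exhausted")
        self.free[name] -= 1
        return name


class ThinVolume:
    """A volume that only consumes pool capacity when a region is first written."""

    def __init__(self, pool, size_mb):
        self.pool = pool
        self.size_mb = size_mb   # advertised size, may exceed the pool's capacity
        self.extent_map = {}     # logical extent index -> backend array name

    def write(self, offset_mb):
        if not 0 <= offset_mb < self.size_mb:
            raise ValueError("offset outside of volume")
        idx = offset_mb // self.pool.extent_mb
        if idx not in self.extent_map:
            # First write to this region: allocate a physical extent now.
            self.extent_map[idx] = self.pool.allocate_extent()
        return self.extent_map[idx]

    def allocated_mb(self):
        return len(self.extent_map) * self.pool.extent_mb


if __name__ == "__main__":
    pool = StoragePool({"array_a": 4096, "array_b": 4096})
    vol = ThinVolume(pool, size_mb=100 * 1024)  # 100 GB advertised, 8 GB physical
    for offset in (0, 2048, 50_000):
        print(offset, "->", vol.write(offset))
    print("allocated MB:", vol.allocated_mb())
```

Real products add a lot on top of this (mirroring, tiering, rebalancing, reclaiming freed extents), which is exactly where the additional layers and complexity come from.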
This problem has been targeted by (native) “Hyper-Converged” solutions, which combine computing resources (CPU & RAM) with local storage. They eliminate the central SAN as a component of your datacenter by scaling storage out across all computing (hypervisor) nodes. In my opinion, VMware’s own VSAN technology in particular will play an important role in the SMB market. For SME up to Enterprise customers, VMware is now in a race with the leader Nutanix and with SimpliVity, which have been on the market for quite a while and have already made a name for themselves. That’s why VMware just announced an OEM program called EVO:RAIL, which combines VMware vSphere (VSAN + vCenter Server) with a new interface to simplify deployment. The solution will be provided by partners like Dell, Supermicro and even EMC, which all want a piece of the booming hyper-converged cake. You may ask why this market is going crazy. Simply because those solutions allow you to scale out pretty easily and the setup effort has been reduced to a minimum. However, implementations vary, especially in terms of scalability and features like data reduction, so keep that in mind.
All those solutions have one thing in common: FLASH! No matter which approach it is, all of them integrate Flash in some form to accelerate storage performance, whether as a read cache in entry-level SANs, as a storage tier, or as a Flash-first approach in hyper-converged solutions.
A hyper-converged solution wouldn’t even be possible without Flash, due to the limited number of physical disks available to each node. To achieve a reasonable level of performance, they use Flash devices as read & write cache or as a storage tier. Even though I haven’t used a solution other than VSAN myself, I see a potential performance bottleneck because of the limited number of SSDs/disks per node.
This leads me directly to the so-called “Hybrid-Arrays”, which combine Flash and spinning disks in a central storage array. A good example that comes to mind is Nimble Storage. Thanks to the intelligence they’ve put into their arrays, they can even use slower capacity disks like 7.2k SATA drives by reorganizing the incoming I/Os so that they are written to disk sequentially. And sequential I/O is where spinning disks really shine. Flash in this case is used only as read cache. Other approaches like EMC’s FAST Cache in the VNX series can also use SSDs as write cache. In my opinion this can be more efficient than a classic Storage Tiering approach, simply because hot data ends up on flash much faster, basically right when it’s needed and not only after a scheduled data movement. And as you can see, some arrays can be found in multiple categories since they have evolved over time. Another example would be X-IO’s ISE arrays.
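The trick of turning random writes into sequential ones can be sketched in a few lines of Python. This is only a toy model under my own assumptions and not any vendor’s actual implementation: the class name, the stripe size and the list standing in for the disk are all hypothetical. Incoming writes to arbitrary logical block addresses are buffered and then appended to the disk in one sequential run, while an index remembers where each block ended up.

```python
class WriteCoalescer:
    """Buffers random writes and flushes them to 'disk' as one sequential run."""

    def __init__(self, stripe_blocks=4):
        self.stripe_blocks = stripe_blocks
        self.buffer = []   # (logical block address, data) pairs waiting for a flush
        self.log = []      # stands in for the disk: blocks are only ever appended
        self.index = {}    # logical block address -> position in the log

    def write(self, lba, data):
        self.buffer.append((lba, data))
        if len(self.buffer) >= self.stripe_blocks:
            self.flush()

    def flush(self):
        # All buffered blocks land next to each other, whatever their addresses.
        for lba, data in self.buffer:
            self.index[lba] = len(self.log)
            self.log.append(data)
        self.buffer.clear()

    def read(self, lba):
        # The newest copy may still sit in the buffer, so check it first.
        for buffered_lba, data in reversed(self.buffer):
            if buffered_lba == lba:
                return data
        return self.log[self.index[lba]]


if __name__ == "__main__":
    disk = WriteCoalescer(stripe_blocks=4)
    for lba, data in [(900, b"a"), (7, b"b"), (512, b"c"), (33, b"d")]:
        disk.write(lba, data)
    print(disk.index)  # scattered addresses, neighbouring positions on disk
```

The price for this is that overwritten blocks leave stale copies behind in the log, so a real system needs some form of garbage collection, which this sketch leaves out entirely.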
This brings me to one of my favorite categories out there, the “All Flash Arrays” (AFA), which are packed with just Flash storage (usually off-the-shelf SSDs) to provide even more IOPS. And excuse me if I’m a bit rude here, but a legacy block storage array packed with SSDs is, in my opinion, not an AFA! Simply because the price per SSD is often disproportionately high and a bunch of SSDs can easily drive the storage controllers to their limit. A real AFA should offer a non-blocking architecture, data reduction technologies like de-duplication and compression, and an efficient implementation of RAID optimized for flash. So I’m talking about arrays built specifically for Flash, like Pure Storage’s FA-400 or EMC’s XtremIO series. Gartner recently published a new Magic Quadrant for AFAs, currently led by those two vendors and, in my opinion, closely followed by SolidFire. Those arrays (usually also dual controllers + SSD shelves) can be used to run dedicated workloads which require high IOPS and low latency, or as a storage tier within a Storage Tiering concept. One thing that’s not quite obvious: whereas Pure follows a scale-up approach, EMC and SolidFire are scale-out architectures. I can only recommend the “Tech Field Day” videos on YouTube to get a better understanding of their technologies. Because of the current pricing and the rather low capacity, I don’t see them becoming the only storage within your data center in the near future.
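For those who have never looked at how de-duplication works in principle, here is a small Python sketch. It is a simplified illustration under my own assumptions (fixed 4 KB blocks, SHA-256 fingerprints, everything held in memory, hypothetical class and method names) and does not reflect how Pure, EMC or SolidFire actually implement it. Each block is fingerprinted, and identical blocks are stored only once while the volume just keeps references.

```python
import hashlib


class DedupStore:
    """Stores each unique block once and keeps per-volume references to it."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data (physical storage)
        self.volume = []   # logical block order, as a list of fingerprints

    def write(self, data: bytes):
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in self.blocks:
            # New content: this is the only case that consumes flash capacity.
            self.blocks[fingerprint] = data
        self.volume.append(fingerprint)

    def read(self, block_number: int) -> bytes:
        return self.blocks[self.volume[block_number]]

    def reduction_ratio(self) -> float:
        # Logical blocks written divided by physical blocks actually stored.
        return len(self.volume) / len(self.blocks)


if __name__ == "__main__":
    store = DedupStore()
    for _ in range(3):
        store.write(b"A" * 4096)   # e.g. the same OS block in three VMs
    store.write(b"B" * 4096)
    print("reduction ratio:", store.reduction_ratio())
```

Fewer physical writes also means less wear on the flash cells, which is one reason why data reduction matters even more on an AFA than on spinning disks.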
Last but not least, a completely different approach: “Server Side Caching”. The idea is to use local Flash devices or even RAM inside the host/hypervisor to transparently intercept I/Os on their way down to the storage array and cache them on those local devices. Even though these solutions from PernixData or others like SanDisk and Infinio don’t provide any persistent storage capacity, they have some really big benefits. The most obvious is performance, because all accelerated VMs see real SSD latency no matter what the actual SAN looks like. You are free to choose which SSDs or Flash devices you buy to accelerate your applications. Application performance is decoupled from the actual SAN performance, which makes the decision for a new SAN somewhat easier, since performance is less of a concern. And these solutions scale out easily as you add new hosts.
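As a rough illustration of the interception idea, here is a minimal Python sketch of a host-side cache sitting in front of a slow backend. Everything in it is an assumption made for the example: the class name, the LRU eviction policy and the write-through behavior are my own choices, and a plain dictionary stands in for the SAN. The products mentioned above are far more sophisticated.

```python
from collections import OrderedDict


class HostReadCache:
    """LRU cache on the host that intercepts I/O before it reaches the SAN."""

    def __init__(self, backend, capacity):
        self.backend = backend      # dict-like object standing in for the SAN
        self.capacity = capacity    # number of blocks the local flash can hold
        self.cache = OrderedDict()  # block address -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)  # mark as most recently used
            self.hits += 1
            return self.cache[lba]
        self.misses += 1
        data = self.backend[lba]         # slow path: go down to the array
        self._insert(lba, data)
        return data

    def write(self, lba, data):
        # Write-through (an assumption): the SAN stays the persistent copy.
        self.backend[lba] = data
        self._insert(lba, data)

    def _insert(self, lba, data):
        self.cache[lba] = data
        self.cache.move_to_end(lba)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block


if __name__ == "__main__":
    san = {0: b"a", 1: b"b", 2: b"c"}
    cache = HostReadCache(san, capacity=2)
    for lba in (0, 0, 1, 2, 0):
        cache.read(lba)
    print("hits:", cache.hits, "misses:", cache.misses)
```

Because every write still goes to the array in this sketch, losing the host or its SSD doesn’t lose data, which matches the point that these solutions add performance but no persistent capacity of their own.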
There would be so much more to say about the individual solutions, but this post is only meant to give you an overview of the current storage market. One thing that hopefully becomes clear after reading it is that there is no final answer to the question of which approach is the holy grail of storage technology. Customer sizes, requirements, budgets, etc. are simply too different to give an absolute answer. And in my opinion it is a positive thing to have so many solutions to choose from!
One last thing. I’ve covered the major approaches and mentioned just a handful of vendors in this post. No doubt there are many more interesting solutions, like Dell’s Compellent, HP’s 3PAR or startups like Coho Data. Maybe I’ll find a way to provide you with an overview of all of them.