As the SE responsible for the DACH region, I get in contact with quite a lot of different customers of all sizes, and I can confirm for sure that one thing absolutely stands out: how widespread synchronously mirrored storage systems are in this region.
This type of design offers a couple of valid advantages, like high availability across “sites” (if done right), and it serves most enterprises as the foundation of their Business Continuity strategy.
But at the same time it also introduces a couple of challenges, because it’s not just about the storage alone. It is also about proper networking, firewalling & routing, resource management, etc. That’s actually enough content to write a book about, which is why I’ll dive deeper into the storage side of things for now.
There are two big challenges associated with such a design on the storage side: performance and economics. As you can imagine, both are tightly coupled.
I admit there is no way to get around the storage virtualization layer, but I contend there is definitely room for more efficiency when it comes to the overall storage design.
No matter if you go Hybrid or All-Flash, the cost of the underlying hardware will double for sure (not to speak of other things like maintenance costs, power & cooling, etc.), but other areas like usable capacity or performance will not.
But is there a way to create a more efficient storage design?
Based on the feedback I get, I can also tell that the era of the All-Flash-Datacenter is still in its early days, so a common practice is to choose a hybrid model with just a fraction of Flash compared to the total capacity. And it actually doesn’t matter if we talk about a handful of Flash devices or All-Flash-Arrays, the problem is pretty much the same.
There will always be a gap between the summed performance of the individual Flash devices added to a storage system and the net performance the storage system is able to deliver. There are various reasons for that: limited CPU resources, data services, and RAID overhead, for instance. And now we add even more overhead to the mix.
The first performance loss comes into play when protecting the local SSDs with some form of RAID in every site. Depending on a couple of factors, like the type of array or the number of SSDs, RAID10 is often used over RAID5. So let’s get more precise. I’ll use 4 x 800GB Intel DC S3610 per site for all of my examples.
This not only results in a performance loss, but also in just 50% or 1.6TB of usable Flash capacity.
Assuming a 70/30 read/write ratio, that would be a theoretical 134,000 IOPS on paper. This of course doesn’t say anything about the latency a VM might observe at the end of the day.
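For those who want to retrace the numbers, here is a rough sketch of the math in Python. The per-device IOPS values are my assumption of the 4K random figures from the vendor data sheet for the 800GB model, and the simple RAID10 model (reads served by all SSDs, writes limited to the number of mirror pairs) is an idealization, not a benchmark result.

```python
# Assumed data sheet figures for one 800GB SSD (4K random I/O).
# These are illustrative inputs, not measured values.
SSD_CAPACITY_GB = 800
SSD_READ_IOPS = 84_000
SSD_WRITE_IOPS = 28_000


def raid10_usable_gb(num_ssds: int, capacity_gb: int) -> int:
    """RAID10 mirrors every device, so only half the raw capacity is usable."""
    return num_ssds * capacity_gb // 2


def raid10_mixed_iops(num_ssds: int, read_iops: int, write_iops: int,
                      read_ratio: float) -> float:
    """Idealized mixed-workload IOPS for a RAID10 set.

    Reads can be served by every SSD, while each write has to land on
    both members of a mirror pair, so only num_ssds / 2 devices worth
    of write performance is available.
    """
    total_read = num_ssds * read_iops
    total_write = (num_ssds // 2) * write_iops
    write_ratio = 1.0 - read_ratio
    # Weighted harmonic mean: average time per I/O across the read/write mix.
    return 1.0 / (read_ratio / total_read + write_ratio / total_write)


usable_gb = raid10_usable_gb(4, SSD_CAPACITY_GB)
mixed_iops = raid10_mixed_iops(4, SSD_READ_IOPS, SSD_WRITE_IOPS, 0.70)
print(f"{usable_gb} GB usable, roughly {mixed_iops:,.0f} IOPS at 70/30")
```

With these assumed inputs the result lands in the same ballpark as the paper figure above. Change the read/write ratio or the per-device numbers and you will see how quickly the write side drags the total down.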
Now we add the second site and the storage virtualization layer, and we are still at 1.6TB usable Flash capacity due to the mirroring. At this point the write performance won’t increase. Depending on the storage virtualization layer and the distribution of the workloads across sites, the read performance potentially could.
But a given workload in the form of a virtual machine can only run either in site A or in site B. So in essence, if the total Flash capacity is limited, so is the number of VMs which potentially could be spread across sites to squeeze more read IOPS out of the system.
There are many more factors that could impact the latency in a negative way:
- Is there automated Storage Tiering involved, so that parts of the workloads may get migrated to slower Tiers over time? How long will it take until a block gets promoted back to Flash?
- What impact will competing workloads have on each other when running on the same flash pool?
- What impact will a varying block size have on the overall performance?
- What is the real-world read/write ratio of those workloads?
- How big is the workload’s working set? Does everything really need to be placed on Flash?
- What impact will the fabric have on the overall latency?
- Is there a way to ensure data locality, so that reads don’t have to be issued across sites?
A lot of questions which all too often get answered by bigger boxes, just to make sure things turn out as expected.
STOP, yes, of course there is a more efficient way!
You could leverage the power of a decoupled storage design, which offers some impressive advantages compared to the drawbacks outlined above.
With PernixData FVP in the picture, anyone can leverage a true Scale-Out Architecture to move the Flash part of a storage design as close as possible to the actual workloads, namely directly into the hypervisor. This shortens the I/O path considerably, helps to overcome the previous challenges, and uses Flash resources more efficiently. Not to mention, FVP could also use Host memory instead of Flash devices, but let’s keep it simple for now.
How about 6.4 TB usable Flash capacity in the first place? This not only means being able to accelerate way more workloads simultaneously, but also achieving better Performance Isolation by aligning storage with compute. With the 8 SSDs in my example, you could equip up to 8 physical hosts.
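The capacity comparison between the two designs boils down to simple arithmetic. The following sketch only models raw versus usable Flash capacity for the 8 SSDs of my example. The function names are made up for illustration, and the model deliberately ignores things like formatting overhead or any capacity an acceleration layer might reserve for its own purposes.

```python
SSD_CAPACITY_GB = 800  # one SSD from the example
SSDS_PER_SITE = 4
SITES = 2


def mirrored_array_usable_gb(ssds_per_site: int, capacity_gb: int) -> int:
    """Usable Flash in the synchronously mirrored design.

    RAID10 halves the raw capacity inside each site, and the second site
    only holds a mirror copy, so it adds no usable capacity at all.
    """
    return ssds_per_site * capacity_gb // 2


def server_side_usable_gb(total_ssds: int, capacity_gb: int) -> int:
    """Usable Flash when every SSD sits in a host as an acceleration resource.

    Simplifying assumption: no RAID on the host-side devices, so the
    full raw capacity is available.
    """
    return total_ssds * capacity_gb


total_ssds = SSDS_PER_SITE * SITES
mirrored_gb = mirrored_array_usable_gb(SSDS_PER_SITE, SSD_CAPACITY_GB)
decoupled_gb = server_side_usable_gb(total_ssds, SSD_CAPACITY_GB)

print(f"Mirrored arrays: {mirrored_gb / 1000:.1f} TB usable")
print(f"Server side:     {decoupled_gb / 1000:.1f} TB usable")
print(f"Factor:          {decoupled_gb // mirrored_gb}x")
```

Same hardware, same price tag, but four times the usable Flash capacity under these assumptions.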
Distributing multiple Flash devices across multiple hosts, multiple CPUs, multiple controllers, etc. helps to eliminate more and more bottlenecks compared to a central shared system. In particular, it removes some headaches around multipathing and helps to ensure data locality within a given host.
In terms of Performance this now means that, for the first time, anyone can get close to the summed performance of the individual Flash devices. Given the SSDs from my example, this could easily be in the range of hundreds of thousands of IOPS. But IOPS are not really the interesting part, it is all about the latency a VM (application) experiences.
A server side layer also offers more choices to its users when it comes to choosing an acceleration resource. Here anyone can leverage the latest technologies. What if we replaced the Intel DC S3610 with an Intel DC P3600 NVMe Flash device? Of course the price tag might be a bit higher than that of the SSD, but so is the performance by every measure. This is also crucial if you want to design a modern storage platform with varying block sizes in mind! This approach allows you to select appropriate types of media even for challenging workloads. By the way, to figure out the actual workload behavior, and to be able to make those decisions in the first place, you could simply leverage PernixData Architect.
Due to the off-loading effect FVP has on the backing storage system, no matter if mirrored or not, this also allows for a more efficient backend design by potentially leveraging more capacity-oriented RAID levels. This can further reduce the number of disks, with all the benefits that come along with it.
Now it’s your turn to re-think your or your customer’s future storage design!