Over the past couple of years, vendors and customers alike have figured out that storage for Virtual Desktop Infrastructure (VDI) is a different beast. Hopefully gone are the days when designs called for co-mingling VDI with your other storage needs, unless you are only providing a small number of desktops. There are a lot of storage options available today, and many are marketed specifically to the VDI market. In this article, I'd like to share my thoughts and real-world experiences regarding virtual desktop infrastructure, storage types, and performance methods.
My discussion will be generic on purpose. I don’t intend to sway anyone’s decision for or against any particular storage vendor. My aim is to educate people on what they should be considering when purchasing or evaluating storage for virtual desktops.
Classic enterprise storage: These are the big-iron, general-purpose enterprise arrays. They can often serve up multiple storage protocols such as Fibre Channel, iSCSI, NFS, and CIFS. Traditionally they group disks into RAID groups, and more recently they pool disks of similar types and carve those pools into tiers.
Hybrid storage: This type of storage was introduced within the past couple of years. It mixes flash storage (SSD) with slower SATA disks, but it typically manages that mix in a more modern way than the classic enterprise arrays can, usually by treating the flash as a large cache or automated tier under software written with flash in mind.
Flash storage: An all-flash array is one that was designed for flash from the start, so its operating system was built to take full advantage of the performance characteristics that SSD offers. While you could in theory purchase an enterprise array and fill it with nothing but flash drives, that would not be equal to a modern array designed for flash from the ground up.
Caching: All of the storage types discussed above do some form of caching, and some offer more than one kind in the same array. Some things to think about when it comes to caching: Does the array cache both reads and writes, or is it read cache only? That answer tells you how, and for what, the cache layer can actually be used. Also, is the cache global, or can you turn it on and off for specific LUNs or groups of storage? Both are important considerations when designing and sizing storage for VDI.
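To see why the read/write question matters, here is a back-of-the-napkin sketch (not any vendor's math) of how the cache hit ratio drives effective read latency. The latency figures are assumptions I picked for illustration:

```python
# Minimal sketch: effective read latency as a blend of cache and disk latency.
# The 0.5 ms cache and 8 ms disk figures are illustrative assumptions only.

def effective_read_latency_ms(hit_ratio, cache_ms=0.5, disk_ms=8.0):
    """Blend cache latency and back-end disk latency by hit ratio."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms

for hit in (0.50, 0.80, 0.95):
    print(f"{hit:.0%} hits -> {effective_read_latency_ms(hit):.2f} ms average read")
```

Notice that if the cache is read-only, this arithmetic only helps the read side of the workload; the write-heavy phases of VDI land on the back-end disks regardless.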
Storage tiering: This is typically found in the classic enterprise storage arrays. It's the practice of creating pools or RAID groups of disks that offer a specific performance profile. For example, I might create a pool of 15K Fibre Channel disks and use it as a performance tier, manually placing the workloads that need that level of performance on it. Then I might create a capacity tier made up of SATA drives. This is not a horrible method, but it requires you to deeply understand your requirements so that you satisfy both the capacity and the performance needs of your workload. It also offers a level of predictability, because you can count on the known performance of a group of disks rather than relying on the promises of modern storage techniques, which can at times be a mystery.
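To make the "deeply understand your requirements" point concrete, here is a rough spindle-count sketch using commonly cited rules of thumb (roughly 180 IOPS per 15K drive, RAID write penalties of 2 for RAID 10 and 4 for RAID 5). Treat every number here as an assumption to be validated, not a guarantee:

```python
# Rule-of-thumb sizing: back-end IOPS = reads + writes * RAID write penalty.
# Per-drive IOPS and the workload mix below are assumptions, not measurements.
import math

def drives_needed(front_end_iops, write_pct, raid_penalty, per_drive_iops=180):
    reads = front_end_iops * (1 - write_pct)
    writes = front_end_iops * write_pct
    backend_iops = reads + writes * raid_penalty
    return math.ceil(backend_iops / per_drive_iops)

# Hypothetical pool: 500 desktops at 20 IOPS each, 70% writes.
print(drives_needed(500 * 20, 0.70, raid_penalty=2))  # RAID 10 -> 95 drives
print(drives_needed(500 * 20, 0.70, raid_penalty=4))  # RAID 5  -> 173 drives
```

The gap between those two answers is exactly why the capacity math and the performance math have to be done separately before you commit to a tier design.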
Automated storage tiering: This can be implemented very differently in the modern hybrid and all-flash arrays versus the classic enterprise arrays. The idea of auto-tiering is that data moves between performance tiers depending on whether it needs more or less performance: data that needs performance gets access to the small, fast disks, while slow workloads fall down to the large, slow-performing ones. But how the different storage types accomplish this varies widely, as the sketch below suggests.
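Conceptually, most implementations boil down to something like the loop below: track per-extent I/O "heat" over a window, then promote the hottest extents into the limited fast tier. This is a toy model of the general idea, not any particular vendor's algorithm, and the extent count is an assumption:

```python
# Toy model of automated tiering: count I/O per extent between evaluation
# runs, then give fast-tier slots to the hottest extents.
from collections import Counter

FAST_TIER_EXTENTS = 100  # assumed capacity of the fast tier, in extents

def rebalance(io_counts: Counter) -> set:
    """Return the set of extent IDs that earn a slot in the fast tier."""
    hottest = io_counts.most_common(FAST_TIER_EXTENTS)
    return {extent for extent, _ in hottest}

# Example window of activity; in a real array this accumulates continuously.
print(rebalance(Counter({"ext-17": 9400, "ext-02": 120, "ext-88": 6})))
```

The design question that separates implementations, and the one the rest of this post turns on, is how often that rebalance loop runs and at what granularity the heat is tracked.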
While I believe that you can successfully design any of the storage types discussed here for VDI, you must fully understand your needs to make sure the storage will deliver and not fail. The needs of VDI can at times be heavy on reads and then shift to almost all writes. Demand for I/O is also very spiky: a desktop might be using 15 IOPS one minute, and then the user does something more demanding and requires a few hundred. So you must design a solution that provides both capacity and I/O while also handling the fluctuating demands of this type of workload.
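To put rough numbers on how spiky this gets in aggregate, here is a minimal sketch assuming a 500-desktop pool where only 10% of users spike at once; all of the figures are illustrative assumptions:

```python
# Illustrative only: steady-state vs. a partial spike across a desktop pool.
desktops = 500
steady_iops = 15          # per-desktop steady state
spike_iops = 300          # "a few hundred IOPS" during heavy activity
spiking_fraction = 0.10   # assume 10% of users spike simultaneously

steady_total = desktops * steady_iops
spike_total = (desktops * (1 - spiking_fraction) * steady_iops
               + desktops * spiking_fraction * spike_iops)
print(steady_total, spike_total)  # 7500 vs. 21750: nearly 3x in aggregate
```

Even with 90% of the pool idling along, the aggregate demand nearly triples, and sizing only for the steady state leaves that spike with nowhere to go.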
To finally get back to the title of this post, I cannot emphasize enough: look at how a storage array caches and promotes data between tiers. Because VDI performance can change rapidly, I have seen that you cannot rely on an array that does not cache or promote data in real time. If you have to wait minutes or hours for data to be evaluated for promotion to a better-performing tier, you are likely to miss your performance need; your data will be stuck on whatever tier of disk it resides on at the time of the increased demand.
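A quick sketch makes the timing argument plain. Assuming a spike of a given length and a tiering engine with a given promotion delay (both numbers hypothetical), the fraction of the spike stranded on the slow tier is simply:

```python
# Fraction of a spike served from the slow tier, given promotion delay.
# Both inputs are hypothetical; the point is the ratio, not the values.
def slow_tier_fraction(spike_minutes, promotion_delay_minutes):
    missed = min(spike_minutes, promotion_delay_minutes)
    return missed / spike_minutes

print(slow_tier_fraction(10, 60))   # 1.0  -> hourly evaluation misses it all
print(slow_tier_fraction(10, 0.1))  # 0.01 -> near-real-time catches almost all
```

A ten-minute spike against an hourly evaluation window is over before the data ever moves, which is the failure mode I keep seeing in the field.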
This does not mean the performance will suck; that all depends on the design and the number of disks in the array. But the promise is often that you can build capacity with slower disks and supplement them with a few fast ones. That works well for many traditional server workloads, but it just does not fly in the VDI world.
To wrap things up, I recommend that you compare your leading option against other offerings, and if you are dealing with a vendor directly, talk to your preferred partner for a sanity check. It also never hurts to seek feedback from other customers and to ask for references with similar use cases.