Reduce VDI costs by Enabling Data Deduplication on Windows Server 2012 R2
Virtual desktop infrastructure (VDI) might not be the panacea that Gartner, Forrester and their ilk preached it to be five years ago, but VDI still has a valuable role to play. Unfortunately, VDI is very expensive, and that cost increases when we need to deploy personal (non-pooled) virtual machines. In this article, I will show you how you can reduce that cost by enabling deduplication on Windows Server 2012 R2 storage to reduce the cost of providing fast storage for this kind of VDI virtual machine.
VDI Virtual Machine Options: Shared and Dedicated
There are two kinds of virtual machines in a VDI. The most economical kind is a pooled (or shared) virtual machine. In the Microsoft world, this solution is based on a template VHDX file with virtual machines being created using differential disks that use the template as a parent. We can improve the performance of this solution by pinning the template VHDX on an SSD tier of storage. And considering that each differential disk is going to be small, we might even be able to stretch the budget a little to store them on tiered storage or on an SSD-only volume, too.
The second kind of VDI virtual machine is a personal or dedicated virtual machine. Every time a user is provided with one of these, a whole new fixed or dynamic VHDX file is created from a template. This gives the user their own, long-lasting virtual machine, but it comes with a steep price — storage is not cheap and personal VDI virtual machines consume lots of storage.
Windows Server Deduplication
Windows Server 2012 (WS2012) introduced built-in deduplication for NTFS data volumes (not system/OS volumes or ReFS). This feature proved to be excellent at optimization of storage usage of archive content, but it was not supported for production data. In Windows Server 2012 R2, Microsoft added a new feature to solve the above personal VDI storage challenge, where we can now enable deduplication for volumes that will only store personal VDI virtual machines.
The obvious benefit is that we reduce the storage cost of personal VDI virtual machines; the chunking system of deduplication will only store unique blocks once on the volume. This might allow you to reconsider the type of physical disk you will use for personal virtual machines. Maybe you can now afford to store them on SSDs because of the reduced capacity utilization?
One might think that using deduplication would have a negative effect on storage performance. And that’s a big concern for VDI engineers because of the boot storm, which is the name given to the phenomenon of large numbers of users arriving at work in the morning and trying to log in. This in turn hammers the VDI storage system and can cause long delays in the office productivity. However, Windows Server uses an in-memory cache. When chunks are read from the on-disk deduplication store when a virtual machine is starting up, they are cached in memory. Subsequent reads don’t need to touch the chunk store as much because they are intercepted by the cache — memory is much faster than disk so the effects of the boot storm are minimized.
Implementing VDI Deduplication
There are a few requirements to implement this solution:
- Compute (Hyper-V) and storage must not be converged. In other words, you are going to be using a tier of Hyper-V hosts and a tier of Windows Server storage, such as a Scale-Out File Server (SAN or Storage Spaces).
- The storage servers must be running Windows Server 2012 R2.
- The virtual machines must be personal VDI virtual machines. This is not supported for other kinds of virtual (VDI or non-VDI) virtual machines, even if it does work.
Use PowerShell to Enable Deduplication
You will implement deduplication using PowerShell and not Server Manager.
1. Enable deduplication on the file server(s). This example enables deduplication for Hyper-V (VDI only, as above) on a Cluster Shared Volume in a SOFS cluster:
Enable-DedupVolume C:\ClusterStorage\Volume1 –UsageType HyperV
2. Deduplication normally only works with aged files at rest for 3 days. We override this by running the following snippet:
Set-DedupVolume C:\ClusterStorage\Volume1 –OptimizePartialFiles
3. You then start to deploy VDI virtual machines onto the volume, as you do, you will need to manually optimize the volume, which is normally done automatically on a scheduled basis. You can run an optimization job with this PowerShell example:
Start-DedupJob C:\ClusterStorage\Volume1 –Type Optimization
4. After that, make sure you monitor the health and effectiveness of deduplication using Get-DedupVolume and Get-DedupStatus.