Utilizing SAN Storage with Windows Failover Clusters

Storage Area Networks (SANs) are well suited to supporting clustering technologies. As you may know, clustering is the practice of connecting several servers to the same shared disk storage. This allows multiple servers to access the storage in a coordinated fashion and offers fault tolerance, so that a single server malfunction does not become a single point of failure.

Clustering technology has been around for over 25 years; OpenVMS clusters from Digital Equipment Corporation are an early example. Microsoft first introduced Windows clustering in Windows NT 4.0 under the code name Wolfpack. These early clusters allowed two servers, or "nodes," to access shared storage over parallel SCSI or SAN-based Fibre Channel connections. Today, Windows Server 2008 R2 allows up to 16 servers to be configured in a failover cluster with access to hundreds of terabytes of data.

SAN-Based Clusters

As previously discussed in the article "Exploring Windows Storage Technologies: DAS, NAS and SAN-Based Solutions", SAN-based configurations lend themselves to shared disk access. In a typical SAN-based cluster, two or more servers are connected by Fibre Channel cables to SAN switches. Multiple SAN switches are used so that the fabric remains available should one of the switches fail. Storage controllers are also connected to the SAN switches, attaching the disk arrays to the SAN. The following diagram illustrates a typical SAN-based cluster with redundant paths.

SAN cluster with redundant paths
Diagram 1: SAN-Based Cluster with Redundant Paths
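To make the idea of redundant paths concrete, here is a minimal Python sketch, not tied to any vendor's multipath software, of how I/O can fail over from one fabric path to another when a switch goes down. The class and path names are invented purely for illustration.

# Hypothetical sketch of failover between two redundant fabric paths.
# Real environments use MPIO/DSM software; names here are illustrative only.

class FabricPath:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def send_io(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{request} completed via {self.name}"

class MultipathDevice:
    """Routes I/O over the first healthy path, hiding single-switch failures."""
    def __init__(self, paths):
        self.paths = paths

    def send_io(self, request):
        for path in self.paths:
            try:
                return path.send_io(request)
            except ConnectionError:
                continue  # try the next redundant path
        raise RuntimeError("all paths to the LUN have failed")

lun = MultipathDevice([FabricPath("HBA1->SwitchA->ControllerA"),
                       FabricPath("HBA2->SwitchB->ControllerB")])
print(lun.send_io("read block 42"))   # serviced over Switch A
lun.paths[0].healthy = False          # simulate a Switch A failure
print(lun.send_io("read block 42"))   # transparently retried over Switch B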

Storage Considerations

When configuring disk drives for use in a SAN-based Windows cluster, several requirements must be met. Only basic disks are supported; dynamic disks are not. Disks must be formatted with the NTFS file system and may use either the MBR or GPT partition style. Finally, the shared disks must be made accessible to the cluster members through LUN masking on the storage controllers, which establishes which LUNs can be accessed by which servers.
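As a quick way to keep these requirements straight, the following sketch checks a candidate disk against the rules above. It is purely illustrative; the function name and string attributes are hypothetical, not a Windows API.

# Illustrative checklist for cluster disk requirements; not a real Windows API.

def validate_cluster_disk(disk_type, file_system, partition_style):
    """Return a list of problems that would keep a disk out of the cluster."""
    problems = []
    if disk_type.lower() != "basic":
        problems.append("disk must be basic; dynamic disks are not supported")
    if file_system.upper() != "NTFS":
        problems.append("disk must be formatted with NTFS")
    if partition_style.upper() not in ("MBR", "GPT"):
        problems.append("partition style must be MBR or GPT")
    return problems

print(validate_cluster_disk("dynamic", "NTFS", "GPT"))
# ['disk must be basic; dynamic disks are not supported']
print(validate_cluster_disk("basic", "NTFS", "MBR"))
# []  -> meets the storage requirements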

In addition to LUN masking, it is a recommended best practice to establish a dedicated SAN zone for each Windows cluster. Zoning is configured on the SAN switches and allows only the specified servers and storage controllers to communicate within a logical zone. Isolating a cluster's SAN I/O traffic in its own zone keeps it from competing with traffic from other zones, reducing overall I/O latency.
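The short sketch below models the effect of zoning: an initiator and a target can exchange I/O only if they are members of the same zone. The zone and port names are invented for illustration; real zones are defined on the switches using the ports' WWPNs.

# Simplified model of switch-based zoning; zone and port names are made up.

zones = {
    "Cluster1_Zone": {"server1_hba", "server2_hba", "controllerA_port"},
    "Cluster2_Zone": {"server3_hba", "controllerB_port"},
}

def can_communicate(initiator, target):
    """Two ports may exchange I/O only if some zone contains both of them."""
    return any(initiator in members and target in members
               for members in zones.values())

print(can_communicate("server1_hba", "controllerA_port"))  # True, same zone
print(can_communicate("server1_hba", "controllerB_port"))  # False, zoned apart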

Synchronizing Disk Access

As you can imagine, multiple servers accessing the same shared storage must do so in an orderly fashion. If multiple servers try to access the same data at the same time in an uncoordinated way, the result is disk corruption. There are two schools of thought on how to coordinate shared disk access: the "shared everything" model and the "shared nothing" model.

The "shared everything" model allows all servers to access all of the shared disk drives simultaneously. This is accomplished with "distributed lock manager" software, which coordinates the locking of files and records on disk. Only one server at a time can hold an exclusive write-mode lock on a file, which prevents other nodes from writing to it concurrently. While the distributed lock manager adds overhead, the model scales well as the number of servers in the cluster grows.
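A distributed lock manager is essentially a cluster-wide lock table. The toy sketch below, which is in-memory only, supports exclusive write locks only, and uses hypothetical names, shows the core idea of granting a write lock on a file to one node at a time.

# Toy "shared everything" lock manager: one exclusive write lock per resource.
# A real DLM is distributed across nodes and supports many lock modes.

class DistributedLockManager:
    def __init__(self):
        self.write_locks = {}   # resource name -> owning node

    def acquire_write(self, node, resource):
        owner = self.write_locks.get(resource)
        if owner is None:
            self.write_locks[resource] = node
            return True
        return owner == node    # already held by this node, otherwise refused

    def release(self, node, resource):
        if self.write_locks.get(resource) == node:
            del self.write_locks[resource]

dlm = DistributedLockManager()
print(dlm.acquire_write("Server1", "payroll.dat"))  # True, lock granted
print(dlm.acquire_write("Server2", "payroll.dat"))  # False, Server1 holds it
dlm.release("Server1", "payroll.dat")
print(dlm.acquire_write("Server2", "payroll.dat"))  # True, now available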

In contrast, Windows clusters use the "shared nothing" model to synchronize access to storage. Only one server can own a particular shared disk at a time, which prevents other nodes from writing to the disk while the owning node manipulates its data. Other servers can own their own disks, but no two nodes can ever own the same disk simultaneously; hence the name "shared nothing." The following diagram illustrates the shared nothing model, with Server1 owning Disk1 and Server2 owning Disk2.

Shared nothing model in Windows failover clustering
Diagram 2: “Shared Nothing Model”
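To contrast with the lock-manager approach, this toy sketch (all names are hypothetical) models shared nothing ownership: a write is accepted only from the node that currently owns the disk, and during failover the ownership of the whole disk moves to another node.

# Toy "shared nothing" model: each disk has exactly one owning node at a time.

disk_owner = {"Disk1": "Server1", "Disk2": "Server2"}

def write(node, disk, data):
    if disk_owner.get(disk) != node:
        raise PermissionError(f"{node} does not own {disk}; write rejected")
    print(f"{node} wrote {data!r} to {disk}")

write("Server1", "Disk1", "order #1001")      # allowed, Server1 owns Disk1
try:
    write("Server2", "Disk1", "order #1002")  # rejected, wrong owner
except PermissionError as err:
    print(err)

disk_owner["Disk1"] = "Server2"               # failover: ownership moves whole
write("Server2", "Disk1", "order #1002")      # now allowed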

Under the hood, Windows synchronizes disk access in one of two ways, depending on the version of the operating system. Windows Server 2003 and earlier use a Challenge/Defense strategy to ensure that only one node owns a disk at a time. This is accomplished by issuing SCSI commands to reserve, release, or reset the LUN or bus.
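The following sketch is a greatly simplified model of the challenge/defense idea, not the actual cluster service code: a challenger clears the existing reservation with a reset, and the current owner, if healthy, defends by re-reserving before the challenger can claim the disk. The timing windows of the real algorithm are omitted and all names are illustrative.

# Simplified challenge/defense arbitration (Windows Server 2003 era and earlier).
# Method names and flow are illustrative, not the actual cluster service code.

class SharedLun:
    def __init__(self):
        self.reserved_by = None

    def reserve(self, node):
        if self.reserved_by in (None, node):
            self.reserved_by = node
            return True
        return False

    def bus_reset(self):
        self.reserved_by = None     # a reset clears any SCSI reservation

def challenge(lun, challenger, owner, owner_alive):
    lun.bus_reset()                 # the reset breaks the existing reservation
    if owner_alive:
        lun.reserve(owner)          # a healthy owner defends by re-reserving
    return lun.reserve(challenger)  # challenger wins only if nobody defended

lun = SharedLun()
lun.reserve("Server1")
print(challenge(lun, "Server2", "Server1", owner_alive=True))   # False, defended
print(challenge(lun, "Server2", "Server1", owner_alive=False))  # True, takeover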

For Windows Server 2008 and later, the cluster software instead relies on SCSI-3 persistent reservations maintained by the storage controllers. Persistent reservations are less disruptive on the SAN because they eliminate the SCSI bus resets required by the Challenge/Defense mechanism. However, not all storage controllers support SCSI-3 persistent reservations, so verify that yours does. The following diagram illustrates the Windows Server 2008 disk storage architecture used with clusters.

2008 Failover Clustering and SAN
Diagram 3: Windows 2008 Disk Storage Architecture Used with Clusters
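As a rough model of why persistent reservations are less disruptive, the sketch below shows each node registering a key with the LUN, one registrant holding the reservation, and a surviving node preempting a failed owner's key without resetting the bus. The keys and method names are invented for illustration and only loosely follow SCSI-3 semantics.

# Simplified SCSI-3 persistent reservation semantics; keys and names are invented.

class PersistentReservationLun:
    def __init__(self):
        self.registrations = set()   # keys of nodes registered with the LUN
        self.reservation_key = None  # key of the current reservation holder

    def register(self, key):
        self.registrations.add(key)

    def reserve(self, key):
        if key in self.registrations and self.reservation_key is None:
            self.reservation_key = key
            return True
        return False

    def preempt(self, key, victim_key):
        """A registered node removes a failed owner's key; no bus reset needed."""
        if key in self.registrations:
            self.registrations.discard(victim_key)
            if self.reservation_key == victim_key:
                self.reservation_key = key
            return True
        return False

lun = PersistentReservationLun()
lun.register(0x1111)                 # Server1 registers its key
lun.register(0x2222)                 # Server2 registers its key
print(lun.reserve(0x1111))           # True: Server1 holds the reservation
print(lun.preempt(0x2222, 0x1111))   # Server2 preempts after Server1 fails
print(hex(lun.reservation_key))      # 0x2222 now owns the disk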

SANs offer very high throughput over Fibre Channel connections and protect data with RAID technologies. Every component in a SAN can be made redundant so that the failure of a single device does not interrupt service. All of this comes at a cost, however, so be sure to know your options: the related article "Windows Failover Cluster and iSCSI Technology" covers iSCSI-based clusters, which offer similar functionality at a lower cost. Utilizing SANs with Windows Failover Clusters can provide a robust platform for your mission-critical applications, and with Windows Server 2008, Failover Clustering has been greatly enhanced.