Planning a DFS Architecture, Part 3

Note: This article is a follow-up to Planning a DFS Architecture Part One and Part Two.

Although Windows Server 2008 improves upon DFS technology, DFS has been around for quite a while, and I have learned quite a bit over the years about planning for DFS replication. I'm not talking about the replication topology itself, although that is important. What I'm talking about are the little things that make the difference between replication performing well and DFS running amok. In this article, I want to wrap up the series by sharing with you some best practices for DFS replication.

Backup Strategy

Just because the files stored on a DFS tree are being replicated to other servers does not mean that you don't have to back them up. Having DFS replicas on other servers helps to protect the data against a catastrophic hard drive failure, but does nothing to protect against data corruption. If a file were to become corrupted, the corruption would likely be replicated to the other targets.

Because the data should be identical on each DFS replica, you can usually get away with backing up only one of the replicas. One important thing to keep in mind about the backup process, though, is that you should configure your backup software not to update the archive bit. The reason for this is that file replication is triggered by a file version change or a modified date and time stamp. As such, there is a chance that updating the archive bit could trigger a mass replication. This doesn't happen in every case (or at least it hasn't for me anyway), so you may want to experiment to see whether the archive bit has any effect on your environment.
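One way to run that experiment is to record which files have the archive bit set before and after a backup run and compare the two lists. Here is a minimal Python sketch of that idea; the folder path is a hypothetical example, and it relies on the Windows-only st_file_attributes field that Python exposes on NTFS volumes:

```python
import os
import stat

def files_with_archive_bit(root):
    """Yield files under root whose archive bit is currently set (Windows only)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                attrs = os.stat(path).st_file_attributes  # Windows-only stat field
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if attrs & stat.FILE_ATTRIBUTE_ARCHIVE:
                yield path

if __name__ == "__main__":
    # Hypothetical replicated folder -- substitute one of your own targets.
    for path in files_with_archive_bit(r"D:\DFSRoots\Public"):
        print(path)
```

Run it once before the backup and once after; if the second run returns far fewer files, your backup software is clearing the archive bit.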

Disk Space

This one may seem obvious, but I have seen cases in which the drive containing the staging folder is either ridiculously small or low on space. The drive containing the staging folder has to have enough free space to accommodate the replication process. After all, it acts as a temporary repository for replicated data that is being sent or received.
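A quick periodic check of the staging volume can catch this before replication stalls. Below is a minimal Python sketch; the staging path and the 10 GB threshold are assumptions for illustration, so adjust both for your environment:

```python
import shutil

# Hypothetical staging folder location and minimum free space -- adjust for your environment.
STAGING_PATH = r"E:\DfsrPrivate\Staging"
MIN_FREE_BYTES = 10 * 1024**3  # 10 GB

def staging_space_ok(path=STAGING_PATH, min_free=MIN_FREE_BYTES):
    """Return True if the volume hosting the staging folder has enough free space."""
    usage = shutil.disk_usage(path)  # reports totals for the volume containing 'path'
    print(f"Total: {usage.total / 1024**3:.1f} GB, free: {usage.free / 1024**3:.1f} GB")
    return usage.free >= min_free

if __name__ == "__main__":
    if not staging_space_ok():
        print("Warning: staging volume is low on space; replication may stall.")
```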

The DFS Root

There are several things to consider when planning your DFS root. I recommend starting with an empty DFS root so that you can avoid replicating any data at the root level. The DFS root should only contain folders that are managed by DFS.
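An easy way to verify that the root has stayed clean over time is to list anything in it that is not a folder. The sketch below does exactly that; the root path shown is a hypothetical example:

```python
import os

# Hypothetical namespace root path -- substitute your own DFS root.
DFS_ROOT = r"C:\DFSRoots\Public"

def loose_files_in_root(root=DFS_ROOT):
    """Return any entries in the DFS root that are not folders."""
    with os.scandir(root) as entries:
        return [entry.name for entry in entries if not entry.is_dir()]

if __name__ == "__main__":
    strays = loose_files_in_root()
    if strays:
        print("Files found at the root level (the root should contain only folders):")
        for name in strays:
            print(" ", name)
    else:
        print("DFS root is clean.")
```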

I also recommend that you avoid replicating data between DFS namespace root folders. The reason for this is that doing so causes Windows to try to replicate not only the root, but also the target folders within it. While this may not sound like such a bad thing, keep in mind that in most cases the target folders are already replicating independently of the root. Setting up replication at the root level therefore does not add an extra level of redundancy.

Decide Whether or Not Replication is Appropriate

Although DFS replication can help you distribute the client workload between multiple file servers and provides you with a level of fault tolerance, using DFS replication is not always desirable. For example, imagine an environment in which users are constantly making changes to data. In such an environment, every update to a file would change the file's version number, which would trigger DFS replication. If an excessive number of updates are being made, the result could be a replication storm.

Replication storms are avoidable, because Windows Server 2008 allows you to limit the amount of bandwidth consumed by the replication process. The problem with limiting bandwidth, though, is that if DFS replication has insufficient bandwidth, then replicas may not be immediately synchronized, which can lead to version conflicts.

Typically, the best candidates for DFS replication are environments in which the users read a lot of data from the file servers, but do not make a lot of changes. In these types of environments, the replication workload is minimal because replication only needs to occur when updates occur.
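If you are not sure how much churn a folder actually sees, you can get a rough feel for it by counting how many files were modified within a recent window. This Python sketch compares recently modified files against the total; the path and the 24-hour window are assumptions for illustration:

```python
import os
import time

# Hypothetical replicated folder and measurement window -- adjust for your environment.
TARGET = r"D:\Shares\Engineering"
WINDOW_SECONDS = 24 * 60 * 60  # look at the last 24 hours

def churn_ratio(root=TARGET, window=WINDOW_SECONDS):
    """Return (modified, total): files changed within the window versus all files."""
    cutoff = time.time() - window
    total = modified = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                mtime = os.stat(os.path.join(dirpath, name)).st_mtime
            except OSError:
                continue  # skip files that vanish mid-scan
            total += 1
            if mtime >= cutoff:
                modified += 1
    return modified, total

if __name__ == "__main__":
    changed, total = churn_ratio()
    if total:
        print(f"{changed} of {total} files ({100 * changed / total:.1f}%) "
              "changed in the last 24 hours.")
```

A low percentage suggests a read-heavy workload that is a good fit for replication; a high percentage suggests you should think carefully about bandwidth throttling and scheduling.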

If your users do constantly update files, then you might consider setting a replication schedule that performs the majority of the replication operations during off-peak hours. Once again though, this can lead to version conflicts if two separate instances of a file are each updated before replication can take place. The point is that before you decide on a replication strategy, you really need to give some serious thought as to whether the strategy you have chosen makes sense for your individual company's needs.

Conclusion

In this article, I have talked about some things that you can do to ensure that DFS replication occurs smoothly and without being disruptive to your network. Keep in mind though, that these are just best practices, and that there are other issues that could potentially be disruptive to the DFS replication process.

Got a question? Post it on our Windows Server 2008 forums!