The Benefits of Creating Multiple Storage Groups, Part 1

When it comes to a product like Exchange Server, there are recommended best practices for practically everything. One particular area where I have seen a lot of contradictory recommendations is storage group and database usage. That being the case, I want to take the opportunity to explore this issue from a practical standpoint.

Basic Architecture

Most Exchange administrators are probably already familiar with storage group architecture, but I want to quickly go over the basics to make sure that we are all on the same page. Essentially, every database must be placed in a storage group. You can, however, place multiple databases into a single storage group (the exact number depends on the version and edition of Exchange). The transaction logs are bound to the storage group rather than to an individual database. Therefore, if a storage group contains multiple databases, all of those databases share a common set of transaction logs.
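The relationship described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not how Exchange is implemented: the class name, method names, and database names are all made up for the example. The only point it demonstrates is that changes to any database in a storage group end up in the same log sequence.

```python
class StorageGroup:
    """Toy model: one storage group, many databases, one shared log sequence."""

    def __init__(self, name):
        self.name = name
        self.databases = set()
        # A single list stands in for the group's transaction logs.
        # It is shared by every database in the group (hypothetical model).
        self.transaction_log = []

    def add_database(self, db_name):
        self.databases.add(db_name)

    def record_change(self, db_name, description):
        # A change to any database is written to the group's shared log.
        if db_name not in self.databases:
            raise KeyError(f"{db_name} is not in storage group {self.name}")
        self.transaction_log.append((db_name, description))


if __name__ == "__main__":
    sg = StorageGroup("SG1")
    sg.add_database("Sales")
    sg.add_database("Engineering")
    sg.record_change("Sales", "new message delivered")
    sg.record_change("Engineering", "mailbox item deleted")
    # Both databases appear in the same log sequence.
    print(sg.transaction_log)
```

In this model there is no way to give the Sales database its own log; the only way to get a separate set of logs is to create a second storage group.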

Design Considerations

Now that I have quickly gone over the basics, I want to talk about design considerations. Some of the materials that I have read say that you should place all of your mailboxes into a single database if possible. Other resources will tell you that if you have more than a few dozen mailboxes, then those mailboxes should be distributed among multiple databases. Likewise, some resources will tell you that each database should reside in its own storage group, while other resources tout the benefits of loading up storage groups with multiple databases.

When it comes to the question of whether you should place all of your mailboxes into one database or distribute them across multiple databases, I tend to think that the correct answer really depends on the individual situation. Of course, in some situations you may not have a choice: the sheer number of mailboxes may require you to create multiple mailbox databases.

For the sake of this article, though, let’s assume that you only have a couple hundred mailboxes, which is well within the range of what a mailbox database can comfortably handle. Should you put all of those mailboxes in one database, or spread them across multiple databases? And should those databases all be within the same storage group, or should each database reside in its own storage group?

The first thing that I would recommend considering is the hardware requirements. Microsoft actually lays out guidelines for the maximum number of storage groups on a mailbox server, based on the amount of RAM that is installed in the server. The following table lists the recommendations:

Amount of RAM Installed    Recommended Maximum Number of Storage Groups
2 GB                       2
4 GB                       8
8 GB                       24
12 GB                      40
16+ GB                     50
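If you want to check a planned design against these guidelines, the table translates directly into a small lookup. The following Python sketch is only an illustration. The function name is made up, and it makes one assumption that the table does not state: a server whose RAM falls between two rows is treated as belonging to the lower row. Amounts below 2 GB are not covered by the table, so the function returns None for them.

```python
import bisect

# (installed RAM in GB, recommended maximum number of storage groups),
# copied from the table above.
RAM_TIERS = [(2, 2), (4, 8), (8, 24), (12, 40), (16, 50)]


def max_storage_groups(ram_gb):
    """Return the recommended maximum for the highest row that ram_gb reaches.

    Assumption: RAM amounts between two rows use the lower row.
    Returns None below 2 GB, which the table does not cover.
    """
    thresholds = [ram for ram, _ in RAM_TIERS]
    index = bisect.bisect_right(thresholds, ram_gb) - 1
    if index < 0:
        return None
    return RAM_TIERS[index][1]


if __name__ == "__main__":
    for ram in (2, 6, 12, 32):
        print(ram, "GB ->", max_storage_groups(ram))
```

Keep in mind that these figures are upper limits for a given amount of memory, not targets; a server with 8 GB of RAM does not need 24 storage groups.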

Another issue that you must consider is the disk subsystem. As I said before, all of the databases within a storage group share a common set of transaction logs. Microsoft recommends that the transaction logs be stored on separate disks from the databases for both performance and fault tolerance reasons.

Let’s pretend that you decided to create two separate databases and you placed those databases into two separate storage groups. There is technically nothing stopping you from putting both databases on one volume, and both sets of transaction logs on another volume. Even so, this type of design would completely undermine two of the most important benefits that are associated with dividing the mailboxes into multiple databases.

One of those benefits is performance. Spreading your users among multiple databases results in fewer users in each database, meaning that the server doesn’t have to work as hard to sustain those users. If you group all of your databases onto one volume and all of your transaction logs onto another volume, though, you have not done anything to reduce the I/O requirements on those volumes. If anything, you might have actually hurt the server’s performance, because the server now has to deal with the overhead caused by having multiple storage groups.

The other benefit that is undermined by this type of design is fault isolation. Normally, if you separate your users into multiple databases and one database fails, then only the users whose mailboxes reside in that database are affected. Everyone else can go on working as though nothing has happened. If you group the databases together and the transaction logs together in the way that I described above, though, then a disk failure would affect everyone, even though not all of the mailboxes are in the same database.
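The difference between the two layouts can be made concrete with a short sketch. The Python below is only a model for reasoning about failure domains. The class name, the drive letters, and the database names are all hypothetical, and it makes the simplifying assumption that a database becomes unavailable if either its own volume or its storage group's log volume fails.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroupLayout:
    """Where one storage group keeps its logs and its database files."""
    name: str
    log_volume: str
    # database name -> volume that holds the database file
    databases: dict = field(default_factory=dict)


def affected_databases(groups, failed_volume):
    """Return the names of the databases hit by the loss of failed_volume.

    Simplifying assumption: a database is affected when the failed volume
    holds either the database file or its storage group's transaction logs.
    """
    affected = set()
    for group in groups:
        for db_name, db_volume in group.databases.items():
            if failed_volume in (db_volume, group.log_volume):
                affected.add(db_name)
    return affected


# Layout from the text: both databases on D:, both log sets on E:.
shared = [
    StorageGroupLayout("SG1", log_volume="E:", databases={"DB1": "D:"}),
    StorageGroupLayout("SG2", log_volume="E:", databases={"DB2": "D:"}),
]

# Alternative: each storage group gets its own database and log volumes.
isolated = [
    StorageGroupLayout("SG1", log_volume="E:", databases={"DB1": "D:"}),
    StorageGroupLayout("SG2", log_volume="G:", databases={"DB2": "F:"}),
]

if __name__ == "__main__":
    print(sorted(affected_databases(shared, "D:")))
    print(sorted(affected_databases(isolated, "D:")))
```

With the shared layout, losing either volume takes out both databases. With the isolated layout, any single volume failure touches only one of them, which is the benefit that separate databases are supposed to provide in the first place.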

Conclusion

I have only just begun to explore the issues involved in database and storage group placement. I will continue the discussion in Part 2.