SharePoint Planning: Understanding SQL Disaster Recovery Options

A SharePoint implementation can be a huge endeavor, and many SharePoint projects are doomed to failure despite the best intentions of an organization due to a lack of planning. It really will help your project to take some time up front and plan out your environment. I recently wrote about how to incorporate some SQL high availability into your SharePoint planning. And while high availability is great, it doesn’t give you a free pass to skip the planning phase for what happens in case of a disaster.

This article will help you to understand some of your options to consider for disaster recovery for your SQL servers. Understanding these methods will give you some of the information you’ll need to be prepared in case you need to recover your data.

Disaster Recovery and Storage Redundancy: Too Much of a Good Thing?

Storage redundancy is the first and best defense against data failure. Good storage redundancy can keep you running by adding some resiliency and high availability to your infrastructure. It can also really help in disaster recovery situations.

You’ll definitely want to take opportunities to improve your availability by using database mirroring when you can, but you should also be looking at options like RAID 10 on your Storage Area Network. You’ll also need to keep a backup of your databases. Of course, you’ll want to balance the amount of database backup copies that you keep. Tailor your approach to suit the needs of your data and your business.

The storage redundancy you use, both in terms of keeping the actual data on multiple disks and the number of backups you keep, will increase your ability to get your data back and reduce both the time it takes to get back up and running and the amount of data that is lost in the process. But all that storage redundancy has costs associated with it.

Beyond storage redundancy, you should understand the process of shipping transaction logs between SQL servers as a way of adding additional SQL backups to your environment.

Use Backup and Copy Jobs to Ship Transaction Logs

Different than the high availability options such as SQL database mirroring and Always-On, keeping a backup server that is able to take over in case of a critical failure can be of huge benefit if and when it’s needed.

One common way to keep that backup SQL instance ready to take over is to use transaction log shipping to copy your data from your primary SQL instance to your backup SQL server. In some cases you can also use it to keep multiple backup servers online.

All of this is possible because of the transaction logs. The data stored in your SQL databases was created in steps. Adding, updating and deleting records all took place, usually a few updates at a time, to get that database to its current state. The transaction logs are the record of all of those steps that made changes to the database. Using the transaction logs, you can completely reconstruct the database to its current state, basically by retracing the steps.

This process involves using both a backup job for creating the transaction log, and then a copy job for moving the transaction log onto one or more additional servers. Once the transaction logs are moved onto those additional servers, the transaction logs can be used to restore the database on the backup instances.

Transaction Log Shipping SQL troubleshooting

This won’t give you a “hot standby” like you get with SQL database mirroring and a witness or an Always-On scenario. But if you’re in a position where you need to rebuild, you’ll be glad that you’ve been shipping your transaction logs to a backup server or file share ahead of the catastrophe.

Migrate SQL Logins to the Backup Servers

Keeping your transaction logs available on a shared storage location or keeping multiple copies of your database backups is obviously a huge part of the picture, but don’t overlook another important piece of the puzzle. For somebody to access those databases once they’ve been restored, a SQL login will have to exist on that server so that they can login and have the correct permissions.

You’ll want to ensure that your SQL logins are kept in sync between any of your backup SQL Servers that may somebody be required to take over for your primary server. One option to synchronize your logins between SQL instances would be a full SQL instance backup and restore, but this is not something that is sustainable. The time and I/O that it would take makes it a bad method to do with any regularity.

The better option is to use a PowerShell script to automate the processing of SQL logins across the SQL instances. Setup your PowerShell script or workflow to run on a nightly basis and you’ll be able to keep your logins in sync. If they’re needed, they’ll be ready and waiting for you on the backup servers.

More Resources

It would be impossible to give you a full understanding of all of your options in just one article, but you should now have a much better grasp of the difference between disaster recovery and high availability, and know what some of your options and choices are.

When designing your environment, you’ll need to consider how to keep the environment running in case of a SQL server or disk failure. But more than that, you need to know what you’re going to do in case the high availability options do not work and multiple disk failures in the same RAID take out the shared storage of your Always-On cluster.

Here are some additional resources for planning and implementing SQL disaster recovery into your overall SharePoint plan.