How to Design a Backup Strategy in SQL Server
Designing a backup strategy is fairly straightforward, but it can get complicated when you’ve got a lot of moving pieces in your scenario. Here we’ll discuss some of the factors you’ll want to take into consideration when designing your strategy as well as some good practices. Notice I stayed away from the phrase ‘best practices’. This is because best practices only give you a point from which you should start the discussion. Too often best practices are blindly followed, and that’s very dangerous.
For starters, best practices are never published with an explanation of perspective. There’s rarely anything that holds true 100% of the time. For example, if “best practices” say to always do full backups once a day followed by log backups every hour, that’s pretty specific. But what “best” are we talking about? Best for backups, for recovery, for server resources? And so on. So here we’ll stay away from ‘best practice’ recommendations like that and just give you a couple high-level discussion points so you’ll be better prepared to make your own decisions.
Let’s first talk about backing up user and system databases. It’s always a good idea to back up system databases separately from user databases. The reason is, should any of the user database backups fail, the system database backups may also fail. Your entire installation relies on your system databases, so you don’t want them to fail. Set up a job to backup the system databases, and another job (or set of jobs) to backup the user databases.
Another reason to have separate backup jobs for system and user databases is that you may want to back them up at different times, or you may want to backup the system databases more often. In most shops I’ve been in, I’ve backed up the system databases once a day (it really doesn’t matter when), but there have been times I’ve felt more comfortable doing it twice a day if I had an environment that changed things often.
Backup User Databases
Say Goodbye to Traditional PC Lifecycle Management
Traditional IT tools, including Microsoft SCCM, Ghost Solution Suite, and KACE, often require considerable custom configurations by T3 technicians (an expensive and often elusive IT resource) to enable management of a hybrid onsite + remote workforce. In many cases, even with the best resources, organizations are finding that these on-premise tools simply cannot support remote endpoints consistently and reliably due to infrastructure limitations.
Backing up user databases is fairly straightforward. Your biggest decision is going to be how much data you can afford to lose. Some databases have hardly any updates and the data is easy to re-enter, so they might only be backed up once a week. And there are others that must be backed up every 5 minutes.
So let’s say for the sake of argument that you’ve decided that you can afford to lose at most an hour of work. Now you’ve got to decide how you’re going to run your hourly backups. Doing a full backup every hour usually isn’t an option because databases are usually too big for you to store that much backup data on disk. So you’re left with 2 choices: log backups and differential backups.
Log backups will be your most common choice, because they’re easy and they’re typically pretty small. Log backups will usually be your choice if your database is more transaction-based. That is to say, if you are doing straightforward data entry, and not loading data into the database in bulk – from text files or the like. The reason is that bulk loads aren’t fully logged; they’re what we call minimally logged.
Minimally logged means that the details about each transaction aren’t recorded in the log, only the page allocations are logged. This is a performance enhancement for bulk loads because when you’re inputting millions of rows you don’t want to log each one individually. Logging each row in a bulk load serves no purpose but to fill up the log and slow down the load. So if you don’t have any bulk loads then log backups are a good choice for you.
Differential backups – or “diffs” – are what you’ll use if you have bulk load operations. Unlike log backups, diffs don’t work off of the log. Diffs work by reading a special page in the database that tracks extent (a group of 8 pages) changes. Whenever you load data in bulk it marks down in this special page (called the DCM – Differential Change Map) which extents have had data changes to them. Then the diff backup process just looks at that page for which extents need to be backed up.
That’s straightforward enough, but there’s one more situation where you’ll want to consider taking diff backups. If you’ve got a really busy system, and your requirements are such that you can’t lose any more than 10 minutes of data, then you may consider using diffs. Now I know what you’re thinking… I just said you would use logs for that. And that’s true, you certainly can and you’ll do just fine. But now we’re coming at it from a recovery perspective (see… a different “best practice” already!)
When you restore a database and then recover the log backups, you have to roll through each log backup individually. You can still code it, but they all have to be restored in chronological order, and that can take quite a bit of time. So in our current scenario (only losing 10 minutes of work), that would put you taking a full backup once a day and taking a log backup every 10 minutes. That’s in the neighborhood of 140 log files you’ll have to roll through during a restore. However, differential backups are cumulative. If you take a full backup every day, and then take a diff backup every hour, you only have to restore the full backup and the LAST diff backup.
Like I said, they’re cumulative: each one contains all of the data that the ones previously did. This is because that special page I talked about above (the DCM) is only reset by the full backup process. To reset it and make your diff files smaller, then take a full backup first.
A complete scenario
You may have noticed above that there’s a big problem. We’ve specified that we can only lose 10 minutes of work, but we’re only taking diff backups every hour. Well, the good news is that differential and log backups work very well together. Let’s build a more complete strategy now: You can take a full backup every day, then diffs every hour, and then log backups every 10 minutes. This way if your database fails at 1:20, then you’ve only got to restore a full backup, the last differential backup and 2 log backups – instead of the dozens of log backups you’d have to restore without the diffs.
Here’s a summary of recommendations:
- Backup system databases and user databases separately.
- Take full backups at times when system activity is at its lowest.
- Daily full backups are pretty standard and a good place to start.
- Use log backups if you don’t have any bulk loads.
- Use differential backups if you have minimally-logged bulk load operations like BCP or select into.
- Use a combination of diffs and log backups if you need to speed recovery time.
Ok, this was a high-level discussion the considerations you need to set up your own backup strategy. I didn’t cover physical implementation, but I do cover basic backups in this piece here.
And remember, ALWAYS question best practices. Don’t ever implement one until you understand whether it’s coming from a performance perspective, or recoverability, ease of management, ease of coding, etc. Because best practices change with the scenario, so what could easily be deemed a best practice in one scenario could actually be a worst practice in another.