One of the most important tasks you can do in SQL Server is to setup your data and log files. Not getting these files setup correctly can be one of the biggest causes for production problems whether it’s disk contention, space usage, or something else. And honestly this level of planning is often overlooked and by the time the problem is discovered the application and its users have already suffered. So let’s get into some good discussion about how to setup your database files.
We’re going to start with logs first because they’re the easiest and probably the ones you’ll touch most frequently. And we need to talk about placing your log files on disk first. In general, you’ll want to place your log files on a different physical partition than your data files. This is for 2 reasons. The first is disk contention. Every transaction has to write something to the log file before it can write it to the data file and if both files are on the same disk, then the disk arm has to work twice has hard because it has to jump over here to write the log and then jump over there to write to the table. Putting a log file on its own disk is also better for the performance of the log because logs write sequentially so if the log file is on its own disk the disk arm has very little moving to do to get to the next place it needs to write so it’s much faster. The second reason you want data and log files on separate disks is for recovery. If the data partition fails you’ll want the log on a separate set of disks so you can still recover the logs and roll the last transactions forward so you don’t lose any data.
It’s a common misconception that you will get a performance gain by using several log files. This is something we see quite often and it simply isn’t true. Log files are written sequentially which means that each log file is filled up before the next one is written to. So if you have 4 log files (Log1, Log2, Log3, Log4), SQL Server 2008 will fill up Log1, then fill up Log2, etc. This is different from the way data files behave and we’ll discuss that in a minute. The only reason to have multiple log files on multiple partitions is for space. You may need more disk than a single partition can provide. Oh, and putting multiple log files on a single partition is just dumb. It gains you nothing.
Now let’s talk about where you’ll place your data files. As we’ve already said it’s a good idea to place them on a different physical partition from your log files. And we know we already said that, but experience has shown us that we could fill an entire page with that exact advice and a good portion of you still won’t grasp the importance of it. So we’re really going to hammer that point home in this article.
So while your data files are separated from your log files you may feel free to have multiple data files on multiple physical partitions to take advantage of performance gains. Now, we have to say again that these have to be physical partitions because logical partitions are still on the same physical disk and it’s the disk arm contention you’re trying to avoid. Data files, unlike log files, use what’s known as an equal fill algorithm. This means that all of the files are filled equally as much as possible. So let’s say you have 4 data files (Data1, Data2, Data3, Data4). When you write to the database, SQL Server 2008 will write to these files in a round robin fashion and they should grow at more or less the same rate. You can also place tables inside specific files so you have a good level of control over how you split up the I/O workload in your database. And while the number of files you need to optimize your workload is a subject that’s up for debate in the community, for most systems it won’t make that much difference. However, it is a good idea that no matter how many data files you decide to have, to make them all the same size. This goes back to that equal fill algorithm we were talking about before.
Another good topic is file growth. This is a mistake that many beginners make. They accept the defaults for file growth and that’s just asking for trouble. We’re going to give you some advice on how to set your file growth and you can adjust it to suit your needs, but at least you’ll know the arguments.
Here we talked about separating your data and log files onto separate partitions, as well as some of the theories on why we recommend the things we do. In the next article, How to manage SQL Server database files, we’ll talk about how to physically change your file properties to accomplish these goals. And don’t forget to separate your data and logs onto separate physical partitions.