We help IT Professionals succeed at work.

Microsoft SQL Server 2008 R2 Sizing - Not sure which is more importaint...

We are looking to build a new SQL 2008 R2 server for our office. Currently we are running Windows 2003 with a SQL 2000 Server.

Currently the system is a HP DL380 G5
Dual Xeon 2.8Ghz
4GB RAM
2x 36GB SAS drives

We are planning to upgrade the RAM with another 16GB, leaving the CPU's in tact.

The database is primarily a read database- reports pulled from the data. Would expect approx 20-30 people on it at any given time- this is an estimate.

The System has 8 total hard drive slots- 2 being taken by the 36GB drives
Upon looking at the quick spec's from HP we can go with 15K RPM 147GB SAS drives (largest possible in 15K) or 10K 300GB SAS drives (largest possible in 10K).

Total drive on the SQL 2000 server is 410GB, with 220GB free. All logs, DB, etc sit on the D:\ drive. The C:\ drive only holds the OS and Application, and is using around 15GB total.

My question is this...
Which storage config would be more recommended:
6x 147GB 15K RPM in RAID 5
6x 147GB 15K RPM in RAID 10
6x 300GB 10K RPM in RAID 5
6x 300GB 10K RPM in RAID 10

On those storage configs can I partition out separate drives for Logs, DB, etc, even if they are all on the same RAID LUN, does that even matter? I am sure the database will only grow in physical size, although I would assume it will always be a more "Read-centric" database.

Any help would be appreciated!!
Thanks
Comment
Watch Question

Senior DBA
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
Do you have any disk-related performance issues now?

If not, you can try a mixed approach.

2x in RAID 1
4x in RAID 5

The logs (and possibly tempdb) can go on the RAID 1.

The data files go on RAID 5.

If at all possible, you don't want the data and log from the same db on the same drive.  Because, if the drive fails, the db is totally, and you must go to a backup.  [Although with read-only dbs this is much less of an issue than it is normally.]

This gives you the most space for data while improving reliablity and performance for the logs.
use the 15k drives unless database size is going to overrun your available space.  Plan for growth.
If this is read-only or a great deal of reads and only a few writes, you won't need much for logs, as Scott said, RAID1 for logs and just leave the database in simple recovery mode so the log files stay small.  Master and TempDB need to be on the RAID5 so that logins, index sorts and queries that do a lot of sorting don't take a performance hit.  Model and MSDB can most likely sit on the RAID1 as there won't be much activity UNLESS you are replicating.

Personally, I would consider doing two separate 3x147 GB RAID5's (unless space is an issue) and placing the logs and indexes on one and data on the other.  This will spread your load nicely and provide fast read times.  With more disks I would go RAID10 (0+1) for the logs, but given your configuration, RAID5's will work fine and provide acceptable reliability.

one note for you, CPU parallelism will not be maximized in your configuration.  You'll find that 2 or 3 cores will handle all of the disk I/O and the rest will sit mostly quiescent.  To solve this, you need more RAID volumes and to create additional data files on those volumes to get to the optimal 1 file per filegroup per core ratio.

Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
CORRECTION -- missing word:
Because, if the drive fails, the db is totally GONE/DESTROYED, and you must go to a backup.
Scott,

I agree with the concept of keeping logs and data on separate drives, but for a different reason.  You are pretty much screwed if your data drive fails, regardless of whether the logs were on that drive or not.  

Assuming you are using RAID, in a perfect world, the reason to separate the two is that the RAID levels should be different because of the characteristics of the files.  Logs should be on RAID 10 for speed and redundancy and the data/index files should be on RAID 5 for speed and reliability at the best cost/value ratio.  Yes, RAID 10 is as fast as RAID 5 with better reliability characteristics, but it comes at a significantly higher cost.  For most businesses, RAID 5 is a more than acceptable risk for the cost, especially when you start carving up a separate RAID for each physical file to get better I/O throughput.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
>> You are pretty much screwed if your data drive fails, regardless of whether the logs were on that drive or not. <<

That's not true, assuming you have proper log backups.  You can restore the db backup and re-apply the log files.  Usually you can even do a "tail" backup on the final log and apply that as well.

RAID 1/10 is better for logs because they are almost all writes.  Sure RAID 10 is preferred, but RAID 1 is much less expensive and can be very useful for smaller set ups.

Reads typically perform better on RAID 5, often better even than RAID 10.  And RAID 5 is of course vastly cheaper than RAID 10.

I think we both agree that with the given # of drives here, RAID 10 is just not realistic for the data files.

I still think 2 3x is not optimum for this situation.  You are then forced to use RAID 5 for logs, which is not good.  If you slow down log writes in SQL, you *guarantee* slowing down the whole instance, since SQL always waits for log writes to complete.  Yes, disk caching and write-ahead helps with that, but the cache is not typically very large on smaller drives, so I think logs on RAID 5 here would very likely end up causing delays in SQL.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Also, since this instance is primarly reads, I feel it is better to give 4 spindles to the data drives rather than 3.  Every spindle helps with speed on RAID 5.
Scott,

We aren't going to agree on this so let's chalk it up to creative differences.  I was giving 6 spindles, not 4 to data.  Note that I was advocating placing the indexes on the volume with the logs.  That allows the fast reads of indexes separate from the process of writing to the data files.

>>If you slow down log writes in SQL, you *guarantee* slowing down the whole instance, since SQL always waits for log writes to complete.

RAID 5 is faster than RAID 1.  Even on older controllers where you could only write to one drive at a time, they were equal in speed.


"Performance
Raid 0 = Fastest, performance increases with drive count
Raid 1 = Possible boost in speed for READ, no gain to possible loss (if noticeable) elsewhere
 Raid 10 = Similar performance to its equivalent drive Raid 0 (ie 3 Drive Raid0=6 drive Raid 10)
 Raid 5 = Between Raid 0 and single drive performance, goes up with drive count. processor intensive
 
With these statements, it seems to me that on a price vs capacity+performance comparison, Raid 5 comes out as the winner, the only fault being the extra processing power required (on software raid at least) for parity information on writes. otherwise, for the price of a single extra drive, one can have a similar level of fault tolerance, 66% of the available capacity (this also increases as drive count goes up, 75% available in a 4 drive array etc.) and descent(sic) read/write performance as compared to a 2 drive Raid 1 array with only 50% available capacity. "
 
http://www.tomshardware.com/forum/244377-32-raid-raid-raid

Given that the processor is overpowered for the application (note my statement above about only seeing usage on 2-3 cores), RAID 5 has all of the advantages and none of the disadvantages.
>>Because, if the drive fails, the db is totally GONE/DESTROYED, and you must go to a backup.
Whether the logs are on a separate disk or not, you are still going to backup if you've lost the drive holding the data files.  That was what my statement was in reference to.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
>> RAID 5 is faster than RAID 1. <<

NOT for writes.  RAID 5 is much slower for writes.

>> I was giving 6 spindles, not 4 to data. <<
No, the logs used 3 spindles, and the data 3.


>>  Note that I was advocating placing the indexes on the volume with the logs. <<
That may gain you some performance, but it definitely kills the recoverability.



>> Whether the logs are on a separate disk or not, you are still going to backup if you've lost the drive holding the data files.  That was what my statement was in reference to. <<

That's the point of log backups, so you can recover past the time of the last full db backup.

If you want the max possible recovery in the event of a data drive faiulre, you simply must put the log files on a separate drive.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Your link is one forum post by one person.  It's hardly authoritative.

I think any reasonable search for *expert* comments will confirm that RAID 1 is much faster at writes than RAID 5, and common sense tells anyone that logs are extremely heavily write-oriented.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Back to original q,

forgot one thing that may apply, depending on the specific drives you get.

If you can specify specific % of cache for read and write, naturally the data drives should have a higher read% of cache, and the log drives should a very high % of cache dedicated to writes.


Also, put the db backups on the log drive.  Those drives write faster AND it provides better recoverability.  If the data files are lost, you must have the db backup to recover.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Here's one link of many:

http://www.sql-server-performance.com/2006/raid-1-raid-5/

Of course if many more spindles are added to RAID5, you might have to go to RAID10 to maintain better write performance, but with a lot of spindles that would be a viable option.
Logs are sequential writes, not random writes.  Even those sources which say RAID 1 is faster, such as http://www.pcguide.com/ref/hdd/perf/raid/levels/comp-c.html only give a slight edge to raid 1 over RAID 5 in sequential writes.  Newer controllers change the metrics considerably with their ability to multiplex in a single controller.   IBM recommends 4+1 RAID 5 for DB2 files for this exact reason, their controllers multiplex to 4 drives simultaneously.

As I said, we aren't going to agree, chalk it up to creative difference.

Scott PletcherSenior DBA
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
The writes are really only effectively sequential on a *dedicated* drive for each db.  Multiple "sequential" writes coming in simultaneously to the different files -- as you have in SQL Server -- and the writes become de facto random.  And if you could dedicate drives to each log, I'd still use RAID1 -- you don't need 4 drives for log drives for one db.

In the situation we are discussing, the ONLY way to have two RAID 5 sets is for each to be *three* drives, as you indeed recommended.  That provides none of the real benefits of RAID 5 and loses the potential benefits of RAID 1.