treminga asked on
Microsoft SQL Server 2008 R2 Sizing - Not sure which is more important...

We are looking to build a new SQL 2008 R2 server for our office. Currently we are running Windows 2003 with a SQL 2000 Server.

Currently the system is an HP DL380 G5:
Dual Xeon 2.8 GHz
4GB RAM
2x 36GB SAS drives

We are planning to upgrade the RAM with another 16GB, leaving the CPUs intact.

The database is primarily a read database: reports are pulled from the data. We would expect approximately 20-30 people on it at any given time (this is an estimate).

The system has 8 hard drive slots in total, 2 of which are taken by the 36GB drives.
Looking at the QuickSpecs from HP, we can go with 147GB 15K RPM SAS drives (the largest available in 15K) or 300GB 10K RPM SAS drives (the largest available in 10K).

Total drive space on the SQL 2000 server is 410GB, with 220GB free. All logs, databases, etc. sit on the D:\ drive. The C:\ drive holds only the OS and applications, using around 15GB in total.

My question is this...
Which storage config would be more recommended:
6x 147GB 15K RPM in RAID 5
6x 147GB 15K RPM in RAID 10
6x 300GB 10K RPM in RAID 5
6x 300GB 10K RPM in RAID 10
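(For comparison, the usable space each option yields can be sketched quickly. Drive counts and sizes are the nominal ones listed above; real formatted capacity will be somewhat lower.)

```python
# Rough usable-capacity comparison of the four proposed 6-drive configurations.
# RAID 5 loses one drive's worth of space to parity; RAID 10 mirrors every drive.

def usable_gb(drives, size_gb, level):
    """Usable capacity in GB for a RAID 5 or RAID 10 set of identical drives."""
    if level == "RAID 5":
        return (drives - 1) * size_gb   # one drive's worth goes to parity
    if level == "RAID 10":
        return (drives // 2) * size_gb  # half the drives are mirrors
    raise ValueError(level)

for size in (147, 300):
    for level in ("RAID 5", "RAID 10"):
        print(f"6x {size}GB {level}: {usable_gb(6, size, level)} GB usable")
```

All four options comfortably exceed the roughly 190GB currently in use (410GB total minus 220GB free); the tightest is 6x 147GB in RAID 10.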

On those storage configs, can I partition out separate drives for logs, databases, etc., and does that even matter if they are all on the same RAID LUN? I am sure the database will only grow in physical size, although I would assume it will always remain a more "read-centric" database.

Any help would be appreciated!!
Thanks
ASKER CERTIFIED SOLUTION
Scott Pletcher
SOLUTION
CORRECTION -- missing word:
Because, if the drive fails, the db is totally GONE/DESTROYED, and you must go to a backup.
Scott,

I agree with the concept of keeping logs and data on separate drives, but for a different reason.  You are pretty much screwed if your data drive fails, regardless of whether the logs were on that drive or not.  

Assuming you are using RAID, in a perfect world, the reason to separate the two is that the RAID levels should be different because of the characteristics of the files.  Logs should be on RAID 10 for speed and redundancy and the data/index files should be on RAID 5 for speed and reliability at the best cost/value ratio.  Yes, RAID 10 is as fast as RAID 5 with better reliability characteristics, but it comes at a significantly higher cost.  For most businesses, RAID 5 is a more than acceptable risk for the cost, especially when you start carving up a separate RAID for each physical file to get better I/O throughput.
>> You are pretty much screwed if your data drive fails, regardless of whether the logs were on that drive or not. <<

That's not true, assuming you have proper log backups.  You can restore the db backup and re-apply the log backups.  Usually you can even take a "tail" backup of the final log and apply that as well.
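The recoverability argument can be put in rough numbers. The backup schedule below is a hypothetical example for illustration, not one stated in the thread:

```python
# Worst-case data-loss windows (in minutes) under a hypothetical schedule:
# a nightly full backup plus log backups every 15 minutes.

full_backup_interval_min = 24 * 60
log_backup_interval_min = 15

# Logs destroyed along with the data drive, no log backups to apply:
loss_full_only = full_backup_interval_min       # up to a full day of work

# Log backups survive on another drive, but no tail-log backup possible:
loss_with_log_backups = log_backup_interval_min # up to 15 minutes

# Log files on a separate drive that survived, so a tail-log backup works:
loss_with_tail = 0                              # essentially no committed work lost

print(loss_full_only, loss_with_log_backups, loss_with_tail)
```

This is the gap between "restore last night's full backup" and "recover to the moment of failure" that the log-placement argument is about.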

RAID 1/10 is better for logs because they are almost all writes.  Sure, RAID 10 is preferred, but RAID 1 is much less expensive and can be very useful for smaller setups.

Reads typically perform better on RAID 5, often better even than RAID 10.  And RAID 5 is of course vastly cheaper than RAID 10.

I think we both agree that with the given # of drives here, RAID 10 is just not realistic for the data files.

I still think two 3-drive arrays is not optimal for this situation.  You are then forced to use RAID 5 for logs, which is not good.  If you slow down log writes in SQL, you *guarantee* slowing down the whole instance, since SQL always waits for log writes to complete.  Yes, disk caching and write-ahead logging help with that, but the cache is not typically very large on smaller drives, so I think logs on RAID 5 here would very likely end up causing delays in SQL.
Also, since this instance is primarily reads, I feel it is better to give 4 spindles to the data drives rather than 3.  Every spindle helps with speed on RAID 5.
Scott,

We aren't going to agree on this so let's chalk it up to creative differences.  I was giving 6 spindles, not 4 to data.  Note that I was advocating placing the indexes on the volume with the logs.  That allows the fast reads of indexes separate from the process of writing to the data files.

>>If you slow down log writes in SQL, you *guarantee* slowing down the whole instance, since SQL always waits for log writes to complete.

RAID 5 is faster than RAID 1.  Even on older controllers where you could only write to one drive at a time, they were equal in speed.


"Performance
Raid 0 = Fastest, performance increases with drive count
Raid 1 = Possible boost in speed for READ, no gain to possible loss (if noticeable) elsewhere
Raid 10 = Similar performance to its equivalent drive Raid 0 (ie 3 Drive Raid0=6 drive Raid 10)
Raid 5 = Between Raid 0 and single drive performance, goes up with drive count. processor intensive

With these statements, it seems to me that on a price vs capacity+performance comparison, Raid 5 comes out as the winner, the only fault being the extra processing power required (on software raid at least) for parity information on writes. otherwise, for the price of a single extra drive, one can have a similar level of fault tolerance, 66% of the available capacity (this also increases as drive count goes up, 75% available in a 4 drive array etc.) and descent(sic) read/write performance as compared to a 2 drive Raid 1 array with only 50% available capacity."
 
http://www.tomshardware.com/forum/244377-32-raid-raid-raid

Given that the processor is overpowered for the application (note my statement above about only seeing usage on 2-3 cores), RAID 5 has all of the advantages and none of the disadvantages.
>>Because, if the drive fails, the db is totally GONE/DESTROYED, and you must go to a backup.
Whether the logs are on a separate disk or not, you are still going to a backup if you've lost the drive holding the data files.  That was what my statement was in reference to.
>> RAID 5 is faster than RAID 1. <<

NOT for writes.  RAID 5 is much slower for writes.
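The usual write-penalty rule of thumb behind this claim can be sketched as follows. The per-spindle IOPS figure is an assumed round number for illustration, not one from the thread:

```python
# Back-of-the-envelope effective IOPS using the common RAID write-penalty
# rule of thumb: RAID 1/10 costs 2 physical I/Os per logical write (both
# mirrors), RAID 5 costs 4 (read data, read parity, write data, write parity).
# 175 IOPS per 15K spindle is an assumption for illustration.

SPINDLE_IOPS = 175
PENALTY = {"RAID 1/10": 2, "RAID 5": 4}

def effective_iops(spindles, level, read_pct):
    """Effective host IOPS for a given read fraction (0.0 to 1.0)."""
    raw = spindles * SPINDLE_IOPS
    write_pct = 1 - read_pct
    return raw / (read_pct + write_pct * PENALTY[level])

# A log volume is nearly all writes, where the penalty gap is widest:
for level in PENALTY:
    print(f"3-drive {level}, all writes: {effective_iops(3, level, 0.0):.0f} effective IOPS")
```

By this model a mostly-write log volume gets roughly twice the throughput on RAID 1/10 as on RAID 5 with the same spindles, while a mostly-read data volume sees little penalty on either.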

>> I was giving 6 spindles, not 4 to data. <<
No, the logs used 3 spindles, and the data 3.


>>  Note that I was advocating placing the indexes on the volume with the logs. <<
That may gain you some performance, but it definitely kills the recoverability.



>> Whether the logs are on a separate disk or not, you are still going to backup if you've lost the drive holding the data files.  That was what my statement was in reference to. <<

That's the point of log backups, so you can recover past the time of the last full db backup.

If you want the maximum possible recovery in the event of a data drive failure, you simply must put the log files on a separate drive.
Your link is one forum post by one person.  It's hardly authoritative.

I think any reasonable search for *expert* comments will confirm that RAID 1 is much faster at writes than RAID 5, and common sense tells anyone that logs are extremely heavily write-oriented.
Back to the original question,

I forgot one thing that may apply, depending on the specific drives you get.

If you can specify a specific % of cache for read and write, naturally the data drives should have a higher read % of cache, and the log drives should have a very high % of cache dedicated to writes.


Also, put the db backups on the log drive.  Those drives write faster AND it provides better recoverability.  If the data files are lost, you must have the db backup to recover.
Here's one link of many:

http://www.sql-server-performance.com/2006/raid-1-raid-5/

Of course if many more spindles are added to RAID5, you might have to go to RAID10 to maintain better write performance, but with a lot of spindles that would be a viable option.
Logs are sequential writes, not random writes.  Even those sources that say RAID 1 is faster, such as http://www.pcguide.com/ref/hdd/perf/raid/levels/comp-c.html, only give a slight edge to RAID 1 over RAID 5 in sequential writes.  Newer controllers change the metrics considerably with their ability to multiplex in a single controller.  IBM recommends 4+1 RAID 5 for DB2 files for this exact reason: their controllers multiplex to 4 drives simultaneously.

As I said, we aren't going to agree, chalk it up to creative difference.

The writes are really only effectively sequential on a *dedicated* drive for each db.  Multiple "sequential" writes coming in simultaneously to the different files -- as you have in SQL Server -- and the writes become de facto random.  And if you could dedicate drives to each log, I'd still use RAID1 -- you don't need 4 drives for log drives for one db.

In the situation we are discussing, the ONLY way to have two RAID 5 sets is for each to be *three* drives, as you indeed recommended.  That provides none of the real benefits of RAID 5 and loses the potential benefits of RAID 1.