jimmylew52

Benchmark problem on Dell R710

I have a Dell R710 with dual quad-core processors and 32 GB of RAM, on which I am trying to run MS SQL Server under Windows Server 2003 R2.

When I run a read benchmark test, it runs at almost 2,000 MB/s for about 10 seconds and then drops off to under 100 MB/s for all but about 15 seconds of the test, then it returns to over 2,000 MB/s.
I have spent several days with Dell tech support but have not been able to find the answer. They say it works and none of the diagnostics show a problem, so there is nothing they can do. The server is useless to us as it is.

Anyone have any ideas what the problem might be?
Aaron Tomosky

What's the configuration of the drives? Any antivirus or something running?

ASKER

Three 300 GB drives in a RAID 5 configuration. No antivirus running; nothing is running but MS SQL.
What sort of transfer rates are you expecting?  100MB/s doesn't seem too far out of line once you are reading from the physical disks.  The 2,000MB/sec is probably when you are reading from the disk cache.
If I run the same benchmark test on two other identical servers, I get a reasonably steady 230 MB/s throughout the test. Other servers I have give similarly steady results, varying with disk speed: 15,000 rpm disks give a higher value than 10,000 rpm disks.

None of them start very high, drop off low, and then spike again at the end.
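One way to put a number on "reasonably steady" versus "starts high, drops, then spikes" is to compare the slowest and fastest per-second samples from each benchmark run. The sketch below is only an illustration: the function names, the 0.5 threshold, and the sample values are all made up for the example and do not come from the actual benchmark tool or servers.

```python
def steadiness(samples_mb_s):
    """Return min/max of the throughput samples.

    Values near 1.0 mean a steady run; values near 0 mean large swings.
    """
    if not samples_mb_s:
        raise ValueError("need at least one sample")
    peak = max(samples_mb_s)
    if peak == 0:
        return 0.0
    return min(samples_mb_s) / peak


def is_steady(samples_mb_s, threshold=0.5):
    """Treat a run as steady if its slowest sample is at least
    `threshold` times its fastest sample (threshold is an assumption)."""
    return steadiness(samples_mb_s) >= threshold


# Hypothetical per-second readings in MB/s, invented for illustration.
healthy = [228, 231, 230, 229, 232]
suspect = [2000, 1980, 95, 90, 98, 2010]

print(is_steady(healthy))  # True
print(is_steady(suspect))  # False
```

Running the same check against logs from a known-good server and the problem server gives a simple pass/fail comparison to repeat after each hardware swap.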
That's what I'm thinking. RAID 5 has terrible IOPS for a database, usually the same as a single drive, so 100 MB/s sounds right. Can you do RAID 10?
No, RAID 10 would not leave enough space on the server to hold the databases.

I have never heard of RAID 5 being bad for running a database. I have always heard that you should run multiple small drives to increase the number of spindles being accessed by the database.
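The space trade-off between the two RAID levels can be worked out with simple arithmetic. The sketch below uses the standard capacity rules (RAID 5 loses one drive's worth of space to parity; RAID 10 mirrors everything, so half the drives are usable, and it needs an even number of at least four drives). The function name is hypothetical, and equal-sized drives are assumed.

```python
def usable_capacity_gb(level, drives, drive_gb):
    """Usable capacity in GB for an array of equal-sized drives."""
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb
    if level == "raid10":
        if drives < 4 or drives % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return (drives // 2) * drive_gb
    raise ValueError("unsupported RAID level: " + level)


print(usable_capacity_gb("raid5", 3, 300))   # 600
print(usable_capacity_gb("raid10", 4, 300))  # 600
```

With the three 300 GB drives described here, RAID 5 yields 600 GB, while RAID 10 is not possible at all without at least one more drive.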
Number one issue with SQL and Server 2003 is disk alignment.

See MS for best practices:

http://technet.microsoft.com/en-us/library/dd758814%28v=sql.100%29.aspx
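For anyone checking alignment by hand, the test is just whether the partition's starting offset is an exact multiple of the RAID stripe unit size. The sketch below shows the arithmetic. The 64 KB stripe unit is an assumption for illustration (check the actual value on the controller), and the function name is made up. The 63-sector starting offset is the usual default for partitions created on Windows Server 2003, as far as I know.

```python
def is_aligned(start_offset_bytes, stripe_unit_bytes):
    """True if the partition start falls exactly on a stripe unit boundary."""
    return start_offset_bytes % stripe_unit_bytes == 0


# Typical default on Windows Server 2003: 63 sectors of 512 bytes.
legacy_offset = 63 * 512          # 32,256 bytes (31.5 KB)
stripe_unit = 64 * 1024           # assumed 64 KB stripe unit

print(is_aligned(legacy_offset, stripe_unit))   # False
print(is_aligned(1024 * 1024, stripe_unit))     # True (1 MB offset)
```

The real starting offset has to be read from the system itself, for example from the partition details in the OS disk tools, and then plugged into the check.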
Interesting link. I will check the drives and see what I find.

I still wonder why only one of three identical machines has this problem even after multiple drive wipes and installs of the OS and sql.
Have you tried swapping the RAID drive sets between the machines and repeating the benchmark?  This would pin down whether the difference is in the drives/installation or in the rest of the system.  I'd be most suspicious of the RAID settings, especially anything having to do with caching.
I have tried swapping the drives between the systems. The problem does not follow the drives. All three machines were set up using the Dell Server Assist CD.
ASKER CERTIFIED SOLUTION
Aaron Tomosky
This solution is only available to members.
I'll look at that. Any idea what I am looking for?
At a high level, look for anything that differs between a "good" server and the slow server.

It could be a read/write cache setting.
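If the controller settings from each server are written down as key/value pairs, a small script can list only the ones that differ. This is a rough sketch: the setting names and values below are hypothetical placeholders, not actual output from any Dell or PERC tool, and would have to be filled in by hand from whatever each server reports.

```python
def diff_settings(good, bad):
    """Return {setting: (good_value, bad_value)} for every setting that differs.

    A setting missing on one side shows up with None for that side.
    """
    keys = sorted(set(good) | set(bad))
    return {
        key: (good.get(key), bad.get(key))
        for key in keys
        if good.get(key) != bad.get(key)
    }


# Hypothetical settings, invented for illustration only.
good_server = {"write_policy": "write back", "read_policy": "read ahead", "stripe_kb": 64}
slow_server = {"write_policy": "write back", "read_policy": "no read ahead", "stripe_kb": 64}

for name, (good_value, bad_value) in diff_settings(good_server, slow_server).items():
    print(name, "good:", good_value, "slow:", bad_value)
```

An empty result means the recorded settings match, which points away from configuration and toward the hardware itself.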
What's the RAID controller?
Perc H700 is the raid controller.
Matching the RAID controller settings of the problem server to those of one that is not having the problem has not had any positive effect. They were almost identical anyway.

The setting on all three is write back.

I'll see if I can get a spare controller to test with. Two of the three are in production.
Thought: If the slow one is set to write back but the battery is low, does it revert to write through automatically? I'm not familiar with this specific RAID card, but I've seen things work this way in other situations.
If the slow one is set to write back but the battery is low does that revert to write through automatically?

Not sure, but the battery is fine. I have never seen this before either. Neither has anyone else I have talked to.
If the battery is dead, it usually turns the cache off.
A new controller has been ordered and is expected to be delivered today. I should have testing completed in a couple of days.
The new controller solved the problem!!!!!!!!

Had to rebuild the RAID and reinstall with the new controller but it is fixed and usable again.

The diagnostics did not show a problem but the raid controller WAS the problem.
I appreciate all the time you spent trying to help me troubleshoot the problem.
It's always nice to have extra gear around to swap. Sometimes it's the only way. Good job figuring it out.