Tore Jacobsen

Slow server after RAID crash fix

Hi.
I have an HP ML350 G6 server with SBS 2011 installed.
The server has a RAID 5 array with 6 drives.
I found one disk failed and 2 more in predictive failure (!)
I hot-swapped one disk at a time, starting with the failed one, and let the array rebuild overnight before swapping the next.
After that it booted fine and worked well for a day.
After a day it became very slow, and it would not boot until we changed the cache battery.
Now the server is up but is very slow.
The application log shows event ID 823 (it wants me to run DBCC CHECKDB on sbsmonitoring.mdf).

What is the best strategy to find out what's wrong and to fix it?
Backups have been running, but I guess something is wrong/files were corrupted during the unstable disk situation, and that will be on the backups as well.

SOLUTION
warddhooghe
SOLUTION
David

ASKER

Hi.
I guess I was not clear regarding the drives. All 3 have been replaced: first the failed one and then the predictive ones, one day apart so the RAID could rebuild in between.
I have checked that the RAID is fully rebuilt (HP RAID config utility).
As long as the rebuild is at 100%, I very much doubt the slowdown is caused by these failures.
You might want to run some diagnostics: check disk queue length, fragmentation, etc.
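
For the disk queue length check, one option is to log the counter with Windows' built-in `typeperf` (for example `typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 1 -sc 60 -f CSV -o queue.csv`) and average the samples afterwards. The Python sketch below assumes the usual typeperf CSV layout of a header row followed by timestamp/value rows. The function name, the server name `SERVER`, and the sample values are made up for illustration and are not taken from the server in this thread.

```python
import csv
import io


def average_queue_length(typeperf_csv: str) -> float:
    """Average the samples of a single-counter typeperf CSV log."""
    rows = list(csv.reader(io.StringIO(typeperf_csv)))
    samples = []
    # Row 0 is the header; each later row is (timestamp, value).
    for row in rows[1:]:
        if len(row) < 2:
            continue  # skip blank or truncated lines
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # skip samples that are not numeric
    if not samples:
        raise ValueError("no numeric samples found")
    return sum(samples) / len(samples)


# Hypothetical sample data, for illustration only.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SERVER\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"11/20/2013 10:00:01.000","4.0"
"11/20/2013 10:00:02.000","6.0"
'''

if __name__ == "__main__":
    print(average_queue_length(SAMPLE))
```

A queue length that stays high while throughput is low would point at the disks rather than at CPU or memory, but what counts as "high" depends on the array and the workload.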

I also recommend upgrading all drivers and firmware. Since you have HP, this is easy: just download and install the latest HP SPP. Of course, this should not be done during office hours. It will need a reboot and is best run from the iLO or the server console.
The slowdown is made worse by the near certainty that you have a large number of recoverable read errors. Depending on the make/model of drive, it can take up to 10 seconds to get just one unreadable block. The SMART errors on several drives make it statistically near-certain that the surviving disks have these errors too.

By any chance, are these consumer disks, or disks without the HP firmware? If so, you are destined to have more of the same problems.
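
To put the "up to 10 seconds per block" figure in perspective, here is a back-of-the-envelope sketch in Python. It only multiplies a block count by the worst-case retry time. The block counts are hypothetical, since nothing in this thread says how many blocks were affected, and the function name is invented for this example.

```python
def worst_case_stall_seconds(slow_blocks: int, seconds_per_block: float = 10.0) -> float:
    """Upper bound on time spent waiting for drive-level read retries."""
    if slow_blocks < 0:
        raise ValueError("slow_blocks must not be negative")
    return slow_blocks * seconds_per_block


if __name__ == "__main__":
    # Hypothetical block counts, for illustration only.
    for n in (10, 100, 1000):
        minutes = worst_case_stall_seconds(n) / 60
        print(f"{n} slow blocks: up to {minutes:.1f} minutes of stalls")
```

Even a few hundred such blocks could add up to many minutes of stalled I/O, which users would experience as a very slow server.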
I am installing the Service Pack for ProLiant now. It takes hours (sloooow server).
The drives are HP.
After the SPP is complete I plan to restart, but I assume it will be just as slow.
Any other suggestions?
Go to the support.hp.com site, download HP's ACU program, and look at the event log to see what is happening.

There is no reason to have to guess when HP spends millions developing a program that tells you exactly what is going on.
I needed to run chkdsk -r.
It was working fine after that.
I've requested that this question be closed as follows:

Accepted answer: 0 points for TelehusetMoss's comment #a39695601

for the following reason:

found solution
So what was the solution? As I pointed out, the array was rebuilding at the time, so of course it will be slow until that completes.
Some credit in directing you to the solution?
Agreed, points should be awarded equally, unless the problem is still there and the array is no longer rebuilding.
3) http:#39644852 and http:#39644930
Both comments advised running various diagnostics, which in the end fixed the issue as the owner claimed "chkdsk -r" was the solution.
The chkdsk would have made no difference. The RAID was already doing a full media repair. Time to complete the repair was the only thing necessary. Had chkdsk not been run, it would have completed sooner.
ASKER CERTIFIED SOLUTION
Raid was fully rebuilt.
Only after chkdsk /r was it usable.