New HD in a RAID 10 - array rebuild time

Hello,

I have an HP ML150 G6 E5504 HP SAS/SATA server, and a hard drive failed.  After purchasing a new drive and installing it in the server, I keep getting the message:
"1770-Slot 4 Drive Array resuming Automatic Data Recovery process"

I installed this drive on Friday morning (yesterday) and am still seeing the same message whenever I reboot the server.  The array of 4 drives only provides a total of approximately 1 TB of storage.

Is there anything I can do to troubleshoot this?
dan_ch Asked:

bill1965 Commented:
As an estimating "rule of thumb" I typically use a rebuild rate of 10 MBps (megabytes per second).  So if you have a 4-member RAID 10 that yields 1,000 GB of storage capacity, it would have 4 x 500 GB drives.  500 GB * 1000 MB-per-GB / 10 MBps / 60 sec-per-min / 60 min-per-hr / 24 hr-per-day gives a result of approx 0.58 days (about 14 hours), or more than half a day.

Granted, this is just a rule of thumb, but I've seen it hold true across many different disk types (parallel SCSI, SAS, SATA, Fibre Channel).
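
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb: the function name `rebuild_days` is made up for this example, and the 10 MBps default is the estimating figure quoted above, not a measured value for any particular controller.

```python
def rebuild_days(drive_gb: float, rate_mb_per_s: float = 10.0) -> float:
    """Estimate the rebuild time, in days, for one replaced drive.

    Assumes the whole drive is rewritten at a constant rate and uses
    decimal units (1 GB = 1000 MB), as in the estimate above.
    """
    megabytes = drive_gb * 1000
    seconds = megabytes / rate_mb_per_s
    return seconds / 60 / 60 / 24  # seconds -> minutes -> hours -> days


if __name__ == "__main__":
    days = rebuild_days(500)  # one 500 GB member of the RAID 10 set
    print(f"{days:.2f} days ({days * 24:.1f} hours)")
```

Halving the assumed rate doubles the estimate, which is roughly what a server busy with user I/O can look like.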

Good luck in your rebuild.
Gary Case (Retired) Commented:
As noted above, it's reasonable for a rebuild to take 12-24 hours.  If the server is in active use, it can take a good bit longer, because the controller gives priority to servicing network requests rather than the rebuild.  I wouldn't panic until it's been perhaps 3 days.
dan_ch (Author) Commented:
Thanks for the comments.  I'll check again tomorrow and Monday.
Do you agree that if the drive were incompatible, I would see different errors?  At least that is what I found on HP's site.
Gary Case (Retired) Commented:
I would expect a different (and more descriptive) error if the drive wasn't compatible -- but I don't know that for a fact.

Hopefully the next time you look at it, the rebuild will have been completed and all will be well :-)
David (President) Commented:
Recovery efforts that take this long are the result of large numbers of recoverable read errors.  The controller is in a recover-at-all-costs mode, performing a huge number of retries.

If the hardware were healthy, it would move on after 2-3 seconds max and reconstruct the data from the other disks in the set.  It doesn't have that luxury this time.

Let it run.  Also consider that your other disks also have errors, and that not only are you currently at extreme risk of data loss, but if the disks are near or out of warranty, you should replace all of them, because at least one other disk's days are numbered.
andyalder (Saggar maker's bottom knocker) Commented:
Why are you rebooting it? You can check the progress of the rebuild online with the ACU (Array Configuration Utility) and get a report with the ADU (Array Diagnostics Utility), which will confirm whether there are read errors on the remaining disks. Both are available on HP's site for Windows and Linux. Don't paste the ADU report into the body of the thread; post it as an attachment and we can read through it.
dan_ch (Author) Commented:
Thanks for the comment on the ACU and ADU.  I just downloaded the ACU, which I believe includes the diagnostics.  It appears to be sitting at 56% in the rebuild cycle.  I'll check this later and also run the diags and see what comes up.
As stated earlier in this thread, the array can take a long time to rebuild.  It just seems to be taking a lot longer than I would expect, which could point to other issues.
andyalder (Saggar maker's bottom knocker) Commented:
The diags only produce a report, so it won't do any harm to run them at any time.
dan_ch (Author) Commented:
Attached is a text version of the report.  It looks to me like it is simply rebuilding.  But maybe there is something more in here that I'm not understanding.
ADUReport.txt
dan_ch (Author) Commented:
It is now at 57%, so it is incrementing, which is good.  Just slow!
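
For anyone watching a percentage like this, a rough remaining-time figure can be had by linear extrapolation. The sketch below is only an illustration: the helper name `hours_remaining` is hypothetical, the 30-hour elapsed figure is a placeholder rather than a value from this thread, and it assumes the rebuild continues at the same average rate, which may not hold if read retries or user I/O vary.

```python
def hours_remaining(percent_done: float, elapsed_hours: float) -> float:
    """Linearly extrapolate the hours left in a rebuild.

    Assumes progress continues at the average rate seen so far.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in the range (0, 100]")
    return elapsed_hours * (100 - percent_done) / percent_done


if __name__ == "__main__":
    # Placeholder input: 57% complete after an assumed 30 hours of rebuilding.
    print(f"about {hours_remaining(57, 30):.1f} hours to go")
```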
Gary Case (Retired) Commented:
Yes, as long as it's making progress, just be patient :-)
David (President) Commented:
P.S. Since the system is booted to a host OS, the rebuild is competing with system I/O.  Every task or process you can shut down will make the rebuild run a little faster, as the controller won't have to constantly service application I/O at the same time.
andyalder (Saggar maker's bottom knocker) Commented:
The drive at 2I:1:2 has non-HP firmware, so we can't always see proper stats for it.

The cache battery is not present, which slows rebuilding down, especially if the controller is servicing user I/O at the same time.

Rebuild priority is set to medium; you can change that to high (or simply stop the user I/O, as dlethe said).
dan_ch (Author) Commented:
Thanks, andyalder, this is what I needed.  I always wondered if there was a way to see the health/status of the drives.  The array finished rebuilding overnight!