
Adaptec RAID - rebuilding taking a long time

smellystudent
Last Modified: 2012-06-22
I have an Adaptec 3805 controller card in a Viglen server, with a RAID 1 array consisting of two 150 GB SATA drives.

One of the drives was throwing up occasional read errors, so I marked it as faulty and replaced it. The new drive was recognised, and showed up in Adaptec Storage Manager with a status of 'Rebuilding'.

There was a flurry of activity on the two drives for a couple of hours, as expected, but that has now stopped and the drive appears normal: no red status light, and occasional activity in conjunction with its partner.

However, Adaptec Storage Manager still shows it as rebuilding, with the array degraded. It has been this way for five days now.

How can I tell what is going on?
asm.gif

Gary ColtharpSr. Systems Engineer

Commented:
What does the logical view look like? It should show you PV membership and whether or not your replaced drive is listed as part of the array.

I am confused because your config shows a hot spare. If you marked a drive as faulty, the controller should have used the hot spare automatically and rebuilt the array.

Author

Commented:
RAID1 contains Slot 0 and Slot 1.
RAID5 contains Slots 2-5.

The hot spare seems to be marked as part of the RAID 5 array, rather than being available for any logical device to use.
asm2.gif
President
CERTIFIED EXPERT
Top Expert 2010
Commented:
The problem is that the disk you left in the system has a large number of blocks that aren't readable. In the worst case it can take up to 60 seconds PER BLOCK to do a recovery. So do the math: the rebuild could take a lot longer than it already has.
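To put rough numbers on "do the math", here is a minimal sketch of the worst-case arithmetic. The 60 seconds per block is the figure quoted above; the block counts are purely hypothetical, since nobody in this thread knows how many unreadable blocks the surviving drive actually has.

```python
# Worst-case recovery time if every unreadable block takes the full timeout.
# 60 s/block is the figure quoted in this thread; the block counts below
# are hypothetical examples, not measurements from the poster's drive.
SECONDS_PER_BLOCK_WORST_CASE = 60


def worst_case_recovery_hours(unreadable_blocks: int,
                              seconds_per_block: float = SECONDS_PER_BLOCK_WORST_CASE) -> float:
    """Return the upper bound, in hours, spent recovering unreadable blocks."""
    return unreadable_blocks * seconds_per_block / 3600


if __name__ == "__main__":
    for blocks in (100, 1_000, 10_000):
        hours = worst_case_recovery_hours(blocks)
        print(f"{blocks:>6} unreadable blocks -> up to {hours:,.1f} hours "
              f"({hours / 24:.1f} days)")
```

By the same arithmetic, five days (120 hours) corresponds to roughly 7,200 blocks if each one hit the full 60-second worst case, so a multi-day rebuild is plausible on a drive with many bad blocks.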

What you SHOULD have done is kick off a RAID consistency repair first, before pulling the drive. It performs the same kind of block-by-block check, and would have ensured that no blocks were in need of repair, but with the redundant copy of the data still available, so it would not have had to go into deep recovery.

The reason the hot spare is still marked as a hot spare is because it is actively rebuilding.

Your action item is to just let it run, but MAKE SURE YOU HAVE A FULL BACKUP.   The most stressful time for disks is during a rebuild, so if there is going to be a catastrophic failure, it is going to happen during this rebuild.

Don't cycle the power; let it run. Moving forward, get yourself a better RAID controller. That Adaptec controller is low end, and a better controller with battery-backed cache would typically have done the rebuild in well under a day.
