Avatar of plug1
plug1

asked on

Replacing a failing drive in an HP DL380 G9 while it's expanding the array.

Having a nightmare with a server which has a failing hard disk that is dragging the server down to a complete halt. The worst thing is we added 3 drives to the array and started the expansion just before the drive started alerting. This expansion is going to take weeks. Is it possible to remove the failing drive (it's not failed yet or I would just take it out) mid-expansion?
Avatar of Sajid Shaik M
Sajid Shaik M

What is the RAID level?
How many disks are in the array, and what size is each?
Avatar of plug1

ASKER

Apologies, it's only RAID 5. It went from 5 disks to 8 disks, and all are 1.2 TB disks.
If you have a backup then I would stop it and replace the failing drive.
Avatar of plug1

ASKER

I have backups, but this is preventing any new ones from completing and I don't want to rebuild if I can avoid it. I don't think I can actually stop it, tbh. If I knew I could replace the drive I would just do that.
Avatar of Member_2_231077
Member_2_231077

You can replace a failed or failing disk during a rebuild. The controller keeps track of the releveling process, so it knows how far it has got and does not have to start the process again.
Usually HP has decent RAID controllers which can react smartly to an interruption of the rebuilding process. That means that if you shut down the server, replace the bad drive and start it up again, the rebuilding process should start up again.
As you don't have many choices here (wait until it finishes without knowing whether it will finish at all, or take the risk knowing you have the backup), I would weigh both options and go with the most appropriate one.
You can also try contacting HP and asking them if their controller would be ok with interrupting the rebuild.
Avatar of plug1

ASKER

AndyAlder, that's exactly what I hoped, but can you link me to anything so I can verify that's the case, for peace of mind before I pull the drive?

The controller is an HP P440ar with battery-backed cache etc., if that helps anyone?
Avatar of plug1

ASKER

BTW, HP have been a little evasive on the question thus far. I've been in touch constantly, as the RAID controller was initially testing as failed and that's been replaced already. The disk didn't show us where the problem really lay until we kicked off the expansion.

This happened 2 days before I went on holiday, as always, and I'm typing this from a hotel in Tenerife :(
I cannot find proof. However, parity is maintained during expansion because the controller works one stripe at a time, so halfway through, part of the array is made of, let's say, 6 drives in RAID 5 and part is made of 9 drives in RAID 5. It uses the write cache to keep track of which stripes it has migrated. Unfortunately, because the cache is being used for the expansion, it is not available for normal caching, which can hammer performance.
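For anyone who wants to see the stripe-at-a-time idea rather than take it on trust, here is a toy Python model of it. It is only a sketch under stated assumptions and not HP's firmware: parity sits in a fixed last position instead of rotating, the array is modelled logically rather than block-for-block in place, and every function name is made up for illustration. It uses the 5-to-8 disk numbers from this thread and checks that, at every point in the migration, the data can still be read back with any single disk missing.

```python
from functools import reduce
from operator import xor


def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [reduce(xor, data_blocks, 0)]


def build(data, n_disks):
    """Lay data out as stripes of (n_disks - 1) data blocks plus parity."""
    width = n_disks - 1
    assert len(data) % width == 0, "toy model: no partial stripes"
    return [make_stripe(data[i:i + width]) for i in range(0, len(data), width)]


def read_stripe(stripe, lost_disk):
    """Return the stripe's data, rebuilding the lost disk's block from the rest."""
    blocks = list(stripe)
    if lost_disk < len(blocks):  # narrower old stripes do not use the new disks
        survivors = [b for i, b in enumerate(blocks) if i != lost_disk]
        blocks[lost_disk] = reduce(xor, survivors, 0)
    return blocks[:-1]  # drop the parity block


def read_mid_expansion(data, n_old, n_new, migrated, lost_disk):
    """Read everything back after `migrated` new-width stripes have been written."""
    old_w, new_w = n_old - 1, n_new - 1
    watermark = migrated * new_w      # first data block not yet migrated
    first_old = watermark // old_w    # first old stripe still needed
    new_part = build(data[:watermark], n_new)
    old_part = build(data, n_old)[first_old:]

    out = []
    for stripe in new_part:
        out += read_stripe(stripe, lost_disk)
    old_data = []
    for stripe in old_part:
        old_data += read_stripe(stripe, lost_disk)
    # Skip blocks in the first old stripe that were already migrated.
    return out + old_data[watermark - first_old * old_w:]


def survives_any_single_failure(data, n_old, n_new):
    """True if the data is readable at every watermark with any one disk lost."""
    total_new_stripes = len(data) // (n_new - 1)
    for migrated in range(total_new_stripes + 1):
        for lost_disk in range(n_new):
            if read_mid_expansion(data, n_old, n_new, migrated, lost_disk) != data:
                return False
    return True


# 28 data blocks divide evenly into 4-wide (5 disk) and 7-wide (8 disk) stripes.
sample = list(range(1, 29))
print(survives_any_single_failure(sample, 5, 8))
```

The only state the model needs in order to read correctly is the watermark, which corresponds to the record of migrated stripes described above.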
Avatar of plug1

ASKER

AndyAlder, when you explain it like that it makes complete sense. Don't think I can convince one of my team to pull the drive until I'm back though, lol. I've emailed HP again, asking the question a little more directly, to see if I can get something.

If anyone else wants to add to this it would be appreciated. Help a fellow IT bod calm down on holiday :)
ASKER CERTIFIED SOLUTION
Avatar of Member_2_231077
Member_2_231077

This solution is only available to members.
Avatar of plug1

ASKER

Andy, that's pretty much sealed it.. I think the drive is getting pulled. I will report back.
Would still like to see the real report uploaded, or at least a snippet of the naughty disk stats pasted as a comment. It helps others in similar go/no-go situations. Sometimes there's no backup, other times it's just a speed problem.
Avatar of plug1

ASKER

I'll see if my colleagues can post them for you. Really appreciate your involvement so far. We sent them to HP, so we should have them.
Avatar of plug1

ASKER

Hi Andy, you were 100% right. No issues at all swapping the disk, and it's now rebuilding and expanding the array at a great speed while running the virtuals on it. Top advice! Much appreciated.

Didn't get those logs though, as I'm still in Tenerife.