plug1
Replacing a failing drive in an HP DL380 G9 while it's expanding the array
Having a nightmare with a server that has a failing hard disk, which is dragging the server down to a complete halt. The worst thing is we added 3 drives to the array and started the expansion just before the drive started alerting. This expansion is going to take weeks. Is it possible to remove the failing drive (it's not failed yet or I would just take it out) mid-expansion?
ASKER
Apologies, it's only RAID 5. It went from 5 disks to 8 disks, and all are 1.2 TB disks.
If you have a backup then I would stop it and replace the failing drive.
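For anyone wanting the capacity numbers behind that expansion, here is a rough sketch. It assumes the usual RAID 5 rule that one drive's worth of space goes to parity, and it treats 1.2 TB as 1200 decimal GB. The function name is made up for illustration, and real usable space will be a little lower after controller and filesystem overhead.

```python
def raid5_usable_gb(drive_count, drive_size_gb):
    """Usable RAID 5 capacity: one drive's worth of space holds parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_gb


DRIVE_SIZE_GB = 1200  # 1.2 TB drives, in decimal GB (assumption)

before = raid5_usable_gb(5, DRIVE_SIZE_GB)  # array before the expansion
after = raid5_usable_gb(8, DRIVE_SIZE_GB)   # array once the expansion finishes
print(f"before: {before} GB, after: {after} GB, gained: {after - before} GB")
```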
ASKER
I have backups, but this is preventing any new ones from completing and I don't want to rebuild if I can avoid it. I don't think I can actually stop it, tbh. If I knew I could replace the drive I would just do that.
You can replace a failed or failing disk during a rebuild; the controller keeps track of the releveling process, so it knows how far it has got and does not have to start the process again.
Usually HP has decent RAID controllers which can react smartly to an interruption of the rebuilding process. That means that if you shut down the server, replace the bad drive and start it, the rebuilding process should start up again.
As you don't have many choices here (wait until it finishes without knowing whether it will finish at all, or take the risk knowing you have the backup), I would weigh both options and go with the most appropriate one.
You can also try contacting HP and asking them whether their controller would be OK with interrupting the rebuild.
ASKER
AndyAlder, that's exactly what I hoped, but can you link me to anything so I can verify that's the case, for peace of mind before I pull the drive?
The controller is an HP P440ar with battery-backed cache etc., if that helps anyone.
ASKER
BTW, HP have been a little evasive on the question thus far. I've been in touch constantly, as the RAID controller was initially testing as failed and that's been replaced already. The disk didn't show us where the problem really lay until we kicked off the expansion.
This happened 2 days before I went on holiday, as always, and I'm typing this from a hotel in Tenerife :(
I cannot find proof; however, parity is maintained during expansion because it does it one stripe at a time, so halfway through, some of the array is made of, let's say, 6 drives in RAID 5 and part is made of 9 drives in RAID 5. It uses the write cache to keep track of which stripes it has migrated. Unfortunately, as it uses the cache for expansion, it is not available for normal caching, which can hammer performance.
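To make the stripe-at-a-time idea concrete, here is a toy Python sketch. It is not how the Smart Array firmware works internally; it just shows that if every stripe carries its own XOR parity, a single drive can be lost on either side of the migration point and still be recomputed. The function names, the tiny drive counts (3 growing to 4) and the fixed parity position are all simplifications invented for the example; real RAID 5 rotates parity across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return one stripe: the data blocks followed by their parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_block(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def layout(data, old_width, new_width, migrated):
    """Lay out data blocks as they might look part-way through an expansion.

    The first `migrated` data blocks are already striped across new_width
    drives; the rest still use old_width drives. Each stripe has its own
    parity block, so each stripe tolerates one lost block.
    """
    def chunk(blocks, width):
        per_stripe = width - 1  # one block in every stripe is parity
        return [make_stripe(blocks[i:i + per_stripe])
                for i in range(0, len(blocks), per_stripe)]

    return chunk(data[:migrated], new_width) + chunk(data[migrated:], old_width)


# 12 dummy data blocks, half of them already migrated to the wider layout.
data = [bytes([i] * 4) for i in range(12)]
stripes = layout(data, old_width=3, new_width=4, migrated=6)

failing_drive = 0  # this drive holds a block in both old and new stripes
recoverable = all(rebuild_block(s, failing_drive) == s[failing_drive]
                  for s in stripes)
print([len(s) for s in stripes], recoverable)
```

The point of the sketch is only that redundancy is a per-stripe property, so a half-finished expansion does not by itself leave any stripe unprotected. Whether a particular controller handles a hot swap gracefully mid-expansion is still something to confirm with the vendor.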
ASKER
AndyAlder, when you explain it like that it makes complete sense. Don't think I can convince one of my team to pull the drive until I'm back though, lol. I've emailed HP again, asking the question a little more directly to see if I can get something.
If anyone else wants to add to this it would be appreciated. Help a fellow IT bod calm down on holiday :)
ASKER CERTIFIED SOLUTION
ASKER
Andy, that's pretty much sealed it.. I think the drive is getting pulled. I will report back.
Would still like to see the real report uploaded, or at least a snippet of the naughty disk stats pasted as a comment; it helps others in similar go/no-go situations. Sometimes there's no backup, other times it's just a speed problem.
ASKER
I'll see if my colleagues can post them for you; really appreciate your involvement so far. We sent them to HP, so we should have them.
ASKER
Hi Andy, you were 100% right: no issues at all swapping the disk, and it's now rebuilding and expanding the array at a great speed while running the virtuals on it. Top advice! Much appreciated.
Didn't get those logs though, as I'm still in Tenerife.
How many disks are in the array, and what size is each?