Replacing a drive in a RAID 1+0 array

I have an HP DL380 G6 server with a Smart Array P410i controller. The operating system is installed on a two-disk RAID 0+1 array. One of the hot-swappable disks is generating a predictive failure alert.

I have a replacement drive on hand. Can I just eject the drive that is giving me the error and put the new one in its place?

Or do I need to shut it down, boot it with the SmartStart CD, and break the array before exchanging the disk? If I break the array, how do I specify which disk to use as the primary so I keep all the data on the disk?
Asked by: qvfps

Paul MacDonald (Director, Information Systems) commented:
A RAID 0+1 would require at least 4 disks, so there's a problem right off the bat.

Given that you don't know for sure whether or not the drive is hot-swappable, I'd shut the computer down before attempting a replacement.

David (President) commented:
Never, never, NEVER take an online RAID and degrade it (remove redundancy) to replace a drive that hasn't failed. Murphy's law dictates that the other disk in that RAID 1 pair will take the opportunity to die during the rebuild. That will result in 100% data loss.


Put that new disk in as a hot spare (if you want), then make a full bootable backup.  Then and only then can you safely replace the drive.  Do it with power on.  Out with the old, in with the new in the same slot. It will automatically rebuild.  You have a full backup just in case.

Also, since the other disks all came from the same batch and are all probably past warranty, best practice is to consider buying several replacement drives and keeping a hot spare in there.

David Johnson, CD, MVP (Owner) commented:
It depends on whether the drives are hot-swappable. If they are, remove and replace the bad drive and things will rebuild. Best to do it in a slow period; Black Friday is not a good day. Remember, RAID is NOT a backup; it is a performance and availability resource, and a RAID rebuild stresses every drive, so marginal drives may fail.

If you are unsure and have never done this, hire a professional and watch, or have the pro walk you through it. The money you spend now will enrich your knowledge in the future.

David (President) commented:
The drives are always going to be hot-swappable unless they are physically inside the enclosure or bolted down. ALL SAS, SCSI, Fibre Channel, and SATA disks are hot-swappable from the electronic signaling/power perspective. It is part of the spec.

qvfps (author) commented:
Unfortunately, all the drive bays are full, so I cannot add a new drive as a hot spare. My only options are to do an online swap or break the array.

David Johnson, CD, MVP (Owner) commented:
You have the drive light telling you it is bad. Unplug the carrier, change the drive, and replace the carrier. If you've enabled the RAID controller to send notifications to the system tray, you could open it and see the status; it should say rebuilding.

qvfps (author) commented:
The drive light is still green.   I am receiving events in the system log that the drive is in predictive failure.

David (President) commented:
Don't break the array for a predictive failure replacement when you have old disks. If the disks are beyond factory warranty, it is best to either
 - leave things as they are and wait for a real failure (and keep backups). Remember that disk costs get lower every month, so there is no need to replace a disk just because of a 'predictive' failure.
 - OR replace ALL of the old disks at once with a newer make/model that likely has better performance and higher capacity anyway. Do a full bare-metal backup during a downtime window.

andyalder commented:
You really have no option except ejecting it and putting another one in its place (always a good idea to back up first). There is no advantage in assigning a spare, since you can't copy the pre-failure disk to the spare with that controller. The spare only gets used when the drive fails, and then it's rebuilt from the mirror, which is the same as popping the old disk out and replacing it. If you assign a spare, it'll rebuild onto the spare when you remove a predictive-failure disk, but then when you fit the replacement it'll have to rebuild back to the original disk slot anyway.
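
To make the spare argument concrete, here is a rough toy model that just counts rebuilds for the two approaches. It is only a sketch of the behaviour described above, not the P410i's real firmware logic; the class `ToyMirror`, the function `count_rebuilds`, and the disk names are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyMirror:
    """Toy two-disk mirror (hypothetical model, not HP firmware behaviour)."""
    survivor: str = "good_disk"
    partner: Optional[str] = "prefail_disk"  # disk in the original slot
    spare: Optional[str] = None              # assigned hot spare, if any
    rebuilds: List[str] = field(default_factory=list)

    def pull_partner(self) -> None:
        """Eject the pre-failure disk; an assigned spare is rebuilt from the mirror."""
        self.partner = None
        if self.spare is not None:
            self.rebuilds.append(f"{self.survivor} -> {self.spare}")

    def insert_replacement(self, disk: str) -> None:
        """Fit the new disk in the original slot; it is rebuilt from the mirror."""
        self.partner = disk
        self.rebuilds.append(f"{self.survivor} -> {disk}")


def count_rebuilds(with_spare: bool) -> int:
    """Count rebuilds for a swap of the pre-failure disk, with or without a spare."""
    array = ToyMirror(spare="spare_disk" if with_spare else None)
    array.pull_partner()
    array.insert_replacement("new_disk")
    return len(array.rebuilds)


print("direct swap:", count_rebuilds(with_spare=False))  # 1
print("with spare: ", count_rebuilds(with_spare=True))   # 2
```

Under these assumptions the spare route costs two rebuilds instead of one, and each rebuild is another stretch of heavy reads on the one good disk.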

You should check the firmware level first, as there are false predictive failures on some revs of firmware on some disks. Once pre-failure has been logged, though, it generally remembers that even through firmware upgrades.

Ignore the fact that the ACU shows it as RAID 1+0 with only two disks; it's just a quirk of HP's software. It's actually running a simple RAID 1 algorithm, although the strip size is predefined in the metadata should you add disks to expand it in future.
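
If it helps to see why a "RAID 1+0" with a single mirror pair behaves like plain RAID 1, here is a simplified sketch of striping across mirror pairs. It is an assumption-laden illustration, not HP's actual on-disk layout: the function `locate` is hypothetical and the strip size of 4 blocks is arbitrary.

```python
from typing import Tuple


def locate(block: int, mirror_pairs: int, strip_blocks: int = 4) -> Tuple[int, int]:
    """Map a logical block to (mirror pair index, block offset on that pair).

    Simplified striping model; strip_blocks=4 is an arbitrary assumption.
    """
    strip = block // strip_blocks   # which strip the block falls in
    pair = strip % mirror_pairs     # strips rotate across the mirror pairs
    row = strip // mirror_pairs     # full stripes that come before this strip
    offset = row * strip_blocks + block % strip_blocks
    return pair, offset


# One mirror pair (two disks): every block stays on pair 0 at the same offset,
# so the layout is just a straight mirror.
print([locate(b, mirror_pairs=1) for b in range(6)])

# Two mirror pairs (four disks): strips alternate between the pairs.
print([locate(b, mirror_pairs=2) for b in (0, 4, 8, 12)])
```

With one pair the mapping is the identity, so nothing is really being striped until more disks are added.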

Question has a verified solution.