How to swap out a failing SATA drive on an HP server with B110i SATA RAID controller

Greetings,

I have recently taken over an account that has a branch office with 3 older HP ProLiant servers and c. 12 PCs.  One of the servers is an older ProLiant ML110 G6 running Windows Server 2012 R2 Standard with 12 GB RAM.  It is a Domain Controller with no principal applications on it.  It uses an HP Smart Array B110i SATA RAID controller with 2 arrays of SATA drives in RAID-1 config.  The 1st SATA array has 2 x 250 GB drives in Box 1 / Bays (Ports) 1 and 2, and the 2nd SATA array has 2 x 1 TB drives in Box 1 / Bays (Ports) 3 and 4.

We are seeing a message when the server reboots, warning of an imminent failure of one of the hard drives.  I have attached a screenshot of the warnings (boot-time warning from the B110i SATA RAID controller: imminent drive failure).  The failing drive is in Box 1 Bay 2, so it's 1 of the 250 GB drives.  Specifically, it is a model ATA VB0250EAVER, F/W version HPG0.

My question on this is whether this is truly a hot-swappable drive + controller.  The message in the attached screen shot says to make sure to only change out the failing drive when all drives are on.    I'm hoping someone who has worked with this controller is familiar with the proper way to swap out the failing drive.  I've worked with Dell servers for c. 25 years and the process there is beyond simple.  With Dell servers, a hot-swap chassis, and a PERC RAID controller, all I have needed to do is a) offline the failing drive, b) pull the drive, and c) put in the new / replacement drive.  Once the drive is replaced, it starts to rebuild.  With this HP B110i controller, I've read in some articles that the host server should never be shut off when the drives are replaced.  I'm also wondering if there is a similar process to offline the failing drive before you pull it out.  Past that I have zero experience with these controllers, and definitely do not want to pull out the hard drive when the system is running if it is not hot-swappable.

Separately, I've had the client get in touch with HP for out-of-warranty support, but they are looking to charge c. $850 for a single out-of-warranty support call (yow!!!).  So I'm hoping someone can provide advice and a drive-swap procedure from their experience on this before the client has to fork up a serious chunk of change to HP.

Thanks in advance for any assistance on this.

jkirman
Asked by jkirman, Principal

Dr. Klahn, Principal Software Engineer, commented:
Back up the failing array to an external device.  Then back it up again to a different device using different backup software.

Buy three drives to replace into the array, replace both drives in the array and then restore from the backup.  Put the third drive on the shelf and label it "RAID HOT REPLACEMENT FOR SYSTEM blah blah **ONLY**" so that (a) you have it immediately available in case of a future problem and (b) nobody steals it for some other system.

Why both drives?  If one drive is failing, the other one is probably also reaching end of life.  Drives are cheap; human time is not.  In any case, you will want two as-close-to-identical drives in the array and it may be difficult to obtain a drive to match the failing one.
jkirman, Principal (Author), commented:
Dr. Klahn - thank you for your recommendations on backing up to 2 media and then replacing the current RAID-1 drives.  However, your response did not answer my question.  Perhaps I wasn't clear, but when you say replace the drives - that is specifically what I'm asking about.  So for example -

1) I am requesting confirmation that the drives and controller are designed for hot-swap, or do I need to power off the server.  I'm 99% sure this is hot-swap, but would prefer someone who has worked with this already can confirm on that.

2) Do I just pull out the failing drive, or do I need to do something with the Array Configuration Utility or the Smart Storage Administrator CLI to prep the system?

3) As I'm looking at the ACU GUI right now using the Physical View, it shows all 4 drives in the system, with the 250 GB SATA in Box 1 Bay 2 showing a yellow exclamation mark.  When I right-click the drive, I only get options for More Information and View Status Alerts.  There are no options or controls anywhere to e.g. Offline the drive, Deactivate the drive, etc.  Again, I'm coming from a Dell world where the PERC controller and Open Manage software provide a lot of functionality and control on this part, and you can e.g. :

a) Offline a failing drive
b) pull out the drive after it shows Offline
c) put in a replacement drive
d) watch the drive rebuild

and that's basically it.  Here, with the B110i and the ACU GUI, I am not seeing any controls on how to physically prep the system to remove the drive as I would be able to with a Dell system, per a) thru d) above.

So again, since I definitely do not want to screw up the hardware, I'm requesting a true step-by-step description of the physical removal process, possibly working with or using the ACU GUI if that is helpful to the process.

Thank you.

jkirman
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Pull out the failing drive and replace it with a good working drive; it will start the rebuild automatically.

Nothing further for you to do.

I assume you already have good working backups, just in case you need to restore.

andyalder, Saggar maker's framemaker, commented:
Unfortunately the ML110 G6 is not hot-plug and there are no drive LEDs to identify which is which.

You will have to go into the ACU and get the serial number of the predictive-fail disk, then power off, replace it with a new one, power on again, and check that you removed the right one. If you only have a second-hand disk, you may have to zero out the first few blocks with a PC first, since you can't hot-plug it easily.
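
For the second-hand disk case, here is a minimal Python sketch of what "zero out the first few blocks" could look like when run on a separate PC. This is not an HP procedure: the function name, the 512-byte block size, and the default of 2048 blocks are illustrative assumptions, and nothing in this thread states how many blocks the B110i actually needs cleared. Pointing it at a raw device (for example a `/dev/sdX` path on a Linux PC, run with admin rights) destroys whatever is at the start of that disk, so triple-check the path.

```python
import os


def zero_leading_blocks(path, block_count=2048, block_size=512):
    """Overwrite the first block_count blocks of `path` with zeros.

    `path` may be a regular file (for a dry run) or a raw disk device.
    Returns the number of bytes written.
    """
    total = block_count * block_size
    # "r+b" opens for in-place binary update without truncating the target.
    with open(path, "r+b") as target:
        target.seek(0)
        written = target.write(b"\x00" * total)
        target.flush()
        # Ask the OS to push the zeros to the device before returning.
        os.fsync(target.fileno())
    return written
```

A dry run against a scratch file first (any file larger than `block_count * block_size` bytes) is a cheap way to confirm it touches only the leading bytes before trying it on a real disk.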

Wakeup, Specialist 1, commented:
Some good information/instructions here:
https://support.hpe.com/hpsc/doc/public/display?docId=c02279604

But Andy's right, not hot-pluggable.
jkirman, Principal (Author), commented:
Thanks for your collective thoughts.  From everything you've written, looks like the approach should be:

- make a note of the serial number of the failing drive per the ACU interface  (already done)
- shut down the server
- pull out the failing drive from Box 1, Bay 2, as per the S/N
- make a note of the replacement drive's model and serial number
- install the replacement drive
- power on the server

Now here's the part I'm assuming, since I've never dealt with this specific controller:

- Let the system boot normally
- After booting and login, go into the ACU and I'll see the RAID-1 container rebuilding

I've read through a few online postings by admins with similar issues with this controller, and get the impression that I should NOT go into the B110i BIOS at boot time or interrupt the boot process, but rather should simply let everything start up on its own, once I've replaced the failing drive.

Appreciate any confirmations on the above / latest post in advance, and thanks again for the info and thoughts to date.

jkirman
andyalder, Saggar maker's framemaker, commented:
The S/N on the paper label may not be the same as the electronic one the ACU displays, so check in the ACU afterwards to make sure you replaced the right one.

Yes, use the ACU rather than the BIOS utility to check it is rebuilding as it is a fakeRAID controller and does not rebuild until the OS is running. There is no option to take it offline like there is with Dell/LSI controllers, nor is there an option to erase any foreign config on the disk.
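
The "check afterwards" step can be sketched as a simple before/after comparison of the serial numbers the ACU shows for each bay. This is only an illustration in Python: `check_swap` is a hypothetical helper, and the serials are typed in by hand from the ACU, not read from the controller.

```python
def check_swap(before, after, failing_bay):
    """Compare per-bay serial numbers noted before and after the swap.

    `before` and `after` map bay number -> serial number as shown in the ACU.
    Returns a list of problems; an empty list means only the failing bay changed.
    """
    problems = []
    for bay, old_serial in before.items():
        new_serial = after.get(bay)
        if new_serial is None:
            problems.append(f"bay {bay}: no drive reported after the swap")
        elif bay == failing_bay and new_serial == old_serial:
            problems.append(f"bay {bay}: failing drive is still installed")
        elif bay != failing_bay and new_serial != old_serial:
            problems.append(f"bay {bay}: healthy drive was replaced by mistake")
    return problems
```

For the server in this thread, `failing_bay` would be 2, and a clean result would show a new serial in bay 2 while bays 1, 3 and 4 are unchanged.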
jkirman, Principal (Author), commented:
Andyalder, thanks for the additional details regarding the B110i controller.  Seems that these embedded / software controllers have mostly caveats attached to them, as it's somewhat mind-blowing for me to hear that you need to be in the O/S for the RAID-1 container to rebuild.  Then again, for the last couple of decades I've been a creature of habit with Dell servers and PERC hardware controllers and have never used, nor even worked with, a software RAID controller.  I have also religiously avoided the S110 / S300 controllers and the like, as I have read mostly complaints about the performance of software RAID controllers compared to e.g. the traditional PERC hardware controller series.  I'm assuming they (s/w controllers) are recommended only for very budget-constrained situations.  Will hopefully be able to address the drive replacement in the next week or so and will advise of my experience with the ACU and the RAID-1 container rebuild.
jkirman, Principal (Author), commented:
Thanks for your assistance and detailed information.  Keys here were that 1) the controller is not hot-plug - in its current form - and 2) that the RAID-1 rebuild will not take place until you boot into the O/S.  FWIW in my readings I found that you could upgrade the B110i to a hot-plug controller by adding a License Key, available from Amazon for c. $75, but since a straight replacement will work fine, I wouldn't be experimenting with hot swap options.

Cheers and many thanks again.

jkirman