How to swap out a failing SATA drive on an HP server with B110i SATA RAID controller

Last Modified: 2018-09-16
Greetings,

I have recently taken over an account that has a branch office with 3 older HP ProLiant servers and c. 12 PCs.  One of the servers is an older ProLiant ML110 G6 running Windows Server 2012 R2 Standard with 12 GB RAM.  It is a Domain Controller with no principal applications on it.  It uses an HP Smart Array B110i SATA RAID controller with 2 arrays of SATA drives in RAID-1 config.  The 1st SATA array has 2 x 250 GB drives in Box 1 / Bays (Ports) 1 and 2, and the 2nd SATA array has 2 x 1 TB drives in Box 1 / Bays (Ports) 3 and 4.

We are seeing a warning message when the server is rebooting, warning of an Imminent Failure of one of the hard drives.  I have attached a screenshot of the warnings (boot-time warning from the B110i SATA RAID controller: imminent drive failure).  The failing drive is in Box 1 Bay 2, so it's 1 of the 250 GB drives.  Specifically it is a model ATA VB0250EAVER, F/W version HPG0.

My question on this is whether this is truly a hot-swappable drive + controller.  The message in the attached screen shot says to make sure to only change out the failing drive when all drives are on.    I'm hoping someone who has worked with this controller is familiar with the proper way to swap out the failing drive.  I've worked with Dell servers for c. 25 years and the process there is beyond simple.  With Dell servers, a hot-swap chassis, and a PERC RAID controller, all I have needed to do is a) offline the failing drive, b) pull the drive, and c) put in the new / replacement drive.  Once the drive is replaced, it starts to rebuild.  With this HP B110i controller, I've read in some articles that the host server should never be shut off when the drives are replaced.  I'm also wondering if there is a similar process to offline the failing drive before you pull it out.  Past that I have zero experience with these controllers, and definitely do not want to pull out the hard drive when the system is running if it is not hot-swappable.

Separately, I've had the client get in touch with HP for out-of-warranty support, but they are looking to charge c. $850 for a single out-of-warranty support call (yow!!!).  So I'm hoping someone can provide advice and a drive-swap procedure from their experience on this before the client has to fork up a serious chunk of change to HP.

Thanks in advance for any assistance on this.

jkirman

Dr. Klahn, Principal Software Engineer
CERTIFIED EXPERT

Commented:
Back up the failing array to an external device.  Then back it up again to a different device using different backup software.

Buy three drives to replace into the array, replace both drives in the array and then restore from the backup.  Put the third drive on the shelf and label it "RAID HOT REPLACEMENT FOR SYSTEM blah blah **ONLY**" so that (a) you have it immediately available in case of a future problem and (b) nobody steals it for some other system.

Why both drives?  If one drive is failing, the other one is probably also reaching end of life.  Drives are cheap; human time is not.  In any case, you will want two as-close-to-identical drives in the array and it may be difficult to obtain a drive to match the failing one.
jkirman, Principal

Author

Commented:
Dr. Klahn - thank you for your recommendations on backing up to 2 media and then replacing the current RAID-1 drives.  However, your response did not answer my question.  Perhaps I wasn't clear, but when you say replace the drives - that is specifically what I'm asking about.  So for example -

1) I am requesting confirmation that the drives and controller are designed for hot-swap, or do I need to power off the server.  I'm 99% sure this is hot-swap, but would prefer someone who has worked with this already can confirm on that.

2) Do I just pull out the failing drive, or do I need to do something with the Array Configuration Utility or the Smart Storage Administrator CLI to prep the system?

3) As I'm looking at the ACU GUI right now using the Physical View, it shows all 4 drives in the system, with the 250 GB SATA in Box 1 Bay 2 showing a yellow exclamation mark.  When I right-click the drive, I only get options for More Information and View Status Alerts.  There are no options or controls anywhere to e.g. Offline the drive, Deactivate the drive, etc.  Again, I'm coming from a Dell world where the PERC controller and OpenManage software provide a lot of functionality and control on this part, and you can e.g.:

a) Offline a failing drive
b) pull out the drive after it shows Offline
c) put in a replacement drive
d) watch the drive rebuild

and that's basically it.  Here, with the B110i and the ACU GUI, I am not seeing any controls on how to physically prep the system to remove the drive as I would be able to with a Dell system, per a) thru d) above.

So again, since I definitely do not want to screw up the hardware, I'm requesting a true step-by-step description of the physical removal process, possibly working with or using the ACU GUI if that is helpful to the process.

Thank you.

jkirman
Andrew Hancock (VMware vExpert PRO / EE Fellow), VMware and Virtualization Consultant
CERTIFIED EXPERT
Fellow
Expert of the Year 2017
Commented:
This problem has been solved!
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
Wakeup, Specialist 1

Commented:
Some good information/instructions here:
https://support.hpe.com/hpsc/doc/public/display?docId=c02279604

But Andy's right, not hot-pluggable.
jkirman, Principal

Author

Commented:
Thanks for your collective thoughts.  From everything you've written, looks like the approach should be:

- make a note of the serial number of the failing drive per the ACU interface  (already done)
- shut down the server
- pull out the failing drive from Box 1, Bay 2, as per the S/N
- make a note of the replacement drive's model and serial number
- install the replacement drive
- power on the server

Now here's the part I'm assuming, since I've never dealt with this specific controller:

- Let the system boot normally
- After booting and login, go into the ACU and I'll see the RAID-1 container rebuilding
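
For the "confirm which bay is flagged" and "confirm the array state after boot" parts of the list above, here is a minimal Python sketch that picks out any drive whose status is not OK from saved status text.  The line format and the SAMPLE text are assumptions modeled on the per-drive status lines HP's array CLI typically prints; they are not captured from this server, and the B110i's actual output may differ.  The function names are hypothetical.

```python
import re

# Assumed line shape, e.g.:
#   physicaldrive 1I:1:2 (port 1I:box 1:bay 2, 250 GB): Predictive Failure
# This is an illustrative pattern, not verified against B110i output.
PD_LINE = re.compile(
    r"physicaldrive\s+(?P<id>\S+)\s+"
    r"\(port\s+(?P<port>[^:]+):box\s+(?P<box>\d+):bay\s+(?P<bay>\d+),"
    r"\s*(?P<size>[^)]+)\):\s*(?P<status>.+)$"
)


def parse_drive_status(text):
    """Return one dict per physical-drive status line found in text."""
    drives = []
    for line in text.splitlines():
        match = PD_LINE.search(line.strip())
        if match:
            drives.append({
                "id": match.group("id"),
                "box": int(match.group("box")),
                "bay": int(match.group("bay")),
                "size": match.group("size").strip(),
                "status": match.group("status").strip(),
            })
    return drives


def drives_needing_attention(drives):
    """Keep only the drives whose reported status is anything other than OK."""
    return [d for d in drives if d["status"] != "OK"]


# Hypothetical sample mirroring the layout described in this thread:
# 2 x 250 GB in bays 1-2, 2 x 1 TB in bays 3-4, bay 2 flagged.
SAMPLE = """\
   physicaldrive 1I:1:1 (port 1I:box 1:bay 1, 250 GB): OK
   physicaldrive 1I:1:2 (port 1I:box 1:bay 2, 250 GB): Predictive Failure
   physicaldrive 1I:1:3 (port 1I:box 1:bay 3, 1 TB): OK
   physicaldrive 1I:1:4 (port 1I:box 1:bay 4, 1 TB): OK
"""

if __name__ == "__main__":
    for d in drives_needing_attention(parse_drive_status(SAMPLE)):
        print(f"Box {d['box']} Bay {d['bay']} ({d['size']}): {d['status']}")
```

Running the same check on status text saved before shutdown and again after the reboot would give a simple before/after record of which bay was flagged, alongside the serial numbers noted from the ACU.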

I've read through a few online postings by admins with similar issues with this controller, and get the impression that I should NOT go into the B110i BIOS at boot time or interrupt the boot process, but rather should simply let everything start up on its own, once I've replaced the failing drive.

Appreciate any confirmations on the above / latest post in advance, and thanks again for the info and thoughts to date.

jkirman
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
jkirman, Principal

Author

Commented:
Andyalder, thanks for the additional details regarding the B110i controller.  Seems that these embedded / software controllers have mostly caveats attached to them, as it's somewhat mind-blowing for me to hear that you need to be in the O/S for the RAID-1 container to rebuild.  Then again, for the last couple of decades I've been a creature of habit with Dell servers and PERC hardware controllers and have never used, nor even worked with, a software RAID controller.  I have also religiously avoided the S110 / S300 controllers and the like, as I have read mostly complaints about the performance of software RAID controllers compared to e.g. the traditional PERC hardware controller series.  I'm assuming they (s/w controllers) are recommended only for very budget-constrained situations.  Will hopefully be able to address the drive replacement in the next week or so and will advise of my experience with the ACU and the RAID-1 container rebuild.
jkirman, Principal

Author

Commented:
Thanks for your assistance and detailed information.  Keys here were that 1) the controller is not hot-plug - in its current form - and 2) that the RAID-1 rebuild will not take place until you boot into the O/S.  FWIW in my readings I found that you could upgrade the B110i to a hot-plug controller by adding a License Key, available from Amazon for c. $75, but since a straight replacement will work fine, I wouldn't be experimenting with hot swap options.

Cheers and many thanks again.

jkirman
