Solved

What happens when a RAID drive fails?

Posted on 2008-10-01
Last Modified: 2012-06-27
I've never actually had a drive in a RAID array fail in a server. So, suppose one of the drives in the RAID 1 mirror in one of my HP ProLiant ML110s (please don't comment about how the "decent ProLiants start at ML370" - this particular server is for a small network with limited funds) failed. What would happen?

Secondly, if a drive did fail, I assume I'd need to use some tool to determine which drive has failed, and replace it. Would I need to initialize the new drive once it is installed? I assume it needs to be made part of the appropriate RAID array, and that after this stage the data will be rebuilt from the remaining drive?

We are talking hardware RAID here - configured in the RAID configuration pages available when the server boots through the BIOS screens.

Thanks!
Question by:tigermatt
10 Comments
 
LVL 7

Expert Comment

by:aboredman
ID: 22618680
I will assume that the ML110 is not hot-swap.

When a drive fails, you will get a message in the Windows event log (it is important to check it often). Hot-swap machines usually have an indicator on the drive.

You then need to shut down and replace the drive (if it is not hot-swap).

Restart the machine and the RAID array should take care of everything else.

BTW: You will probably see a warning message from the RAID controller during POST in the boot process if a drive is in FAILED or PREDICTIVE FAILURE status.
 
LVL 17

Assisted Solution

by:Andres Perales
Andres Perales earned 125 total points
ID: 22618710
Tiger,
The best option would be for you to call HP support and speak to them directly; every company has different guidance.
I use mostly Dell. When a hard drive fails, that drive's light turns orange/red, and we also get a notification in OpenManage, Dell's server monitoring system that comes with their server line. Normally we just pull the bad drive and insert the new drive, and the rebuild is automatic (on certain servers). On other servers we have been told to shut down the server, pull the bad drive, and bring the system back up, and then it will start to rebuild automatically. All the while we get light indications from the front of the servers, as well as notifications and progress in OpenManage.
 
LVL 3

Expert Comment

by:omic_admin
ID: 22619304
aboredman is right, but do pay attention to which drive has failed! You don't want to take out the one that is working. Non-hot-swap machines need to be shut down before you replace the drive, but it would be best to restart first, go into the RAID BIOS, determine the hard drive ID, and verify the failed drive before replacing it.
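To illustrate the "verify before you pull" check, here is a minimal Python sketch. It assumes you have already noted each bay's status from the RAID BIOS or a vendor tool; the bay names, the status strings, and the function name `drive_to_replace` are made up for the example and are not taken from any HP utility.

```python
from typing import Dict

# Hypothetical status string; a real RAID BIOS or vendor tool uses its own wording.
HEALTHY = "OK"


def drive_to_replace(bay_status: Dict[str, str]) -> str:
    """Return the single bay whose drive is not healthy.

    Raises ValueError when zero or several bays are unhealthy, because in a
    two-drive mirror that means: stop and investigate before pulling anything.
    """
    bad = [bay for bay, status in bay_status.items() if status != HEALTHY]
    if len(bad) != 1:
        raise ValueError(
            f"expected exactly one unhealthy drive, found {len(bad)}: {bad}"
        )
    return bad[0]


# Example: statuses written down from the RAID BIOS screen (illustrative values).
print(drive_to_replace({"bay 0": "OK", "bay 1": "FAILED"}))
```

The point of raising an error instead of guessing is the same as above: if the picture is not clear, do not remove a drive.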
 
LVL 3

Assisted Solution

by:tempter
tempter earned 125 total points
ID: 22621015
Hey mate
As the comments above say, and as I'm sure you've read about RAID, you can generally only lose one drive. It becomes very important to replace the failed drive as soon as possible so that you are not at risk of losing the whole RAID array and all your data.

In your case, any server hardware configured with RAID should have the vendor's RAID software installed on it. Yes, you are right that you can access the RAID configuration when the server starts by pressing a function key, but you can avoid rebooting the server by having the RAID software installed. Dell has OpenManage Server Administrator, IBM has ServeRAID Manager, HP has its own, etc.; every maker has one.

You should install this software for the hardware you have. It can also be set up with hardware management software that can send you SMS alerts, email, or other notifications when something fails. This way you are better informed when something goes wrong, before catastrophe hits. It's better to be safe than sorry.

When you purchase hardware from the big boys (HP, IBM, etc.) you get hardware management software, such as HP Insight Manager (for HP), IBM Director (for IBM), Dell OpenManage (for Dell), etc., which can do the things mentioned above. You can remote control your server, power it off and back on again, and do other things to help you better manage your network.

Hope this helps
Morci
 
LVL 67

Accepted Solution

by:
sirbounty earned 250 total points
ID: 22623335
Hello tigermatt,

Just to clarify - in a RAID 1 (mirror) scenario, you can potentially lose more than one drive.  More accurately, you can lose one mirror.  If you have an array consisting of 3 drives mirrored to 3 other drives, you could lose all 3 in the mirror and the data would still be intact, yet no longer redundant, obviously.
With Compaq/HP equipment, if a drive has truly failed, the drive light will change from green to amber, indicating a hard failure.
However, the thing to watch out for is predictive failures. Compaq diagnostics periodically run functional tests against your drives. If any of those tests fail, the drive is then 'expected' to fail at some point in the near future. That's your indication to get ready to replace it, and to schedule the maintenance properly after hours.
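
The mirror example above (3 drives mirrored to 3 others) can be sketched in a few lines of Python. This is only a toy model under the assumption that each drive has exactly one mirror partner; the function names are made up for illustration.

```python
from typing import List, Tuple

# Each tuple is (primary_ok, mirror_ok) for one mirrored pair of drives.
Pairs = List[Tuple[bool, bool]]


def data_intact(pairs: Pairs) -> bool:
    """Data survives as long as every pair still has at least one good drive."""
    return all(primary or mirror for primary, mirror in pairs)


def fully_redundant(pairs: Pairs) -> bool:
    """The array is redundant only while both drives in every pair are good."""
    return all(primary and mirror for primary, mirror in pairs)


# Lose all 3 drives on the mirror side: data intact, but no redundancy left.
lost_mirror_side = [(True, False), (True, False), (True, False)]
print(data_intact(lost_mirror_side), fully_redundant(lost_mirror_side))

# Lose both drives of a single pair: the data is gone.
lost_one_pair = [(False, False), (True, True), (True, True)]
print(data_intact(lost_one_pair))
```

So it is not the number of failed drives that matters, but whether any drive and its mirror partner have both failed.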

Also, I wouldn't rely on event logs to determine if you have a drive failure - use HP tools. If you have a copy of SmartStart, or have otherwise installed the ACU (Array Configuration Utility), you can monitor information about both the logical and physical layout of your array from there. Even better would be to use SNMP monitoring, if you have that capability, so that an alert is generated when a drive fails and you don't have to be as proactive about manually monitoring the components...
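
As a rough idea of what alert-on-change monitoring does, here is a minimal Python sketch. It does not talk to SNMP or any HP tool; it assumes something else has already collected two snapshots of drive status, and the drive names, status strings, and the function name `status_changes` are hypothetical.

```python
from typing import Dict, List


def status_changes(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return one alert line per drive whose status changed to something other than OK."""
    alerts = []
    for drive, status in sorted(current.items()):
        old = previous.get(drive)
        # Only alert on a change, so a drive that stays FAILED is not re-reported.
        if old != status and status != "OK":
            alerts.append(f"{drive}: {old or 'unknown'} -> {status}")
    return alerts


# Two snapshots taken some time apart (illustrative values only).
before = {"bay 0": "OK", "bay 1": "OK"}
after = {"bay 0": "OK", "bay 1": "PREDICTIVE FAILURE"}
for line in status_changes(before, after):
    print(line)
```

In practice the alert lines would be sent by email or SMS rather than printed, which is the part the monitoring software handles for you.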


Regards,

sirbounty
 
LVL 58

Author Comment

by:tigermatt
ID: 22626729
Thanks to all for your input. I have the "Easy Set-up CD" which came with the server when I bought it - would I find the RAID Array Tools on there?
 
LVL 67

Expert Comment

by:sirbounty
ID: 22626757
See this PAQ: http://www.experts-exchange.com/Hardware/Servers/Q_23028271.html
Info on ACU & CIM/SIM for monitoring...
 
LVL 58

Author Closing Comment

by:tigermatt
ID: 31502154
Thanks guys. This definitely answers things about the actual notification side of things. Could you take a look at http://www.experts-exchange.com/Q_23788449.html for me - I have a few questions still about HP Insight Manager, a piece of software I definitely want to use for RAID monitoring of several of these servers.
 
LVL 58

Author Comment

by:tigermatt
ID: 22644199
Any ideas on http://www.experts-exchange.com/Q_23788449.html as per grading comment? Thanks!