Solved

How to tell what caused a RAID failure?

Posted on 2013-05-13
Last Modified: 2013-06-24
I have an HP ProLiant server. In case of a RAID failure, how do I find out what caused it? From what I know, there are several possibilities that could cause a RAID failure:

- 2 hard drives failed
- RAID controller failed
- RAID configuration corruption

How do I identify each of the above situations? And are there any other possibilities?
Question by:okamon
16 Comments
 
LVL 47

Expert Comment

by:David
ID: 39162491
The RAID controller's event logs are a good start. Are you using HP's Smart Array? If so, download their software to look at the logs.
 

Author Comment

by:okamon
ID: 39162838
Yes, it's HP Smart Array. How do I access the log if I cannot even boot into Windows?
 
LVL 56

Expert Comment

by:andyalder
ID: 39163383
Boot the SmartStart CD (or Intelligent Provisioning in BIOS), run ADU/ACU, and then upload adureport.txt as an attachment to the thread and we will diagnose it, or learn to read the report yourself.

 

Author Comment

by:okamon
ID: 39163567
Thanks. Besides using the SmartStart CD, is there a way to tell in the BIOS, or in the RAID controller settings?
 
LVL 47

Expert Comment

by:noxcho
ID: 39163791
If the RAID controller has problems or is bad, then you will simply not be able to get into the RAID configuration utility. That should be a sign that the problem could be with the RAID controller and not the RAID set or a drive.
 
LVL 56

Expert Comment

by:andyalder
ID: 39164126
ORCA/BIOS isn't very powerful; it really has to be the ACU/ADU, but with Gen8 that's built into the boot ROM anyway.
 

Author Comment

by:okamon
ID: 39164765
Thanks.

>>If the RAID Controller has problems or it is bad then you will simply not have a chance to log into RAID Configuration utility

What if the RAID configuration is corrupted? Can I still get into the utility? If yes, how do I tell that it's corrupted?
 
LVL 47

Expert Comment

by:David
ID: 39164791
If the RAID configuration is corrupt (which happens), then problem solved. The root cause was the controller, not HDD failure.
 
LVL 56

Accepted Solution

by:
andyalder earned 600 total points
ID: 39164795
If the controller is bad, you can simply put the disks on another one of the same or a later generation, since all HP Smart Array controllers use the same metadata. It's pretty hard to corrupt the configuration since it's stored on every disk, although taking the disks out and shuffling them can confuse it.

The ADU report is only a point-in-time snapshot, though, so if you have a dual disk failure it won't tell you which disk failed first. For that you would need a previous report, or you may get enough info out of the integrated maintenance log stored on the motherboard, which is again read by SmartStart diagnostics.
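To illustrate why a previous report matters: a minimal Python sketch, assuming you have already pulled per-drive status out of several saved reports by hand. The data layout, bay names, and status strings here are hypothetical and are not the actual adureport.txt format.

```python
from datetime import datetime

def first_failures(snapshots):
    """Map each drive to the time of the earliest snapshot where it was not OK.

    `snapshots` is a list of (taken_at, {drive: status}) pairs.
    This layout is an assumption for illustration only.
    """
    first_seen = {}
    # Walk the snapshots oldest first so the first hit per drive is the earliest.
    for taken_at, statuses in sorted(snapshots, key=lambda s: s[0]):
        for drive, status in statuses.items():
            if status != "OK" and drive not in first_seen:
                first_seen[drive] = taken_at
    return first_seen

# Hypothetical statuses copied from three saved reports.
snapshots = [
    (datetime(2013, 5, 1), {"bay1": "OK", "bay2": "OK"}),
    (datetime(2013, 5, 10), {"bay1": "OK", "bay2": "FAILED"}),
    (datetime(2013, 5, 13), {"bay1": "FAILED", "bay2": "FAILED"}),
]

failed = first_failures(snapshots)
earliest = min(failed, key=failed.get)  # drive that went bad first
```

With only the last snapshot, both bays simply show as failed; the earlier ones are what establish the order.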
 
LVL 17

Expert Comment

by:Gerald Connolly
ID: 39166306
>>It's pretty hard to corrupt the configuration since it's stored on every disk although taking them out and shuffling them can confuse it.

When I worked in HP Presales we always used to tell people that you could take them out and juggle them, because you could!   :-)
 
LVL 47

Expert Comment

by:David
ID: 39166351
It is hard, but it happens, especially in multiple failure scenarios.  If somebody wants reliability bordering on infallible, then shift the decimal point over to the left in terms of cost ;)
 

Author Comment

by:okamon
ID: 39175885
dlethe, I am not sure what you mean here:

>>If the RAID configuration is corrupt, (which happens), then problem solved.

So how do I tell if it's corrupt? Is it just like what noxcho mentioned, that I wouldn't even be able to get into the utility?
 
LVL 47

Assisted Solution

by:David
David earned 600 total points
ID: 39175989
Signs of it being corrupt are basically: "no configuration found" messages; grossly incorrect settings, like it saying you have a 37-disk RAID 3; or error messages saying the configuration is corrupt or invalid.
 
LVL 47

Assisted Solution

by:noxcho
noxcho earned 600 total points
ID: 39176173
The sign will be "no bootable device" if the system boots from this RAID. If the RAID is a data drive, then it will not be accessible from Windows or will throw errors.
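The symptoms mentioned in this thread can be put together as a rough first-pass triage. This is only a sketch in Python: the function name and inputs are made up for illustration, and the two-drive threshold assumes a RAID level that tolerates a single drive failure.

```python
def likely_cause(can_open_raid_utility, config_looks_valid, failed_drives):
    """Rough triage from the symptoms discussed in this thread.

    All three inputs are things you observe yourself; nothing is read
    from the controller. Treat the result as a hint, not a diagnosis.
    """
    if not can_open_raid_utility:
        # Cannot even get into the RAID configuration utility.
        return "suspect controller"
    if not config_looks_valid:
        # "No configuration found", absurd settings, or corrupt/invalid errors.
        return "suspect corrupt configuration"
    if failed_drives >= 2:
        # Assumes the array only survives one failed drive.
        return "suspect multiple drive failure"
    return "inconclusive: check controller event logs"

print(likely_cause(True, False, 0))
```

Whatever it returns, the controller event logs and the ADU report are still the places to confirm it.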
 

Author Comment

by:okamon
ID: 39192539
So how do I repair it? Will I lose all my data?
 
LVL 47

Expert Comment

by:David
ID: 39192654
There is no way to give you a definitive answer in all cases without having somebody run some analysis. I doubt anybody will do that for free, as it requires talent and software that you don't have, which means sending the drives off and paying for a forensic storage expert to look at them.

It is more money than you want to pay. Trust me.

If you want something that will catch most problems with a high degree of confidence, then look at the controller event logs, as that is good enough for most people.