Solved

Server down - Intel Server Board S3420GPLC - RAID 10 failed. How to determine which drive is bad

Posted on 2013-10-25
Last Modified: 2013-10-29
How do I figure out which drive is bad, or whether both drives are bad? The server will not boot, which surprises me. I've never had a RAID 10 go down hard. If I hit "Control I" when booting, I can see it is complaining about drives 0 and 1. I'm running the UEFI utility for this server board to collect info. I do not see any helpful info in these files except when the server went down.

Thank you.....I need help.

fs0:    
sysinfo.efi
sysinfo-log.txt
PCI-log.txt
Question by:Joemt
22 Comments
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39602548
Get the trial of HDTune's Pro version from http://www.hdtune.com and install it on a working PC. Either use the CD/DVD drive's cabling or add another SATA connection, and test each drive individually with HDTune. Look on the Health tab; the key is the data value of Reallocated Sector Count.
Before you start, mark the locations of the drives!
With any luck at all, we'll find we have one good pair so we can rebuild.
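
For anyone who would rather script the same check, here is a minimal sketch. It is not HDTune: it assumes the SMART attribute table has been saved as text in the column layout that smartmontools' `smartctl -A` prints, and the sample rows and the "anything above zero is suspect" rule are illustrative assumptions, not readings from the drives in this thread.

```python
# Hypothetical sample in the column layout printed by `smartctl -A`.
# The numbers are invented for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       24
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       8123
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def raw_values(table):
    """Map SMART attribute ID -> raw value for every attribute row."""
    values = {}
    for line in table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns;
        # the header row ("ID# ...") is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


def reallocated_sectors(table):
    """Raw value of attribute 5 (Reallocated Sector Count), or None."""
    return raw_values(table).get(5)


def looks_suspect(table):
    """Assumed rule of thumb: any reallocated sector is a warning sign."""
    count = reallocated_sectors(table)
    return count is not None and count > 0


print(reallocated_sectors(SAMPLE), looks_suspect(SAMPLE))
```

Reading the attribute table is a read-only operation, which matters when the goal is to recover data from a marginal drive.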
 
LVL 20

Expert Comment

by:SelfGovern
ID: 39602617
Are the drives you're using certified for RAID environments?  You'll find lots of tales of woe here from people who ran the cheaper, non-RAID drives in RAID configurations, where all works well until errors in one drive reveal fatal errors in others (that had been previously masked).  If that is your situation, your chance of recovery is not good without extraordinary measures.

Use only RAID-certified drives in RAID configurations.
 
LVL 30

Expert Comment

by:pgm554
ID: 39603252
RAID 10?

How many drives all together?

What type of drives (make and model number)?
 

Author Comment

by:Joemt
ID: 39605409
The drives are 250 Seagate Constellation drives (server RAID drives). This is a RAID 10 with 4 drives. When I hit Control I to enter the RAID array utility, it shows drives 0 and 1 in red as having issues. What I cannot tell is whether that means both drives are bad or that the pair is just having an issue. RAID 10 is not supposed to go down if one drive of a pair is bad. I've never seen both drives of a pair suddenly fail.

How is it that a company like Intel offers RAID capabilities on its server boards but no real utility for managing the RAID array and isolating faults? Is there really no way to determine which drive failed with a utility provided by Intel?
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39605494
I hate to say it, but no.
Read my first post.
 

Author Comment

by:Joemt
ID: 39605689
I plan on doing that today. I also want to propose that the client upgrade to an LSI RAID controller and 4 new 1 TB Seagate Constellation ES3 drives (SATA).

Can anyone recommend a reasonably priced LSI controller for RAID 10?

Thank you
 
LVL 30

Expert Comment

by:pgm554
ID: 39606179
Intel server RAID is nothing but XOR on a chip, so you have found out the hard way.

I wouldn't buy any controller that didn't have cache or a coprocessor on it.

http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=1617531&csid=_61
 

Author Comment

by:Joemt
ID: 39606434
Yeah, I knew, but sometimes it is difficult to get a customer to understand when you are providing a bid. I try to bid an LSI controller in all servers.

I found the bad drive: PD1. The RAID utility states the array is not bootable. Any ideas as to why? I am currently running the long Seagate tools test on PD0 to see if there are errors.

A couple of questions: 1. Do I dare boot the server with 3 drives? 2. When I install a new drive, will it auto-rebuild?
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39606756
Generally, you need to hit CTRL-I during boot to get to the Intel RAID BIOS and add the new drive as a spare so it will rebuild.
 

Author Comment

by:Joemt
ID: 39606788
Well, the second drive has failed the long generic drive test. What are the chances of that happening? Both drives are previous Seagate warranty replacement (refurbished) drives.
 
LVL 42

Accepted Solution

by:
Davis McCarn earned 500 total points
ID: 39606907
I prefer to test with HDTune, which has a read scan tab that won't try to write anything (which is a KEY ITEM when attempting data recovery). Unfortunately, virtually all of the manufacturers' diags will write, which can be the straw that breaks .......
Between it and the Health tab, where reallocations are the sign of impending disaster, I can judge quickly what shape the drive is in.
It's a shame there is no version which works on RAID.
While I like Seagate very much, I have also received a "warranty replacement" with 65,000+ hours to replace a drive that had less than 1,000.
 

Author Comment

by:Joemt
ID: 39606931
I suspected both had failed; otherwise the server would still be running, and the RAID utility called out both drives.

We have good backups.
 
LVL 30

Expert Comment

by:pgm554
ID: 39607041
Are you running Seatools?
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39607051
Don't you have four drives?
It can't be RAID 10 unless you do!
 

Author Comment

by:Joemt
ID: 39607099
Yes I do, but it was both drives on one side of the RAID 10. RAID 10 has 2 pairs of RAID 1 drives that are then striped.
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39608355
So you lost one of the RAID 0s, or was it the same side of each of the RAID 1s?
You might be able to recover by cloning, but Intel's RAID is finicky about it.
 

Author Comment

by:Joemt
ID: 39608557
Lost physical drives 0 & 1, and the RAID array is no longer bootable.
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39608723
Yes, but that doesn't tell either of us where they were logically.  If it's stripe 0 of both mirrors, it's a huge problem.  If it's stripe 0 of one and stripe 1 of the other, we still have a good set.
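
To make that concrete, here is a small sketch of the rule: the data survives as long as every stripe segment still has at least one working copy. The two layouts are hypothetical; which physical drives actually hold copies of the same segment is exactly what could not be read from the Intel RAID BIOS in this thread.

```python
def array_survives(copies_of_segment, failed_drives):
    """True if every stripe segment still has at least one working copy.

    copies_of_segment: one tuple per stripe segment, listing the physical
    drive numbers that hold a copy of that segment.
    """
    failed = set(failed_drives)
    return all(
        any(drive not in failed for drive in copies)
        for copies in copies_of_segment
    )


# Hypothetical layout A: PD0 and PD1 hold the same segment.
layout_a = [(0, 1), (2, 3)]
# Hypothetical layout B: PD0 and PD1 hold different segments.
layout_b = [(0, 2), (1, 3)]

failed = {0, 1}
print(array_survives(layout_a, failed))  # False: both copies of a segment are gone
print(array_survives(layout_b, failed))  # True: one good copy of each segment remains
```

The same two failed drives are fatal in one layout and survivable in the other, which is why the physical drive numbers alone do not settle the question.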
 

Author Comment

by:Joemt
ID: 39609184
I agree, but it cannot boot, and the RAID BIOS appears to look at the drives in sets and states the array is not bootable, so I think both of these drives were one set. Never seen it before. I hope to never see it again.  :)

The customer has decided to upgrade to an LSI 9211-4i controller and (4) 2TB drives. I'm waiting to hear from LSI whether the controller can handle (4) 2TB drives. Does anyone know for sure? If not, I hope it will handle (4) 1TB drives as an alternative.

Thank you
 
LVL 42

Expert Comment

by:Davis McCarn
ID: 39609417
 
LVL 30

Expert Comment

by:pgm554
ID: 39609505
But can the mobo BIOS handle a UEFI boot if you use >2TB partitions?
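
A quick back-of-the-envelope sketch of why the 2TB mark matters. An MBR partition table stores sector counts in 32-bit fields, so with 512-byte logical sectors (an assumption here; the drives under discussion may differ) it tops out at 2 TiB. Volumes past that limit typically need GPT, which is where the UEFI boot question comes from. The function names are made up for illustration.

```python
SECTOR_BYTES = 512         # assumption: 512-byte logical sectors
MBR_MAX_SECTORS = 2 ** 32  # MBR keeps sector counts in 32-bit fields
TB = 10 ** 12              # drive vendors quote decimal terabytes


def mbr_limit_bytes(sector_bytes=SECTOR_BYTES):
    """Largest size an MBR partition entry can describe."""
    return MBR_MAX_SECTORS * sector_bytes


def raid10_usable_bytes(drive_count, drive_bytes):
    """RAID 10 keeps two copies of everything, so half the raw space is usable."""
    return (drive_count // 2) * drive_bytes


limit = mbr_limit_bytes()  # 2,199,023,255,552 bytes = 2 TiB
for size_tb in (1, 2):
    usable = raid10_usable_bytes(4, size_tb * TB)
    print(f"4 x {size_tb}TB RAID 10: {usable / TB:.0f}TB usable, "
          f"exceeds MBR limit: {usable > limit}")
```

Under these assumptions, four 1TB drives give a 2TB volume that still fits under the MBR limit, while four 2TB drives give a 4TB volume that does not fit in a single MBR partition.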
 

Author Comment

by:Joemt
ID: 39609710
LSI stated the 9211-4i can handle up to 4TB drives.