Intel RAID controller issue? Mirror drive fails.

I have 2 of these in a RAID mirror. They are in a Dell Precision T5500 workstation. I am using the onboard Intel RAID controller - not an add-on card. The system runs a retail gas station about 1 hour from my physical location. One of the drives apparently failed the other day and was marked bad. I do not know of a way to diagnose the drive live in the system, as the Data Lifeguard software can't see the individual drives when they are in a mirror, and so can't diagnose them. I marked the drive as normal, the other drive replicated to it, and everything seems to be fine again. Do you have any suggestion as to why this may have happened? My experience is that this happens often with consumer-class drives that aren't meant to run 24x7. I have several SCSI and SAS setups in Dell PowerEdge servers (PERC 5/i and 6/i RAID controllers), and this never happens there. I purchased WDC RE drives in an attempt to have the best solution for SATA drives. I googled the issue - and many people on the Internet report the same thing. Is the issue that the RAID chipset isn't strong enough? I don't feel that there is likely anything wrong with the hard drive.
DoingThisForeverDirectorAsked:
 
nobusCommented:
You can buy enterprise-grade drives - but they cost more: http://www.pcpro.co.uk/news/385792/consumer-hard-drives-as-reliable-as-enterprise-hardware

They also come with a longer warranty!

Do not use onboard RAID - aka fake RAID; better to use Windows software RAID.
 
noxchoGlobal Support CoordinatorCommented:
This happens either because of a RAID controller error or because of a real problem with a drive, such as a bad sector. To test the drive you would need to connect it as a second drive to another system, in non-RAID mode, and scan it with Lifeguard.
Drives usually have a reserved set of spare sectors designed to replace bad ones. If the problem is in a single sector, the RAID re-mirroring would heal it by redirecting read-write access to one of these reserved sectors. This can explain why it is running OK in your case now.
But where one bad sector occurs, another can follow. Thus I recommend testing both drives; you can perform the test on another computer which does not have the RAID. Simply move the drive to it, test it, and connect it back.
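As a rough illustration of the spare-sector point above, here is a minimal Python sketch that scans SMART attribute text for sector-related counters. Assumptions not from this thread: the text is in the table layout printed by `smartctl -A` (smartmontools), the sample values are invented, and the function name `sector_warnings` is hypothetical. Whether any SMART tool can see the individual drives behind the onboard Intel RAID is not established here, so this may only be usable once the drive is attached standalone.

```python
# Illustrative only: the sample text imitates `smartctl -A` output and its
# values are made up, not readings from the drives discussed in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8760
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

# Attributes that relate to bad or remapped sectors.
SECTOR_ATTRIBUTES = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def sector_warnings(report: str) -> dict[str, int]:
    """Return the sector-related attributes whose raw value is above zero."""
    warnings = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in SECTOR_ATTRIBUTES and raw.isdigit() and int(raw) > 0:
            warnings[name] = int(raw)
    return warnings


if __name__ == "__main__":
    for name, count in sector_warnings(SAMPLE).items():
        print(f"{name}: {count}")
```

A non-zero count is not proof that the drive is failing, but a count that keeps growing between checks is usually taken as a reason to replace it.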
And don't forget about backup. There must be a backup solution taking daily backups of the entire system to either an external drive or a network drive. The RAID mirror is no replacement for backup, especially one based on a fake RAID controller.
 
DoingThisForeverDirectorAuthor Commented:
Thanks nobus,
Yes - sorry I should have included what drives I was using.  I asked WDC and referred to the drive then posted it here too to see what others would have to say.  Yes, I am using Western Digital RE drives - which seem to be the right fit.  So if I take something from this - it would be to NOT trust onboard raid controllers - even if they are by Intel and on a really high-end Precision workstation.  Thanks again.
 
DoingThisForeverDirectorAuthor Commented:
Thanks noxcho,

Yes, I'm familiar with all of this.  The unit is running a POS system at a gas station about 1 hour away.  So, I don't have the liberties I would have if it were my own system.  Yes, we do take backups seriously.  The reason I posted here - is that I have had the same issue in the past (outside really high-end SCSI or SAS environments) with non-enterprise drives.  I felt that the issue here is the same thing - although I haven't proven yet whether the one drive is bad or not (due to the hassle).  I think the advice I will take from the other post is that I used the onboard raid controller (aka fake raid) - and should only use raid with add-on cards.  
Thanks for the help!
 
noxchoGlobal Support CoordinatorCommented:
Such issues happen not only on fake RAID controllers but on real controllers as well. The only way you can check the problem is by testing the drive standalone. Add-on cards are more stable in comparison with fake RAIDs. But if you are taking backups regularly, then remove the RAID and use the drives as standalone drive 1 and drive 2.
I would also suggest a software mirror of two dynamic disks, but dynamic disks have their downsides as well.
 
DoingThisForeverDirectorAuthor Commented:
I wonder if RAID at this level is really worth the trouble. I am using Acronis Workstation 11.5 and one of the routines is a full system image - so if a single drive does fail, we can get it up and going pretty quickly. I think I should just stop using RAID - unless we are using SAS drives and a SAS controller.
 
nobusCommented:
RAID was OK for getting higher speed in the past.
Now, with SSDs, speed is less of a problem.
 
Gerald ConnollyCommented:
What!

So you are happy for your customer to be down until you can swap the drive and restore their system? How long will that take: an hour to drive there, and a couple of hours at least to replace the disk and restore it from the backups. And how many transactions will have been lost since the last backup?
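
The arithmetic in that question can be sketched as follows. The 1 hour of travel and roughly 2 hours of restore time come from this thread; the hours since the last backup and the transaction rate are made-up placeholder figures, and `outage_estimate` is a hypothetical helper, not part of any backup product.

```python
def outage_estimate(travel_hours, restore_hours,
                    hours_since_backup, transactions_per_hour):
    """Return (downtime in hours, transactions recorded since the last backup)."""
    downtime_hours = travel_hours + restore_hours
    # Anything recorded after the last backup is not in the restored image.
    lost_transactions = hours_since_backup * transactions_per_hour
    return downtime_hours, lost_transactions


if __name__ == "__main__":
    # 1 h drive and 2 h restore are from the thread; 12 h and 40/h are placeholders.
    downtime, lost = outage_estimate(1, 2, 12, 40)
    print(f"Downtime: {downtime} h, transactions lost: {lost}")
```

With a working mirror, both figures stay at zero for a single-disk failure, which is the business-continuity argument being made here.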

You should be using RAID-1 (aka Mirroring) for business continuity, BUT it is not a substitute for a properly thought out DR/DT plan including Backups!

NB: Remember, as soon as the first disk fails your data is at risk from a second disk failure until the failed disk has been replaced and fully synchronised.

And just take note the only certain things in Life are Death and Taxes
And that a hard disk will eventually fail :-)
 
Gerald ConnollyCommented:
Author seemed happy with the resolution of the main issue, just hasn't closed the case!
Question has a verified solution.
