bad drive issues

I have a client running SBS 2011 on a Power R720. They are running a Dell PowerEdge R710 with 64GB (a waste), with a RAID 1 and a RAID 5 (3 drives). Recently we had a drive fail and everything has gone downhill.

Now Dell support is telling me the issue is related to bad blocks on drive 4, but that the bad blocks have spread to 4 of the five drives (by the way, OpenManage is not reporting errors other than on one drive), and that the only way to fix the issue is to copy the data, delete the RAID, replace all the drives, and reinstall Windows from scratch.

Has anybody ever heard of bad blocks spreading between drives on a RAID?

Thanks
Rudy
rudym88Asked:

PowerEdgeTech (IT Consultant) Commented:
It's called a puncture. RAID replicates data/parity across the disks, so if one block is bad or unreadable, what does the controller do? It either replicates bad data or fails a drive or even the array. This is why RAID is not a backup strategy.
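To make the puncture mechanism concrete, here is a minimal sketch of RAID 5 style XOR parity on a single three-disk stripe. It is a toy model, not how the PERC firmware is implemented; the names `xor_blocks` and `rebuild_block`, and the use of `None` to stand for an unreadable block, are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(stripe, failed_index):
    """Rebuild the failed disk's block from the surviving blocks.

    `stripe` holds one entry per disk; None marks an unreadable block.
    Returns None when a surviving block is also unreadable, which is
    the punctured-stripe case: there is nothing left to XOR against.
    """
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    if any(b is None for b in survivors):
        return None
    return xor_blocks(survivors)


# Hypothetical stripe contents: two data blocks and their parity block.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
parity = xor_blocks([d0, d1])

# Disk 0 fails; its block is recovered from disk 1 plus parity.
recovered = rebuild_block([None, d1, parity], failed_index=0)

# Disk 0 fails AND disk 1 has a latent bad block in the same stripe.
punctured = rebuild_block([None, None, parity], failed_index=0)
```

In the first case `recovered` equals the original `d0`. In the second, `punctured` is `None`: the data for that stripe is gone, and the rebuilt drive inherits the hole even though the new disk itself is healthy.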
rudym88 (Author) Commented:
Can I ask what the best way to resolve the issue is, other than redoing everything?
PowerEdgeTech (IT Consultant) Commented:
Wiping out the array and recreating (and initializing) it is the only way to create a healthy array to hold your data. The array - the logical arrangement of storage units managed by the controller - is irreparably broken - missing a piece(s). The controller cannot guess at the missing/corrupt area's contents, and there is no software capable of restoring specific pieces of the array like with file backups.

I would also recommend you update all system firmware (iDRAC/LCC, BIOS, PERC, etc.) and drivers, run regular Consistency Checks (at least monthly), and promptly replace faulty drives to help prevent it from happening in the future.
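As a rough illustration of what a consistency check looks for, the sketch below scans every stripe and flags any that contain an unreadable block or whose parity no longer matches the data. This is only a toy model under stated assumptions: the real check runs in the controller firmware, and `check_stripe`, `consistency_check`, and the sample `stripes` data are hypothetical names and values invented for the example.

```python
def check_stripe(stripe):
    """Classify one stripe as 'ok', 'unreadable', or 'mismatch'.

    `stripe` is a list of equal-length byte blocks (data plus parity);
    None stands for a block the disk could not read.
    """
    if any(block is None for block in stripe):
        return "unreadable"
    # With XOR parity, XOR-ing every block in a consistent stripe gives zeros.
    acc = bytearray(len(stripe[0]))
    for block in stripe:
        for i, byte in enumerate(block):
            acc[i] ^= byte
    return "mismatch" if any(acc) else "ok"


def consistency_check(stripes):
    """Return {stripe_number: problem} for every stripe that is not 'ok'."""
    problems = {}
    for number, stripe in enumerate(stripes):
        status = check_stripe(stripe)
        if status != "ok":
            problems[number] = status
    return problems


# Hypothetical three-disk array with three stripes.
stripes = [
    [b"\x01\x02", b"\x10\x20", b"\x11\x22"],  # consistent
    [b"\x0a\x0b", None, b"\xff\xff"],         # latent bad block on disk 1
    [b"\x01\x01", b"\x02\x02", b"\x00\x00"],  # parity does not match data
]
report = consistency_check(stripes)
```

The point of running this kind of scan regularly is timing: an unreadable block found while the other disks are still healthy can generally be rewritten from the remaining data and parity, whereas the same block found during a rebuild leaves nothing to recover from.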


andyalder (Saggar maker's framemaker) Commented:
The only other way to repair a punctured stripe is to overwrite the bad blocks, and that's virtually impossible since your OS may have already written a $err entry into the file allocation table. I'm not sure where you can get the disk error stats with Dells; with HP the drive errors are listed in the ADU report. Blasted disk manufacturers don't consider disk read errors a reason to set the pre-failure alert, so it's possible to run with a flaky disk or two without knowing it; then a disk fails and the rebuild doesn't complete when it is replaced.
rudym88 (Author) Commented:
That's what I was afraid of. Let me ask, how can I prevent this from happening?
PowerEdgeTech (IT Consultant) Commented:
I would also recommend you update all system firmware (iDRAC/LCC, BIOS, PERC, etc.) and drivers, run regular Consistency Checks (at least monthly), and promptly replace faulty drives to help prevent it from happening in the future.