Solved

bad drive issues

Posted on 2014-11-10
Last Modified: 2016-11-23
I have a client running SBS 2011 on a Dell PowerEdge server (R720/R710) with 64GB of RAM (a waste), with a RAID 1 and a RAID 5 (3 drives). Recently we had a drive fail and everything has gone downhill.

Now Dell support is telling me the issue is related to bad blocks on drive 4, but that the bad blocks have spread to 4 of the five drives (by the way, OpenManage is not reporting errors other than on one drive), and that the only way to fix the issue is to copy the data, delete the RAID, replace all the drives and reinstall Windows from scratch.

Has anybody ever heard of bad blocks spreading between drives on a RAID?

Thanks
Rudy
Question by:rudym88
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 40435016
It's called a puncture. RAID replicates data/parity across the disks, so if one block is bad or unreadable, what does the controller do? It either replicates bad data or fails a drive or even the array. This is why RAID is not a backup strategy.
 

Author Comment

by:rudym88
ID: 40435071
Can I ask what is the best way to resolve the issue, other than redoing everything?
 
LVL 33

Accepted Solution

by:
PowerEdgeTech earned 500 total points
ID: 40435132
Wiping out the array and recreating (and initializing) it is the only way to create a healthy array to hold your data. The array - the logical arrangement of storage units managed by the controller - is irreparably broken - missing a piece(s). The controller cannot guess at the missing/corrupt area's contents, and there is no software capable of restoring specific pieces of the array like with file backups.
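To illustrate why the controller "cannot guess", here is a minimal Python sketch of XOR parity as used conceptually in RAID 5. It is a toy model, not PERC firmware behavior: the names `xor_blocks` and `rebuild_missing` are made up for this example, and an unreadable block is simply modeled as `None`.

```python
from functools import reduce
from typing import List, Optional

Block = Optional[bytes]  # None models a failed drive or an unreadable block


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild_missing(stripe: List[Block], missing: int) -> Block:
    """Rebuild the block at index `missing` from the rest of the stripe.

    Returns None when any surviving block is unreadable, because XOR
    parity can only recover one missing member per stripe.
    """
    survivors = [b for i, b in enumerate(stripe) if i != missing]
    if any(b is None for b in survivors):
        return None
    return xor_blocks(survivors)


# One stripe of a 3-drive RAID 5: two data blocks plus their parity.
d0, d1 = b"\x11\x22", b"\x0f\xf0"
parity = xor_blocks([d0, d1])

# Case 1: drive 0 fails, the other two blocks are readable.
rebuilt = rebuild_missing([None, d1, parity], missing=0)

# Case 2: drive 0 fails AND drive 1 has a bad block in the same stripe.
punctured = rebuild_missing([None, None, parity], missing=0)

print(rebuilt == d0)      # the data comes back
print(punctured is None)  # nothing to rebuild from: the stripe is punctured
```

In the second case the parity block alone carries no recoverable information about either data block, which is why the only clean fix is recreating the array and restoring the data.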

I would also recommend you update all system firmware (iDRAC/LCC, BIOS, PERC, etc.) and drivers, run regular Consistency Checks (at least monthly), and promptly replace faulty drives to help prevent it from happening in the future.
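As a rough picture of why regular Consistency Checks help, the sketch below walks every stripe of a toy XOR-parity array while all drives are still present. A single unreadable block can be recomputed from the others at that point; the same block would become a puncture if it were only discovered during a rebuild. This is a simplified model with hypothetical names (`consistency_check`, `None` for an unreadable block), not how the PERC controller is actually implemented.

```python
from functools import reduce
from typing import Dict, List, Optional

Block = Optional[bytes]  # None models an unreadable block


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def consistency_check(stripes: List[List[Block]]) -> Dict[str, int]:
    """Scan all stripes, repairing what redundancy still allows."""
    report = {"ok": 0, "repaired": 0, "inconsistent": 0, "unrecoverable": 0}
    for stripe in stripes:
        bad = [i for i, b in enumerate(stripe) if b is None]
        if not bad:
            # Data XOR parity should be all zero bytes when consistent.
            if any(xor_blocks(stripe)):
                report["inconsistent"] += 1
            else:
                report["ok"] += 1
        elif len(bad) == 1:
            # One unreadable block: recompute it from the readable ones.
            stripe[bad[0]] = xor_blocks([b for b in stripe if b is not None])
            report["repaired"] += 1
        else:
            # Two or more unreadable blocks: parity cannot recover them.
            report["unrecoverable"] += 1
    return report


a, b = b"\x01\x02", b"\x10\x20"
p = xor_blocks([a, b])
array = [
    [a, b, p],             # healthy stripe
    [a, None, p],          # latent bad block, still repairable
    [a, b, b"\x00\x00"],   # parity does not match the data
]
report = consistency_check(array)
print(report)
```

The point of the monthly schedule is to hit the "repaired" branch while the array is still fully redundant, instead of the "unrecoverable" one after a drive has already failed.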
LVL 55

Expert Comment

by:andyalder
ID: 40435586
The only other way to repair a punctured stripe is to overwrite the bad blocks, and that's virtually impossible since your OS may have already written a $err entry into the file allocation table. I'm not sure where you can get to the disk error stats with Dells; with HP the drive errors are listed in the ADU report. Blasted disk manufacturers don't consider disk read errors a reason to set the pre-failure alert, so it's possible to run with a flaky disk or two without knowing it. Then a disk fails and the rebuild doesn't complete when it is replaced.
 

Author Comment

by:rudym88
ID: 40446416
That's what I was afraid of. Let me ask, how can I prevent this from happening?
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 40446614
I would also recommend you update all system firmware (iDRAC/LCC, BIOS, PERC, etc.) and drivers, run regular Consistency Checks (at least monthly), and promptly replace faulty drives to help prevent it from happening in the future.