RAID 60 failure

Posted on 2012-09-11
Last Modified: 2016-06-27
I have a 30-drive array that is split into RAID 6+0 arrays. The RAID controller software reported that a drive had failed, so I put in a new drive, and during the rebuild two more drives in the same array "failed". The controller reports them as failed, but I would guess that they are just returning ECC errors and the controller has dropped them from the array.

Does anyone have any experience with forcing a failed drive online? Or would it be possible to take a drive out and dd/sector-clone it to another drive so that the array will rebuild correctly? I only ask because having to restore 40+ TB of data from tape is not something I am looking forward to.

I am using an LSI MegaRAID adapter on an Ubuntu server with the MegaRAID Storage Manager software.
Question by: AggieTex
    LVL 46

    Accepted Solution

    If you don't want 100% data loss, then you really need to hire a pro. Whoever does it is going to have to do some parity/XOR testing and take structures apart to look for timestamps on the data. The disks themselves also have to be assessed for health. The MegaRAID event log would be of use, so do NOT blow it away by clearing anything out.
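
    To give a rough idea of what "parity/XOR testing" means, here is a minimal sketch of the XOR (P) parity check for a single stripe. It assumes you already have the chunks of one stripe as byte strings; the function names are made up for illustration. It ignores the second (Q) parity that RAID 6 also keeps and knows nothing about how MegaRAID actually lays out or rotates chunks on disk, so treat it as an illustration of the principle, not a recovery tool.

```python
def xor_blocks(blocks):
    """XOR a sequence of equal-length byte blocks together."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    acc = int.from_bytes(blocks[0], "big")
    for b in blocks[1:]:
        acc ^= int.from_bytes(b, "big")
    return acc.to_bytes(length, "big")


def p_parity_ok(data_chunks, p_chunk):
    """True if the stored P chunk equals the XOR of the data chunks."""
    return xor_blocks(data_chunks) == p_chunk


def rebuild_missing(surviving_chunks, p_chunk):
    """Recompute one missing data chunk from the survivors plus P."""
    return xor_blocks(list(surviving_chunks) + [p_chunk])
```

    A stripe where `p_parity_ok` returns False is one where data and parity disagree, and deciding which side is stale is where the timestamps and the event log come in.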

    If you don't want 100% data loss (which is what you get with RAID 60 when one of its RAID 6 sub-arrays is lost), then don't even think of doing this yourself.

    If you force the disk(s) online, it guarantees data corruption across all 40 TB, due to the errors you already have. The amount of damage can be assessed, but that takes a lot of experience that certainly can't be transferred in this forum.

    No, you cannot just dd it. The metadata won't be right. You also want to preserve the log pages internal to each disk before you dd, figure out which blocks are unreadable, and run diagnostics.
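
    For the "figure out which blocks are unreadable" part, a read-only scan could look roughly like the sketch below. It assumes a Linux host with the disk attached to a plain HBA/JBOD controller rather than presented through the RAID controller; the function names and the 4096-byte block size are illustrative choices of mine, not anything MegaRAID-specific. Every read puts more load on a disk that may be dying, so this is something to plan, not to run casually.

```python
import os


def find_unreadable_blocks(read_at, total_size, block_size=4096):
    """Return byte offsets of blocks that raise an I/O error or come back short.

    read_at(offset, length) must return bytes and raise OSError on failure.
    """
    bad = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            data = read_at(offset, length)
        except OSError:
            bad.append(offset)
            continue
        if len(data) != length:
            bad.append(offset)
    return bad


def scan_device(path, block_size=4096):
    """Scan a device or image file read-only and list unreadable block offsets."""
    fd = os.open(path, os.O_RDONLY)
    try:
        total_size = os.lseek(fd, 0, os.SEEK_END)
        return find_unreadable_blocks(
            lambda off, n: os.pread(fd, n, off), total_size, block_size
        )
    finally:
        os.close(fd)
```

    The reader is passed in as a function so the scan logic can be exercised against a fake before it ever touches a real disk.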

    It may not be that bad to PROPERLY reconstruct just that one stripe, and hope that the rest of the 40TB is OK.  

    The rebuild failed due to multiple error scenarios: a combination of unrecoverable read errors on the surviving disks and/or possible parity mismatches. There is no way I could walk somebody through this, as you wouldn't even have the software you need. You would also certainly need a JBOD controller and scratch drives.
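
    As a back-of-the-envelope illustration of why unrecoverable read errors show up during rebuilds, the sketch below estimates the chance of hitting at least one while reading a given amount of data. The default rate of one error per 1e14 bits is an assumption (a figure often quoted on drive datasheets, not something known about these particular disks), and the model treats errors as independent, which failing disks usually are not.

```python
import math


def prob_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error in bytes_read bytes.

    Computes 1 - (1 - ure_per_bit) ** bits, using log1p/expm1 so the
    result stays accurate for very small per-bit error rates.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))
```

    Plug in the capacity of each surviving disk that the rebuild has to read end to end; the result grows quickly with disk size and with the number of disks involved.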

    Hire a pro to help you, if you don't want to restore all 40 TB.
    LVL 46

    Expert Comment

    The question was basically, "Does anyone have any experience with forcing a drive that failed online, or would it be possible to take a drive out and dd/sector clone it to another drive so it will rebuild correctly."

    This was answered in #38389031, along with many reasons not to do it. Points should be awarded, as this is also valuable information for others. Forcing a drive online in a failed array is not a viable choice.
