Solved

Using mdadm to recover from a multi-drive RAID5 failure

Posted on 2006-07-20
Last Modified: 2013-12-16
I have a CentOS 4.2 server with a software RAID-5 volume consisting of 4 disks, /dev/sd{a-d}7. The other day it reported a failure on /dev/sdc7. I was able to start rebuilding it, but during the rebuild it reported another failure.

With the older raidtools I knew how to edit /etc/raidtab and set the known bad disk to failed-disk and then force a rebuild of the array.  I'd like to do that with this machine to see if I can recover the array, but I've never done it using mdadm.  /etc/mdadm.conf doesn't have much useful information in it, only:

ARRAY /dev/md0 super-minor=0
ARRAY /dev/md1 super-minor=1
...

How can I tell mdadm that /dev/sdc7 is the truly failed disk and have it try to rebuild using the other 3 disks?

-Bruce
Question by:brucepennypacker
LVL 43

Expert Comment

by:ravenpl
ID: 17152425
If two disks have failed in a RAID5 volume, you can't recover. That's the design of RAID5.

If only sdc7 has failed, then:
mdadm /dev/mdX -f /dev/sdc7 # hot-fail
mdadm /dev/mdX -r /dev/sdc7 # hot-remove sdc7 from mdX
mdadm /dev/mdX -a /dev/sdc7 # hot-add and start rebuilding
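
Before hot-failing a member, it may help to confirm which arrays the kernel currently reports as degraded. The following is a minimal sketch, not part of the original answer: `degraded_arrays` is a hypothetical helper that reads /proc/mdstat-style text and prints the name of every array whose member-status field (for example `[UU_U]`) contains an underscore, which marks a missing member.

```shell
# Hypothetical helper (an assumption for illustration, not an mdadm feature):
# read /proc/mdstat-style text on stdin and print each md array whose
# member-status field, e.g. [UU_U], shows a missing member ("_").
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status field consists only of U and _ inside square brackets
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    '
}

# Typical use on a live system:
#   degraded_arrays < /proc/mdstat
#   mdadm --detail /dev/mdX     # per-member state for one array (run as root)
```

An empty result means no array is reported as degraded; `mdadm --detail` then shows which member of a listed array is marked faulty or removed.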
 
LVL 40

Assisted Solution

by:noci
noci earned 20 total points
ID: 17153252
Be aware that if /dev/sdc7 failed, other partitions on /dev/sdc might also fail. It depends somewhat on the error; if it's just a bad block, you might get away with it for now.

Also have a look at smartmontools, which can help diagnose the health of disks before they fail (whole disks, that is, not partitions: /dev/sd?, /dev/hd?, etc.).

Better to prepare for other partitions on /dev/sdc failing.
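
Since smartmontools works on whole disks rather than partitions, a small sketch of mapping a failed RAID member back to its disk may be useful. `disk_of_partition` is a hypothetical helper, and it assumes the sdX/hdX naming used in this thread (it would not handle names such as nvme0n1p1).

```shell
# Hypothetical helper: strip the trailing partition number from a device
# name, e.g. /dev/sdc7 -> /dev/sdc. Assumes sdX/hdX-style names only.
disk_of_partition() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

# Typical use (run as root, smartmontools installed):
#   smartctl -H "$(disk_of_partition /dev/sdc7)"   # overall health verdict
#   smartctl -a "$(disk_of_partition /dev/sdc7)"   # full attributes and error log
```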
 

Author Comment

by:brucepennypacker
ID: 17153830
ravenpl - As I said in my original post I have successfully recovered from multiple-disk RAID5 failures using raidtools.  It's possible to have multiple disks fail simultaneously if a drive controller fails, if a cable that multiple drives are on is loose, if an external disk array loses power, etc.  Here's a web page that describes how to do this using raidtools:

http://software.cfht.hawaii.edu/linuxpc/RAID_recovery.html

What I would like to know is how to do it using mdadm since that's replaced raidtools.
 
LVL 43

Accepted Solution

by:
ravenpl earned 30 total points
ID: 17153905
If the array is out of sync, you can't. If it is in sync, just plug the disk back in; the kernel will find the disk and use it in the array.

Or try assembling array from scratch
mdadm -A /dev/mdX -YourOptions -l5 -n4 /dev/sda7 /dev/sdb7 missing /dev/sdd7
but it will probably fail if the disks are out of sync.
 

Author Comment

by:brucepennypacker
ID: 17187474
You were close.  I just had to do the following:

mdadm --assemble  --force /dev/md5 /dev/sda7 /dev/sdb7 /dev/sdc7 /dev/sdd7

This has recreated the array successfully, still with one failed disk.  I was able to mount it, and after it recovered its journal I was able to copy all the data off before replacing the drive & rebuilding the array.
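
A common precaution before a forced assembly is to compare the event counters recorded in each member's superblock, since `--force` tells mdadm to assemble even when some members' metadata appears out of date. The sketch below is an illustration under that assumption, not a record of what was run here: `event_count` and `show_events` are hypothetical helper names, and the device list simply mirrors the command above.

```shell
# Hypothetical helper: read `mdadm --examine` output on stdin and print the
# value of its "Events" line.
event_count() {
    awk -F' : ' '/^ *Events :/ { print $2 }'
}

# Hypothetical helper: print the event counter of each device given
# (needs root, because it reads the md superblocks).
show_events() {
    for dev in "$@"; do
        printf '%s: ' "$dev"
        mdadm --examine "$dev" | event_count
    done
}

# Typical use, with the devices from this thread:
#   show_events /dev/sda7 /dev/sdb7 /dev/sdc7 /dev/sdd7
#   mdadm --assemble --force /dev/md5 /dev/sda7 /dev/sdb7 /dev/sdc7 /dev/sdd7
#   cat /proc/mdstat            # confirm the array came up, possibly degraded
```

Members whose counters are close together are the likeliest to assemble into a consistent array; a member that is far behind is usually the one that failed first.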
