Linux-Raid - How do I SAFELY stop a linux-software raid rebuild?

Posted on 2010-11-30
Last Modified: 2012-05-10
I had a drive go out in my RAID on a Linux system.

I issued the following commands to start the rebuild:

mdadm /dev/md1 -r /dev/sdc1
[2010-11-30-09:43] Dgaustintx: mdadm /dev/md1 --add /dev/sdc1

It went along till around 2% and then I got this error:

linux57 kernel: Disabling IRQ #177

My /var/log/messages looks like this (last message received 1 minute ago):

Nov 30 12:56:33 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:56:33 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:56:33 linux57 kernel: end_request: I/O error, dev sdc, sector 46495255
Nov 30 12:57:03 linux57 kernel: ata1: command 0x35 timeout, stat 0xd1 host_stat 0x24
Nov 30 12:57:03 linux57 kernel: ata1: status=0xd1 { Busy }
Nov 30 12:57:03 linux57 kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Nov 30 12:57:03 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:57:03 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:57:03 linux57 kernel: end_request: I/O error, dev sdc, sector 46495263
Nov 30 12:57:03 linux57 kernel: ATA: abnormal status 0xD1 on port 0xACBF

So all I want to do at this point is to stop the rebuild SAFELY so that my sdd drive stays intact and I get my data off this system.

How do I safely stop the RAID rebuild? The load reported by uptime is between 17 and 18, and it is normally 1-2.

I do have access to a root prompt and can perform commands.

HELP-URGENT!

Thanks very much

Question by:dgintz1217
LVL 47

Accepted Solution

by:
dlethe earned 500 total points
ID: 34241804
The best you can do is let it complete. All it is doing is telling you that it fixed an error on the disk you are rebuilding!
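If you do let the rebuild run, one way to keep an eye on it is to read /proc/mdstat, where the md driver reports recovery progress. The following is only a minimal sketch: the function name md_progress is made up for illustration, and the optional file argument is an assumption added so the function can be tried against a saved copy of /proc/mdstat.

```shell
# Print the recovery/resync progress lines from an mdstat-format file.
# With no argument it reads /proc/mdstat; the optional path argument is
# only there so the function can be run on a saved copy.
md_progress() {
    mdstat_file="${1:-/proc/mdstat}"
    # Progress lines look like "... recovery =  2.0% (...) finish=... speed=..."
    grep -E '(recovery|resync) *=' "$mdstat_file" || echo "no rebuild in progress"
}
```

Running md_progress every minute or so (for example under watch) should show the percentage climbing if the rebuild is actually making headway.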
 

Author Closing Comment

by:dgintz1217
ID: 34249950
We survived this!
 
LVL 47

Expert Comment

by:dlethe
ID: 34252026
If you had to sweat, then why not buy another disk and migrate to RAID6?

Not only does that protect you if you have a single drive failure, it also covers the case where you have a drive failure AND a bad block on one of the other disks: your RAID5 configuration would lose data, but a RAID6 config would move on w/o a hiccup.
Yes, there is a small performance hit, but in the grand scheme of things it is a reasonable price for getting sleep at night.

Alternately, if you need performance, get faster HDDs.
 

Author Comment

by:dgintz1217
ID: 34253335
So how does RAID6 differ from RAID5?
 
LVL 47

Expert Comment

by:dlethe
ID: 34253749
Basically, it is RAID5 + 1.

That means you add one more disk drive, and instead of one parity chunk per stripe, two parity chunks are written per stripe, each on a different disk. (They are two independent parity calculations, usually called P and Q, not two copies of the same parity.) There is no performance hit on reads, but a minor hit on writes (relative to what you have now). The advantage is that you are protected against a total drive failure while rebuilding. These days, when it can take DAYS to rebuild a RAID instead of hours, you really need that.

If you are running RAID5 plus a hot spare, then you can use the same hardware to run RAID6 with no hot spare instead.
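To put numbers on "RAID5 + 1", here is a rough sketch of the arithmetic: usable capacity is (disks minus parity disks) times the disk size, with one disk's worth of parity for RAID5 and two for RAID6. The function name raid_summary and the 500 GB disk size below are made up for illustration; they do not come from the system in the question.

```shell
# raid_summary LEVEL DISKS DISK_SIZE_GB
# Prints usable capacity and how many failed disks the array tolerates.
raid_summary() {
    level="$1"; disks="$2"; size_gb="$3"
    case "$level" in
        5) parity=1 ;;   # one disk's worth of parity
        6) parity=2 ;;   # two disks' worth of parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    # RAID5 needs at least 3 disks, RAID6 at least 4.
    if [ "$disks" -lt $((parity + 2)) ]; then
        echo "RAID$level needs at least $((parity + 2)) disks" >&2
        return 1
    fi
    echo "RAID$level: $disks disks, $(( (disks - parity) * size_gb )) GB usable, tolerates $parity failed disk(s)"
}

raid_summary 5 4 500   # RAID5: 4 disks, 1500 GB usable, tolerates 1 failed disk(s)
raid_summary 6 5 500   # RAID6: 5 disks, 1500 GB usable, tolerates 2 failed disk(s)
```

With these hypothetical sizes, the extra disk buys no extra space, only the ability to lose a second disk.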
 

Author Comment

by:dgintz1217
ID: 34253788
I'm RAID5 with no hot spare, BUT I have room to install a hot spare - if I did that would I be effectively RAID6?