Solved

Linux-Raid - How do I SAFELY stop a linux-software raid rebuild?

Posted on 2010-11-30
Last Modified: 2012-05-10
I had a drive go out in my raid on a linux system.

I issued the following commands to start the rebuild:

mdadm /dev/md1 -r /dev/sdc1
[2010-11-30-09:43] Dgaustintx: mdadm /dev/md1 --add /dev/sdc1

It went along till around 2% and then I got this error:

linux57 kernel: Disabling IRQ #177

my /var/log/messages looks like this (last message received 1 minute ago):

Nov 30 12:56:33 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:56:33 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:56:33 linux57 kernel: end_request: I/O error, dev sdc, sector 46495255
Nov 30 12:57:03 linux57 kernel: ata1: command 0x35 timeout, stat 0xd1 host_stat 0x24
Nov 30 12:57:03 linux57 kernel: ata1: status=0xd1 { Busy }
Nov 30 12:57:03 linux57 kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Nov 30 12:57:03 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:57:03 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:57:03 linux57 kernel: end_request: I/O error, dev sdc, sector 46495263
Nov 30 12:57:03 linux57 kernel: ATA: abnormal status 0xD1 on port 0xACBF
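
One quick way to tell whether errors like these are still piling up is to count the `end_request: I/O error` lines for the disk and re-run the count a few minutes later. This is only a rough sketch: the function name `count_io_errors` is made up for illustration, and it assumes the kernel messages land in /var/log/messages in the same format as the excerpt above.

```shell
# Count kernel "end_request: I/O error" lines for one disk.
# count_io_errors is a hypothetical helper name, not a standard tool.
# Usage: count_io_errors sdc [logfile]   (logfile defaults to /var/log/messages)
count_io_errors() {
    dev="$1"
    file="${2:-/var/log/messages}"
    # The trailing comma keeps "sdc" from also matching a longer device name.
    grep -c "end_request: I/O error, dev $dev," "$file"
}
```

If the number keeps climbing between runs, the disk (or its cable/controller) is still throwing errors; if it stays flat, the rebuild has moved past the bad area.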

So all I want to do at this point is to stop the rebuild SAFELY so that my sdd drive stays intact and I can get my data off this system.

How do I safely stop the raid? The load average reported by uptime is between 17 and 18; it is normally 1-2.

I do have access to a root prompt and can perform commands.

HELP-URGENT!

Thanks very much

Question by:dgintz1217
 
LVL 47

Accepted Solution

by:
dlethe earned 500 total points
ID: 34241804
Best you can do is let it complete.  All it is doing is telling you that it fixed an error on the disk you are rebuilding!
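
If the plan is to let it run, the progress line in /proc/mdstat is the thing to watch (`mdadm --detail /dev/md1` reports similar status). Below is a minimal sketch that pulls just the percentage for one array. The function name `md_progress` is made up, and it assumes the usual /proc/mdstat layout where the array's name starts a line and a `recovery = N.N%` (or `resync`) line follows it.

```shell
# Print the recovery/resync percentage for one md array.
# md_progress is a hypothetical helper name, not part of mdadm.
# Usage: md_progress md1 [file]   (file defaults to /proc/mdstat)
md_progress() {
    dev="$1"
    file="${2:-/proc/mdstat}"
    awk -v dev="$dev" '
        $1 == dev          { in_dev = 1; next }   # header line of our array
        in_dev && NF == 0  { in_dev = 0 }         # blank line ends its block
        in_dev && /(recovery|resync)/ {
            # the percentage is the only field ending in "%"
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { print $i; exit }
        }
    ' "$file"
}
```

Run `md_progress md1` every so often; it prints nothing once the array is no longer rebuilding. If you ever really had to abort instead, marking the new member failed and removing it (`mdadm /dev/md1 --fail /dev/sdc1 --remove /dev/sdc1`) takes it back out of the array, but check that against your mdadm version's man page before relying on it.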
 

Author Closing Comment

by:dgintz1217
ID: 34249950
We survived this!
 
LVL 47

Expert Comment

by:dlethe
ID: 34252026
If you had to sweat, then why not buy another disk and migrate to RAID6?

Not only does that protect you if you have a single drive failure, but in the event you have a drive failure AND a bad block on one of the other disks, your RAID5 configuration will lose data, while a RAID6 config would move on without a hiccup.
Yes, there is a small performance hit, but in the grand scheme of things that is a reasonable price for getting sleep at night.

Alternately, if you need performance, get faster HDDs.

 

Author Comment

by:dgintz1217
ID: 34253335
So how does RAID6 differ from RAID5?
 
LVL 47

Expert Comment

by:dlethe
ID: 34253749
Basically, it is RAID5 + 1.

Meaning you have to add 1 more disk drive, and then instead of 1 parity chunk per stripe, two independent parity chunks (usually called P and Q) are written, each to a different disk.  No performance hit on reads, but a minor hit on writes (relative to what you have now).  The advantage is that you are protected against a total drive failure while rebuilding.  These days, when it can take DAYS instead of hours to rebuild a RAID, you really need this.

If you are running RAID5 plus a hot spare, you can use the same hardware to run RAID6 with no hot spare instead.
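
To put numbers on the "RAID5 + 1" idea, here is a small sketch of the capacity arithmetic: RAID5 gives up one disk's worth of space to parity and RAID6 gives up two. The function name `raid_usable` and the disk counts and sizes in the usage lines are made up for illustration, not taken from the system in question.

```shell
# Usable capacity of a RAID5 or RAID6 array, in the same unit as SIZE.
# raid_usable is a hypothetical helper name.
# Usage: raid_usable LEVEL DISKS SIZE
raid_usable() {
    level="$1"; disks="$2"; size="$3"
    case "$level" in
        5) parity=1 ;;   # survives one failed disk
        6) parity=2 ;;   # survives two failed disks
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    echo $(( (disks - parity) * size ))
}

raid_usable 5 4 1000   # 4 disks in RAID5
raid_usable 6 5 1000   # same usable space after adding a 5th disk as RAID6
```

Both calls print 3000: the extra disk buys no extra space, only the ability to lose a second disk.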
 

Author Comment

by:dgintz1217
ID: 34253788
I'm RAID5 with no hot spare, BUT I have room to install a hot spare - if I did that would I be effectively RAID6?