Solved

Linux-Raid - How do I SAFELY stop a linux-software raid rebuild?

Posted on 2010-11-30
Last Modified: 2012-05-10
I had a drive go out in my raid on a linux system.

I issued the following commands to start the rebuild:

mdadm /dev/md1 -r /dev/sdc1
[2010-11-30-09:43] Dgaustintx: mdadm /dev/md1 --add /dev/sdc1

It went along till around 2% and then I got this error:

linux57 kernel: Disabling IRQ #177

my /var/log/messages looks like this (last message received 1 minute ago):

Nov 30 12:56:33 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:56:33 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:56:33 linux57 kernel: end_request: I/O error, dev sdc, sector 46495255
Nov 30 12:57:03 linux57 kernel: ata1: command 0x35 timeout, stat 0xd1 host_stat 0x24
Nov 30 12:57:03 linux57 kernel: ata1: status=0xd1 { Busy }
Nov 30 12:57:03 linux57 kernel: SCSI error : <2 0 0 0> return code = 0x8000002
Nov 30 12:57:03 linux57 kernel: sdc: Current: sense key: Aborted Command
Nov 30 12:57:03 linux57 kernel:     Additional sense: Scsi parity error
Nov 30 12:57:03 linux57 kernel: end_request: I/O error, dev sdc, sector 46495263
Nov 30 12:57:03 linux57 kernel: ATA: abnormal status 0xD1 on port 0xACBF

So all I want to do at this point is to stop the rebuild SAFELY so that my sdd drive stays intact and I get my data off this system.

How do I safely stop the RAID rebuild? My load average (as reported by uptime) is between 17 and 18 and is normally 1-2.

I do have access to a root prompt and can perform commands.

HELP-URGENT!

Thanks very much

Question by:dgintz1217
 
LVL 47

Accepted Solution

by:
dlethe earned 500 total points
ID: 34241804
The best you can do is let it complete. All it is doing is telling you that it fixed an error on the disk you are rebuilding!
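For anyone following the "let it complete" advice, here is a rough sketch of how the rebuild could be watched from a root prompt without touching the array. It assumes the kernel reports progress in /proc/mdstat in the usual "recovery = N%" form; the helper name md_progress is made up for illustration and is not part of mdadm.

```shell
# Hypothetical helper (not part of mdadm): read mdstat-formatted text on
# stdin and print the first recovery/resync percentage it finds.
# Prints nothing when no rebuild is in progress.
md_progress() {
    awk '/recovery *=|resync *=/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { sub(/%$/, "", $i); print $i; exit }
    }'
}

# Usage on the affected box (read-only, does not alter the array):
#   md_progress < /proc/mdstat
```

Running mdadm --detail /dev/md1 gives a fuller view of the same state (array status, which member is rebuilding), again without changing anything.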
 

Author Closing Comment

by:dgintz1217
ID: 34249950
We survived this!
 
LVL 47

Expert Comment

by:dlethe
ID: 34252026
If you had to sweat .. then why not buy another disk, and migrate to a RAID6?

Not only does that protect you against a single drive failure, but in the event you have a drive failure AND a bad block on one of the other disks, your RAID5 configuration will lose data, whereas a RAID6 config would move on w/o a hiccup.
Yes, there is a small performance hit, but in the grand scheme of things it is a reasonable price for getting sleep at night.

Alternately, if you need performance, get faster HDDs.

Author Comment

by:dgintz1217
ID: 34253335
So how does RAID6 differ from RAID5?
 
LVL 47

Expert Comment

by:dlethe
ID: 34253749
Basically, it is RAID5 + 1.

Meaning you have to add 1 more disk drive, and then instead of 1 parity chunk per stripe, two independent parity chunks (usually called P and Q) are written to two different disks.  No performance hit on reads, but a minor hit on writes (relative to what you have now).  The advantage is that you are protected against another total drive failure while rebuilding.  These days, when it can take DAYS rather than hours to rebuild a RAID, you really need this.
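To make the parity idea a bit more concrete, here is a small sketch of the single XOR parity that RAID5 relies on. RAID6 keeps this parity and adds a second, differently computed parity chunk, which is not shown here. The byte values and the xor_all helper are invented purely for illustration; real arrays do this per chunk across whole stripes.

```shell
# Hypothetical helper: XOR all arguments together (one byte per "disk").
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Made-up data bytes sitting on three data disks.
d0=23 d1=200 d2=77

# The parity chunk is simply the XOR of the data chunks.
parity=$(xor_all "$d0" "$d1" "$d2")

# Pretend the disk holding d1 died: XOR the survivors with the parity
# to get the missing byte back.
rebuilt=$(xor_all "$d0" "$d2" "$parity")
echo "parity=$parity rebuilt_d1=$rebuilt"
```

With only this one parity, losing a second chunk in the same stripe (another dead disk, or an unreadable sector during the rebuild) leaves nothing to reconstruct from, which is the gap the second RAID6 parity closes.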

If you are running RAID5 + a hot spare, then you can really use the same hardware to run a RAID6 with no hot spare instead.
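As a rough back-of-the-envelope check of the "same hardware" point, the sketch below counts how many disks' worth of data each layout holds. The helper name usable_disks and the five-disk example are assumptions for illustration only; the thread does not say how many disks the array actually has.

```shell
# Hypothetical helper: number of disks' worth of usable data capacity.
# usage: usable_disks LEVEL TOTAL_DISKS [HOT_SPARES]
usable_disks() {
    level=$1 total=$2 spares=${3:-0}
    case $level in
        5) parity=1 ;;   # one parity chunk per stripe
        6) parity=2 ;;   # two parity chunks per stripe
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    echo $(( total - spares - parity ))
}

usable_disks 5 5 1   # RAID5 on 4 disks plus 1 idle hot spare
usable_disks 6 5     # RAID6 across all 5 disks
```

Both layouts leave the same usable space on the same number of drives. The difference is that a hot spare only starts helping after a rebuild onto it has finished, whereas the second RAID6 parity is in place the whole time.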
 

Author Comment

by:dgintz1217
ID: 34253788
I'm RAID5 with no hot spare, BUT I have room to install a hot spare - if I did that would I be effectively RAID6?