Solved

RAID Rebuild Issue, bad block table is full unable to log block

Posted on 2013-11-04
Last Modified: 2013-11-06
I've found plenty of posts from people having this problem, but none saying what to do about it. This is not a critical server (it's used mostly for labs), but it has enough on it that I would *definitely* like to avoid rebuilding it. There won't be any actual data loss if I have to, though, just lost time.

I believe I have logical bad blocks caused by a hard drive failing in one RAID 1 pair of my RAID 10. Other reports of similar situations call this a "punctured" array (though that seems to be a vendor-specific term). The array won't rebuild.

Slot 1 started reporting predictive failures. We purchased a brand-new drive of the same model (ST3500320NS), shut down the server, replaced the drive, and booted it back up (this is a non-production server). The rebuild reached 90% and then started throwing unrecoverable media errors. When this happens, the media error counts for slot 0 (the remaining drive of the RAID 1) and slot 1 (the new drive) increment at about the same rate. We returned the drive for a new one and hit the same issue at 90%. All cables have been reseated, just in case. A chkdsk with /F or /R hangs at the exact same file count in stage 4 of 5 every time (I waited 1.5 hours with no movement).

MRM Errors:
- Controller ID: 0 Unrecoverable medium error during rebuild: PD 0 location 0x34c67621
- Controller ID: 0 Bad block table is full; unable to log block: PD = 0:1, Block = 0x34c67621
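
As a rough sanity check, the block address in the log can be converted into a position on the disk. This is only a sketch under assumptions that are not in the log itself: that the address is a logical block address on the physical disk, that sectors are 512 bytes, and that the drive exposes 976,773,168 sectors (a typical count for a 500 GB drive).

```python
# Assumptions (not taken from the log): 512-byte logical sectors and a
# total of 976,773,168 sectors, a typical count for a 500 GB drive.
SECTOR_BYTES = 512
TOTAL_SECTORS = 976_773_168

# Block address reported in both MRM error lines.
failing_lba = 0x34C67621

offset_bytes = failing_lba * SECTOR_BYTES
percent_through = 100 * failing_lba / TOTAL_SECTORS

print(f"LBA {failing_lba} is about {offset_bytes / 1e9:.1f} GB into the disk")
print(f"That is roughly {percent_through:.1f}% of the way through the drive")
```

Under those assumptions the block sits about 90.6% of the way through the drive, which lines up with the rebuild failing at about 90%.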

Equipment:

Server: Cisco UCS C200 M1
Controller: Intel ICH10R (integrated)
Disk Drives: 2x ST3500320NS (Seagate SATA II 7200RPM 500GB 32MB Buffer)
RAID Configuration: RAID 10

Slot 0 and 1 are the slots for the mirror set in question (and the reference drives)


Operating System - Driver/MRM upgraded *after* issue started:
Operating System: Microsoft Windows Server 2008 R2 SP1 (patched as recently as 3 weeks ago)
Controller Driver: 15.0.2.2013.04.14 (previous 13.x)
Server Software: LSI MegaRaid Monitor 13.04.02.00 (previous 8.5.x)

Firmware (latest HUU from Cisco, applied after the problem started):
Current BIOS: C200.1.4.3k.0 (Build Date: 07/17/2013), (previous 1.4.3x, unsure but probably j)
CIMC: 1.4.3u (previous 1.4.3j)
Question by:DaveQuance

Accepted Solution

by:
David earned 1500 total points
ID: 39622574
If you want all of your data, you need to call in a pro. There is no way I can walk you through a recovery.

But to explain the situation, the surviving disk has unreadable blocks as well. You do have data loss.  Nothing can be done to get what the controller can't read from the disk.

Replace the "surviving" disk as well. Data can't be read from it, so it is also screwed up. Then build the RAID 1 and restore from backup.
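
To illustrate why the rebuild cannot finish, here is a toy model of a mirror rebuild. It is not how the controller firmware actually works (that is not documented in this thread). The table capacity, the function name `rebuild_mirror`, and the use of `None` for an unreadable block are all hypothetical choices made for the sketch.

```python
# Toy model only: real controller firmware is not public and will differ.
BAD_BLOCK_TABLE_SIZE = 4  # hypothetical capacity, kept small for the demo


def rebuild_mirror(source, table_size=BAD_BLOCK_TABLE_SIZE):
    """Copy every block from the surviving disk onto a new disk.

    `source` is a list of blocks; None stands for a block that returns an
    unrecoverable medium error. Returns (target, bad_block_table, completed).
    """
    target = []
    bad_block_table = []
    for lba, block in enumerate(source):
        if block is None:
            # A degraded RAID 1 has no second copy, so this data is gone.
            if len(bad_block_table) >= table_size:
                # Nowhere left to record the bad block: give up.
                return target, bad_block_table, False
            bad_block_table.append(lba)
            target.append(None)
        else:
            target.append(block)
    return target, bad_block_table, True


# Surviving disk with a run of unreadable blocks about 90% of the way in.
surviving = [b"data"] * 90 + [None] * 6 + [b"data"] * 4
target, table, completed = rebuild_mirror(surviving)
print(completed, table, len(target))
```

In this model the rebuild stops as soon as there are more unreadable source blocks than the table can hold, and no replacement target disk changes that, because the errors are on the source.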

The root cause can be anything from bad luck to crappy power. But in the grand scheme of things, with a $2.00 embedded RAID controller and cheap disks that cost about $35 in bulk, this is what you need to expect.
 

Author Comment

by:DaveQuance
ID: 39622594
Understandable. It still boots and all the lab VMs work fine (it was a server we got at a MASSIVE bargain, and it has served nicely for extra servers in larger lab environments). It has also made a cheap backup server.

I'll likely just let it sit as is until the other drive fails or causes a problem significant enough to force a rebuild, then replace it and rebuild. Since either event results in a rebuild, and a sudden failure isn't a big problem, there's no use in forcing the work now.