Raid-5 Array - rebuild gone wrong?

Posted on 2011-03-25
Last Modified: 2013-11-05
Hello all,

Last night I rebuilt my RAID 5 array after one of the disks reported "error". I took the following steps:
1. Replaced the offending disk with a similar one.
2. Rebuilt the array using the MediaShield utility in the BIOS (NVIDIA chipset mobo, hw configured RAID).

Right here things get a little iffy. When I installed the new disk and went into my RAID utility, it showed two arrays, each with one of the disks from the 3-disk RAID 5 array that had failed. Each had the option to rebuild by pressing 'r', but pressing 'r' would kick me back to the previous screen with no confirmation.

I deleted one of the arrays and kept the array that held my first disk in the series, then I hit rebuild and selected both my other disks (in order). In the operating system I can confirm that the MediaShield utility (software on OS not BIOS now) reports that my drive is rebuilding. I left it to run overnight and when I came in this morning it claimed it successfully rebuilt the array.

I try to open the drive, which is now displayed in Explorer, but it won't open. I open up the Disk Management snap-in and notice that 1) it is a 2-partition basic disk, and 2) one partition reports "unallocated" (one of the 1TB HDDs) while the other partition reports RAW (another of the 1TB HDDs).

I have no doubt I can format them and have a functional array, but then my data would be kaput. Is there any way I can still recover my data? Where did I go wrong that I have this current setup?

Thank you,

Valde Edius
Question by:Valde_Edius

Accepted Solution

dlethe earned 500 total points
ID: 35216850
First, RAID 5 with NVIDIA ... you're just looking for trouble. If you aren't using enterprise-class SATA disks whose firmware supports TLER (time-limited error recovery), then you can pretty much expect to lose a lot of data, as it will never work right.

To get going, go to runtime.org and download RAID Reconstructor. It is free to try, pay to buy. You will need a NON-RAID adapter to use it, so that it can see the individual disks.

But it is a lost cause if you continue to use that hardware combo; it will just happen again. You can count on it.

Author Comment

ID: 35218797
Raid Reconstructor (RR) keeps giving me the error "This result is not significant" after the analysis. I did some research on it, and unless I can manually determine all the correct parameters, I would need to pay the makers of RR $300 USD to determine this information for me it seems. Any advice on how to manually determine block size, starting sector, and order?

Expert Comment

ID: 35218956
In a degraded condition it is difficult because you just can't verify whether parity is OK; with a disk missing there is no redundancy left to check it against. So what you have to do is painful. If this is confusing, then pay them or somebody else.

1. Get a binary editor that can display the physical blocks in hex & ASCII.
2. Get a programmer's calculator that has XOR capability, plus an ASCII table.

The technique is to look at the border between stripes for each possible block size. E.g., say the block size is 4: then blocks 0-3 on each physical disk are a stripe, blocks 4-7 are the next stripe, and so on.

Now you look for a string of ASCII text characters that starts on one stripe and ends on another. This will tell you where the strips start and end, so you know the block size.
Then you have to look for patterns to get the drive order.
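As a rough stand-in for the binary editor step, the sketch below prints the bytes on either side of one candidate boundary in hex and ASCII, so you can see whether readable text breaks off there. It assumes 512-byte sectors and a disk image (or raw device) opened as a binary file; `dump_boundary` and its parameters are made-up names for illustration, not part of any tool mentioned in this thread.

```python
import io

SECTOR = 512  # bytes per sector (assumed)


def dump_boundary(disk, strip_sectors, strip_index, context=32):
    """Return hex + ASCII lines for the bytes around one candidate boundary.

    disk is any seekable binary file object, e.g. a disk image opened "rb".
    context should be a multiple of 16 so the boundary starts its own line.
    """
    boundary = strip_index * strip_sectors * SECTOR
    start = max(boundary - context, 0)
    disk.seek(start)
    data = disk.read(boundary - start + context)
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        line = f"{start + off:010x}  {hex_part:<47}  {ascii_part}"
        if start + off == boundary:
            line += "  <-- candidate boundary"
        lines.append(line)
    return lines


# Demo on an in-memory stand-in for a disk: text that stops at byte 2048.
demo = io.BytesIO(b"A" * 2048 + b"\x00" * 2048)
for line in dump_boundary(demo, strip_sectors=4, strip_index=1):
    print(line)
```

Repeating this for each candidate block size (and several `strip_index` values) is the manual search described above; for the 64k block size mentioned later in the thread, `strip_sectors` would be 128.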

But the issue now is that, because of the missing disk, you have to calculate parity, do the ASCII lookup, and remap the missing drive. You need to figure out the ordering, left to right, and where the hole is.
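The "calculate parity" part is plain XOR: in RAID 5, the block on the missing disk equals the XOR of the blocks at the same position on all surviving disks. A minimal sketch with toy data follows; `xor_blocks` is an illustrative helper name, and the byte strings are invented for the demo.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Toy 3-disk stripe: two data strips and the parity a controller would store.
data_a = b"HELLO, RAID!"
data_b = b"second strip"
parity = xor_blocks([data_a, data_b])

# With the disk holding data_a gone, XOR of the survivors gives it back.
recovered = xor_blocks([data_b, parity])
print(recovered)
```

This only works once you know which blocks line up with each other, which is exactly why the block size and drive order have to be worked out first.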

Then you also have some bad blocks, so you have to take statistical samples.

Bottom line: if the info I gave you doesn't make you say "piece-o-cake, I'll crank out a program to automate this, thanks" .... then pay somebody.

Author Comment

ID: 35219132
The data isn't worth the $300 to send it off to someone else; it's all media. I can rescan my entire DVD collection again, and all the music/pictures are backed up. However, since RR attempts a 'brute force' approach to figuring this info out, would it be worth trying to plug the old HDD that failed back in, or is my parity already trash, so that it would produce false positives if anything? I found my block size to be 64k by repeating the process of creating the RAID array, because I know I used the default size, which is 64k in this case. RR could not give me a positive result even though I can guarantee that block size and 1-2-3 ordering. The only unknown is the start block.

Expert Comment

ID: 35220315
I have found RAID Reconstructor to be pretty effective at evaluating the configuration. This is with limited experience (around 5 different RAID sets), though.
I presume that it is doing pretty much what dlethe is suggesting as this shouldn't be too tough for a good programmer to accomplish.  RR appears to try all sorts of combinations and then see if any turn up with data that "makes sense".  In the one instance where it was unsuccessful for me, others were similarly unsuccessful.

If I read the original post correctly, there are three disks in the original array with one failed.  The fact that the original controller doesn't even see the two disks as part of the same array sounds pretty bad to me.  I'd bet that something happened during rebuild to trash one of the two original disks.

Expert Comment

ID: 35222815
The reason that RR fails to identify the configuration is a combination of the two.  I also somewhat oversimplified the algorithm out of professional courtesy as efficient algorithms to determine the topology are unpublished and considered proprietary intellectual property.  But I'll give you a little more and explain what is going on ...

1. Identifying topology is MUCH easier in non-degraded mode. That is because you have redundant blocks, and can then use the parity blocks to ensure that any given block of 512 bytes at offset #n has not been corrupted. When you XOR the same block across all the disks in the array, every one of the 512 x 8 bits will come out all 1s or all 0s. This provides a sanity check and tells you whether you can trust the data in the first place.

W/o parity, then you must read a heck of a lot of data and take an average.
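The XOR sanity check from point 1 could look roughly like the sketch below when all members are present. It assumes each member is available as a seekable binary file object; `parity_checks_out` is a hypothetical helper name, and the three in-memory "disks" are invented test data.

```python
import io


def parity_checks_out(members, offset, length=512):
    """XOR one block at the same offset on every member disk.

    Returns True when the result is all 0x00 bytes, or all 0xFF bytes for a
    controller that stores inverted parity (the "all 1s" case in the post).
    members are seekable binary file objects, one per disk.
    """
    acc = bytearray(length)
    for disk in members:
        disk.seek(offset)
        block = disk.read(length)
        if len(block) != length:
            raise ValueError("short read: offset is past the end of a member")
        for i, byte in enumerate(block):
            acc[i] ^= byte
    return acc == bytes(length) or acc == b"\xff" * length


# Demo: two data blocks plus their XOR parity, then one corrupted copy.
d0 = bytes(range(256)) * 2
d1 = b"\x5a" * 512
p = bytes(a ^ b for a, b in zip(d0, d1))
good = [io.BytesIO(x) for x in (d0, d1, p)]
bad = [io.BytesIO(x) for x in (d0, d1, b"\x00" + p[1:])]
print(parity_checks_out(good, 0), parity_checks_out(bad, 0))
```

With one member missing, as in this thread, there is nothing to feed in for the third disk, so this check is not available and you are left with the sampling approach.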

2. No way to determine if a stripe is looking at metadata or filesystem data w/o parity, unless they do some things that I won't get into.

3. W/O parity, you can't easily identify the proper drive ordering, because it is difficult to figure out which stripe is the parity data for any given slice. Parity moves from disk to disk, starting at an unknown disk #, rotating to the next disk, going left or right, at the RAID block size, also unknown.
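To make the rotation in point 3 concrete, here is a small sketch that enumerates the guesses: for each direction and each starting disk, it lists which member would hold parity on the first few stripes. The function names, the "left"/"right" labels, and the `start` parameter are illustrative assumptions; nothing in this thread says which rotation MediaShield actually uses.

```python
def parity_disk(stripe, n_disks, direction, start):
    """Index of the member holding parity for a given stripe number.

    direction is "left" (parity steps to the lower-numbered disk each stripe)
    or "right" (steps to the higher-numbered disk); start is the disk holding
    parity for stripe 0. Both are guesses to be tested, not known facts.
    """
    if direction == "left":
        return (start - stripe) % n_disks
    if direction == "right":
        return (start + stripe) % n_disks
    raise ValueError("direction must be 'left' or 'right'")


def candidate_rotations(n_disks, stripes=6):
    """Map every (direction, start) guess to its first few parity disks."""
    return {
        (direction, start): [
            parity_disk(s, n_disks, direction, start) for s in range(stripes)
        ]
        for direction in ("left", "right")
        for start in range(n_disks)
    }


for guess, pattern in candidate_rotations(3).items():
    print(guess, pattern)
```

For a 3-disk array that is only six rotation guesses, but each has to be combined with every possible drive order and block size, which is why a brute-force search needs a lot of data to find a result it considers significant.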

Bottom line, the more sophisticated algorithms kick in here. If you have the option, tell RR to search MUCH, MUCH longer, several GB for example. Or, if you are 100% sure of the drive ordering and the block size, then just "teach it". I.e., your RAID controller should still be alive, so just look at the config and tell RR the block size, and you certainly know which disk failed.

You won't know the start/end of the metadata, but if you teach it correctly, you can then figure out where the filesystem partition begins by looking at the raw devices with a binary editor for the file system header. You have a 60% probability of it being in a human-readable layout (i.e. not XORed reconstructed). Then, if you get lucky, teach RR where the partition begins and you are recovered.
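Instead of paging through a binary editor, the search for the file system header can be scripted. The sketch below assumes the volume was NTFS, which the thread never states (it is only a guess based on the Windows setup), and looks for the NTFS boot sector markers: the `NTFS    ` OEM ID at byte 3 and the 0x55 0xAA signature at byte 510. `find_ntfs_boot_sectors` is a made-up helper name, and the demo image is synthetic.

```python
import io

SECTOR = 512  # bytes per sector (assumed)


def find_ntfs_boot_sectors(image, max_sectors):
    """Return sector numbers that look like an NTFS boot sector.

    Checks for the "NTFS    " OEM ID at byte 3 and the 0x55 0xAA marker at
    byte 510. image is a seekable binary file object (raw disk or image file).
    """
    hits = []
    image.seek(0)
    for lba in range(max_sectors):
        sector = image.read(SECTOR)
        if len(sector) < SECTOR:
            break  # ran off the end of the image
        if sector[3:11] == b"NTFS    " and sector[510:512] == b"\x55\xaa":
            hits.append(lba)
    return hits


# Demo: an 8-sector dummy image with a fake boot sector planted at sector 5.
fake_boot = bytearray(SECTOR)
fake_boot[3:11] = b"NTFS    "
fake_boot[510:512] = b"\x55\xaa"
demo = io.BytesIO(bytes(5 * SECTOR) + bytes(fake_boot) + bytes(2 * SECTOR))
print(find_ntfs_boot_sectors(demo, max_sectors=8))
```

A hit on one of the raw member disks gives a sector number on that disk, not in the assembled array, so it still has to be translated using the block size and drive order before it can be handed to RR as the partition start.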

Author Closing Comment

ID: 35245126
Ended up unable to actually recover my data.
