
Disk array failure

Hallonstedt asked
Medium Priority | 798 Views | Last Modified: 2012-08-26
I seem to have lost my file system and I miss it!

A while back mdadm e-mailed me that a disk had failed in my raid-5 device. Seconds later another email arrived that another disk had failed. With two out of six disks malfunctioning, obviously the array failed.

I ran some diagnostics on the hard drives and looked through the log files, and the only conclusion I could come up with was that the disk controller had temporarily failed somehow (the two failed disks were on a separate controller).

I decided to reboot the system and see if it would all fix itself (I have a lot of confidence in mdadm and XFS!). Unfortunately, it did not assemble. I then decided to manually start the array by issuing:
mdadm --create /dev/md1 --level=5 --chunk=128 --raid-devices=6 /dev/sd[abcd]3 /dev/sd[ef]1



Device started. State: Clean, Degraded. The last drive (one of the two that had failed) was listed as Failed, Spare. I removed it and re-added it. The array synced and told me it was clean.

When I tried to mount it, I was informed that there was no file system on the array. Syslog said "XFS: bad magic number"
xfs_check: /dev/md1 is not a valid XFS filesystem (unexpected SB magic number 0x494e81a4)
xfs_repair fails to find a superblock and keeps searching for a secondary with no apparent luck.

Not even xfs_db works; it tells me: /dev/md1 is not a valid XFS filesystem (unexpected SB magic number 0x494e81a4).

I realise that most of you still reading this are about to tell me to give up and restore from backup, and I would agree, if it wasn't for the fact that I have no backup. Lots of valued media (photos, etc.) are gone, and I want to make sure there is no other option before I permanently ruin the data still left on the disks.

The only hope I have left is an XFS header at block 0 of /dev/sdb3. It looks like this:
# dd if=/dev/sdb3 bs=512 count=1 2> /dev/null | hexdump -C
00000000  58 46 53 42 00 00 10 00  00 00 00 00 35 0d 19 e0  |XFSB........5...|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  9d 5c cb dd 05 70 43 ca  9b 64 da eb 53 ef fd 58  |.\...pC..d..S..X|
00000030  00 00 00 00 10 00 00 07  00 00 00 00 00 00 02 00  |................|
00000040  00 00 00 00 00 00 02 01  00 00 00 00 00 00 02 02  |................|
00000050  00 00 00 60 00 fe a5 60  00 00 00 36 00 00 00 00  |...`...`...6....|
00000060  00 00 80 00 bd b4 10 00  01 00 00 10 72 61 69 64  |............raid|
00000070  00 00 00 00 00 00 00 00  0c 0c 08 04 18 00 00 05  |................|
00000080  00 00 00 00 00 04 d0 00  00 00 00 00 00 00 0a ab  |................|
00000090  00 00 00 00 0c 37 22 92  00 00 00 00 00 00 00 00  |.....7".........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 02  00 00 00 20 00 00 00 60  |........... ...`|
000000c0  00 0c 10 00 00 00 10 00  00 00 00 08 00 00 00 08  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
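For what it's worth, that header does decode as a plausible XFS superblock. Here is a quick sanity check (a sketch in Python, using only the first 16 bytes of the dump above; XFS stores its on-disk structures big-endian):

```python
import struct

# First 16 bytes of the hexdump above: sb_magicnum, sb_blocksize, sb_dblocks.
sector = bytes.fromhex("58465342" "00001000" "00000000350d19e0")

magic, blocksize = struct.unpack(">4sI", sector[:8])
(dblocks,) = struct.unpack(">Q", sector[8:16])

assert magic == b"XFSB"        # valid XFS superblock magic
print(blocksize)               # 4096-byte filesystem blocks
print(dblocks * blocksize)     # ~3.6 TB, roughly the size of the array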



I realised that mdadm had assembled the disks in a different order and sdb3 was now disk number 1 rather than 0. I stopped the device and issued another command:
mdadm --create /dev/md1 --assume-clean --level=5 --chunk=128 --raid-devices=6 /dev/sdb3 /dev/sda3 /dev/sdd3 /dev/sdc3 /dev/sdf1 /dev/sde1



Now the array looks like this:
freeport:~# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Fri Aug 10 22:03:45 2012
     Raid Level : raid5
     Array Size : 3560198400 (3395.27 GiB 3645.64 GB)
  Used Dev Size : 712039680 (679.05 GiB 729.13 GB)
   Raid Devices : 6
  Total Devices : 6
    Persistence : Superblock is persistent

    Update Time : Fri Aug 10 22:03:45 2012
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

           Name : freeport:1  (local to host freeport)
           UUID : cdbb881a:52ab1c71:72a71ce4:d8e4f1dc
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       8        3        1      active sync   /dev/sda3
       2       8       51        2      active sync   /dev/sdd3
       3       8       35        3      active sync   /dev/sdc3
       4       8       81        4      active sync   /dev/sdf1
       5       8       65        5      active sync   /dev/sde1



which is the same as before the incident.
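Device order matters for RAID-5 even though the parity rotates: with md's default left-symmetric layout, each stripe's parity chunk and data chunks land in specific member slots, so assembling the same disks in a different slot order changes which chunk is read as data and which as parity. A toy illustration of the rotation rule (not mdadm's actual code, just the left-symmetric mapping):

```python
def left_symmetric(stripe: int, n_disks: int):
    """Return (parity_slot, data_slots) for one stripe in md's default
    RAID-5 'left-symmetric' layout: parity rotates downwards from the
    last slot, and data chunks start on the slot just after parity."""
    parity = (n_disks - 1) - (stripe % n_disks)
    data = [(parity + 1 + i) % n_disks for i in range(n_disks - 1)]
    return parity, data

# With 6 members, stripe 0 puts parity on slot 5 and data on slots 0-4;
# swap two members and every stripe's data/parity interpretation shifts.
print(left_symmetric(0, 6))  # (5, [0, 1, 2, 3, 4])
print(left_symmetric(1, 6))  # (4, [5, 0, 1, 2, 3])
```

This is why re-creating the array with the members back in their original slots was a reasonable instinct, even though by then a resync had already run.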

Unfortunately, no file system is detected on the device despite the modification. Block 0 on the md device shows:
00000000  49 4e 81 a4 02 02 00 00  00 00 03 e8 00 00 03 e8  |IN..............|
00000010  00 00 00 01 00 00 00 00  00 00 00 00 00 00 00 04  |................|
00000020  4b f5 19 64 01 30 31 a7  4b 11 70 f3 00 00 00 00  |K..d.01.K.p.....|
00000030  4c 0c d0 1c 05 b9 81 7f  00 00 00 00 00 00 0e c4  |L...............|
00000040  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 01  |................|
00000050  00 00 00 02 00 00 00 00  00 00 00 00 f5 97 70 f0  |..............p.|
00000060  ff ff ff ff 00 00 00 00  00 00 00 00 00 04 00 00  |................|
00000070  cb 80 00 01 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  49 4e 81 a4 02 02 00 00  00 00 03 e8 00 00 03 e8  |IN..............|
00000110  00 00 00 01 00 00 00 00  00 00 00 00 00 00 00 04  |................|
00000120  4b f5 19 63 35 e0 f7 c0  4b 11 70 f3 00 00 00 00  |K..c5...K.p.....|
00000130  4c 0c d0 1c 05 b9 81 7f  00 00 00 00 00 00 8f 13  |L...............|
00000140  00 00 00 00 00 00 00 09  00 00 00 00 00 00 00 01  |................|
00000150  00 00 00 02 00 00 00 00  00 00 00 00 f5 97 70 f0  |..............p.|
00000160  ff ff ff ff 00 00 00 00  00 00 00 00 00 04 20 88  |.............. .|
00000170  3a 40 00 09 00 00 00 00  00 00 00 00 00 00 00 00  |:@..............|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|



while /dev/sdb3 still has the xfs header.
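Incidentally, the "magic number" 0x494e81a4 that xfs_db complains about is itself a clue: the first two bytes are ASCII "IN", the magic of an XFS on-disk inode, and 0x81a4 is the mode word of a regular file with permissions 0644. In other words, block 0 of the re-created array appears to contain inode records rather than the superblock, which points to the data having been shifted (a different member order, chunk size, or metadata data offset between the old and new array could all do this). A quick decode (my interpretation, not from the thread):

```python
import stat

val = 0x494E81A4                 # the "magic" xfs_db reported at block 0

inode_magic = (val >> 16).to_bytes(2, "big")
mode = val & 0xFFFF

print(inode_magic)               # b'IN' -- XFS inode magic, not the 'XFSB' superblock magic
assert stat.S_ISREG(mode)        # mode 0o100644: a regular file...
print(oct(stat.S_IMODE(mode)))   # ...with permissions 0o644
```

The repeating "IN..81a4" records at 0x100 intervals in the dump above fit this reading too: they look like a cluster of 256-byte inodes.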

If I do a simple dd on the md device and pipe it through strings, I see text that belongs to documents I used to own, but I am not sure that actually means anything.

This is where I ran out of ideas. I have most likely done something wrong along the line. All I ask is that the collective wisdom of this forum help me with either additional tests or ideas, or tell me it is gone and I should cut my losses and move on!

Many thanks in advance,

  Mats

CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
I'm afraid you have lost your data. The mistake was rebuilding the array: that causes a resync, which overwrites the contents of the drive you re-added. You might have had more luck with the assemble command, which would have put the disk back into the array without rebuilding it.

I think the only small chance you would have would be to send the disks to a professional recovery agency, but that is probably prohibitively expensive.

At least you have probably learned from this experience that RAID isn't a substitute for backups. Also, if your disks are large enough, I'd suggest another array type, like RAID 6 or RAID 10, which have more redundancy and can survive more failed disks.

Commented:
rindi has it right.
Also, you can send your disks to http://www.krollontrack.com/
They once restored our RAID-5 array after some "genius" tried to increase the disk space without any backup and failed horribly.

Commented:
Yup, rindi is right.
Even a pro recovery firm may not be able to get your data back, as it appears that not only was a rebuild done, but the disk order was changed and a rebuild run again. If a pro firm is able to reassemble your original RAID and recover some of your data, it will be quite expensive.
Sorry for your loss.  :(
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
I'm not at all sure rindi is right; it really depends on how long the "mdadm --create" process took. A few seconds means it wrote some superblocks to tell the OS what the new disk layout should be, but didn't write all over the disks. A few hours means it overwrote all the data. I recommend leaving the thread open until dlethe posts, although he'll probably tell you that even in the first case he would charge an arm and a leg but could probably recover 99% of it.

Author

Commented:
The initial mdadm --create command never resulted in a sync. It immediately came online, and mdadm -D informed me the array was clean but degraded. One of the two previously failed disks was set as a spare.

I then tried to mount the file system before I removed the spare disk, but already at this stage there was no recognisable file system. This may, in retrospect, be because the disks were assembled in a different order. I never suspected that, as I was under the impression mdadm kept a small database on the partitions to keep track of them, rather than relying on very unreliable device names.

After this, I removed the spare drive, thinking this was perhaps a standard configuration for mdadm when you have six disks, and manually re-added it. This is where the first rebuild happened. Whilst I realise this may be the cause of the state I am in, I fail to understand why mdadm tells me I have a clean array when it obviously is badly broken. What does "clean" really mean?

Anyways, thanks for your comments so far.
President
CERTIFIED EXPERT
Top Expert 2010
Commented:
Hallonstedt - Sorry, but you're looking at a professional recovery. You forced a resync with stale data. Even a professional recovery isn't going to be able to get back files larger than a few hundred KB (depending on the chunk size and fs parameters) for any stripe that was recalculated, though repairing the other disk and using the XOR parity from it can help a professional extrapolate the data needed to reconstruct.

Your only prayer is that this resync ran for only a few seconds or minutes, rather than doing the whole disk. I doubt that happened.

Write it off as a 100% data loss unless you have around $5000 to spend on a recovery.  That is the going rate you can expect to pay.

Ontrack is the only vendor I would trust with this kind of recovery. They do provide free estimates.  Maybe you live right and the data was sparse enough and the data from the repaired HDD can be used to recover.  Get the free estimate.  So sorry.
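The XOR remark above is worth spelling out: in RAID-5, any single missing chunk in a stripe can be rebuilt by XOR-ing the surviving chunks with the parity chunk, which is why a recovery firm can extrapolate data for any stripe that has not been recalculated. A toy sketch of the principle (illustrative only, with 4-byte "chunks"):

```python
from functools import reduce

def xor(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]  # the data chunks of one stripe
parity = xor(data)                  # what RAID-5 stores as the parity chunk

# Lose one data chunk: XOR of the survivors and the parity recovers it.
recovered = xor([data[0], data[2], parity])
assert recovered == data[1]
```

Once a resync has recomputed parity from wrong or stale data, that identity no longer holds for the affected stripes, and this shortcut is gone.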

Author

Commented:
I understand.

The only thing I am not fully aware of is what really happened;
why did mdadm tell me I had a clean RAID array with all disks in sync when they obviously were assembled in the wrong order and the data was corrupt? If they really were in sync, no harm would have been done when I re-added the disk it decided to consider a spare.
David, President
CERTIFIED EXPERT
Top Expert 2010

Commented:
mdadm can deal with disks in the wrong order. It puts a signature on them. The "order" can change all the time if the disks are on a SAN, so it doesn't care about such things. In fact, there technically isn't a right or wrong order; if there were, you could never have a hot-swap system.


As for bad blocks, this also can and will happen all the time. Disks have hundreds of thousands of spare sectors.  Manufacturers expect this to happen.
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
The only way you can be sure of what really happened is to deliberately do the same thing again and document what you did.
noci, Software Engineer
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
You create NEW arrays with --create, which writes new headers to the disks.
To use an existing array normally, you assemble it with --assemble.
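For future reference, the non-destructive route after a controller hiccup would look something like this (a hedged sketch; the device names are taken from this thread and would need adjusting, and --force should only be used once you understand why plain assembly refused):

```shell
# Inspect each member's md superblock first -- read-only, always safe:
mdadm --examine /dev/sd[abcd]3 /dev/sd[ef]1

# Assemble the EXISTING array from its superblocks (no resync, no new headers):
mdadm --assemble /dev/md1 /dev/sd[abcd]3 /dev/sd[ef]1
# ...optionally with --force if the members disagree only by event count.
```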

The home block (md superblock) is most probably overwritten in the process, but NOT the entire disk, although parity information might be modified.

You really need some professional help. If the order was the same before and after the create, the parity info will not overwrite any live data; if the order is different, all bets are off.

If you have an old boot log (from the last time the system booted correctly), that might help determine the right order. The mdadm mails might be helpful too.
And don't do anything to the disks yourself. A professional outfit MIGHT be able to repair this, but only if the drives are as close to their state at the time of failure as possible.
Anything other than reading the disks will most probably do more harm than good.
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
I'd have said noci's post was far more useful as an answer.
