Solved

Disk array failure

Posted on 2012-08-16
Last Modified: 2012-08-26
I seem to have lost my file system and I miss it!

A while back mdadm e-mailed me that a disk had failed in my raid-5 device. Seconds later another email arrived that another disk had failed. With two out of six disks malfunctioning, obviously the array failed.

I ran some diagnostics on the hard drives and looked in the logfiles, and the only conclusion I could come up with was that the disk controller had temporarily failed somehow (the two failed disks were on a separate controller).

I decided to reboot the system and see if it would all fix itself (I have a lot of confidence in mdadm and XFS!). Unfortunately, it did not assemble. I then decided to start the array manually by issuing:
mdadm --create /dev/md1 --level=5 --chunk=128 --raid-devices=6 /dev/sd[abcd]3 /dev/sd[ef]1

Device started. State: Clean, Degraded. The last drive (one of the failed) was listed as Failed, Spare. I removed it and re-added it. Array synced and told me it was clean.

When I mounted the system I was informed that there was no file system on the array. Syslog informed me "XFS: bad magic number"
xfs_check: /dev/md1 is not a valid XFS filesystem (unexpected SB magic number 0x494e81a4)
xfs_repair fails to find a superblock and keeps searching for a secondary with no apparent luck.

Not even xfs_irepair works and tells me; xfs_db: /dev/md1 is not a valid XFS filesystem (unexpected SB magic number 0x494e81a4).
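
The reported magic number can be decoded by hand, which gives a hint about what is actually sitting in block 0. This is only a sketch: it assumes the value is read as big-endian bytes (the way XFS stores its on-disk structures), and the function name `decode_magic` is made up for illustration.

```shell
#!/bin/sh
# decode_magic HEX8: print the first two bytes of a 32-bit value as ASCII
# and the last two bytes as an octal number.
decode_magic() {
    hi=${1%????}                       # first four hex digits, e.g. 494e
    lo=${1#????}                       # last four hex digits, e.g. 81a4
    c1=$(printf '%03o' "0x${hi%??}")   # first byte as an octal escape
    c2=$(printf '%03o' "0x${hi#??}")   # second byte as an octal escape
    printf "\\$c1\\$c2 %o\n" "0x$lo"
}

decode_magic 494e81a4
```

For 0x494e81a4 this prints `IN 100644`. "IN" is the magic of an XFS on-disk inode and 0100644 is the mode of a regular file with rw-r--r-- permissions, so block 0 of the md device may well hold inode data rather than a superblock. That would suggest the data is present but not where the filesystem expects it.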

I realise that most of you still reading this are about to tell me to give up and restore the backup and I would agree if it wasn't for the fact that I have no backup. Lots of valued media (photos etc) are gone and I want to make sure there is no other option before I permanently ruin the data still left on the disks.

The only hope I have left is an XFS header on block 0 on /dev/sdb3. It looks like this;
# dd if=/dev/sdb3 bs=512 count=1 2> /dev/null | hexdump -C
00000000  58 46 53 42 00 00 10 00  00 00 00 00 35 0d 19 e0  |XFSB........5...|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  9d 5c cb dd 05 70 43 ca  9b 64 da eb 53 ef fd 58  |.\...pC..d..S..X|
00000030  00 00 00 00 10 00 00 07  00 00 00 00 00 00 02 00  |................|
00000040  00 00 00 00 00 00 02 01  00 00 00 00 00 00 02 02  |................|
00000050  00 00 00 60 00 fe a5 60  00 00 00 36 00 00 00 00  |...`...`...6....|
00000060  00 00 80 00 bd b4 10 00  01 00 00 10 72 61 69 64  |............raid|
00000070  00 00 00 00 00 00 00 00  0c 0c 08 04 18 00 00 05  |................|
00000080  00 00 00 00 00 04 d0 00  00 00 00 00 00 00 0a ab  |................|
00000090  00 00 00 00 0c 37 22 92  00 00 00 00 00 00 00 00  |.....7".........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 02  00 00 00 20 00 00 00 60  |........... ...`|
000000c0  00 0c 10 00 00 00 10 00  00 00 00 08 00 00 00 08  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
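
The first sixteen bytes of that dump are enough to work out how large the filesystem believes it is. A minimal sketch, assuming the standard XFS superblock layout (big-endian 4-byte magic, 4-byte block size, 8-byte data block count); the function name `xfs_sb_size_kib` is hypothetical and the function only reads from the device or file it is given.

```shell
#!/bin/sh
# xfs_sb_size_kib FILE_OR_DEVICE: print the filesystem size in KiB as
# recorded in an XFS superblock found at offset 0 of FILE_OR_DEVICE.
xfs_sb_size_kib() {
    # First 16 bytes as one continuous hex string.
    hex=$(od -An -v -tx1 -N16 "$1" | tr -d ' \n')
    magic=$(printf '%s' "$hex" | cut -c1-8)
    if [ "$magic" != "58465342" ]; then        # "XFSB"
        echo "no XFSB magic at offset 0 of $1" >&2
        return 1
    fi
    bs_hex=$(printf '%s' "$hex" | cut -c9-16)  # block size in bytes
    db_hex=$(printf '%s' "$hex" | cut -c17-32) # number of data blocks
    echo $(( 0x$bs_hex * 0x$db_hex / 1024 ))
}

# Example (read-only): xfs_sb_size_kib /dev/sdb3
```

For the bytes shown above the block size is 0x1000 = 4096 and the block count is 0x350d19e0 = 890051040, which gives 3560204160 KiB. That is 5760 KiB more than the Array Size that mdadm -D reports for the re-created array, so the new array may not have the same geometry as the original (for example a different metadata version or data offset). This is an inference from the numbers, not something confirmed in the thread.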

I realised that mdadm had assembled the disks in a different order and sdb3 was now disk number 1 rather than 0. I stopped the device and issued another command:
mdadm --create /dev/md1 --assume-clean --level=5 --chunk=128 --raid-devices=6 /dev/sdb3 /dev/sda3 /dev/sdd3 /dev/sdc3 /dev/sdf1 /dev/sde1
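
Member order matters with --create because md derives the location of every chunk from each member's position in the array. The placement rule can be sketched as follows, assuming the left-symmetric RAID-5 layout that mdadm -D reports for this array; the function names are illustrative only.

```shell
#!/bin/sh
# parity_disk STRIPE NDISKS: member index that holds parity for a stripe.
parity_disk() {
    echo $(( ($2 - 1) - ($1 % $2) ))
}

# data_disk STRIPE DATA_INDEX NDISKS: member index that holds the
# DATA_INDEX-th data chunk (0-based) of a stripe. In the left-symmetric
# layout, data starts on the member right after the parity member.
data_disk() {
    p=$(parity_disk "$1" "$3")
    echo $(( (p + 1 + $2) % $3 ))
}

# First three stripes of a 6-member array.
for s in 0 1 2; do
    echo "stripe $s: parity on member $(parity_disk "$s" 6), first data chunk on member $(data_disk "$s" 0 6)"
done
```

In stripe 0 the parity lands on member 5 and the first data chunk on member 0, which is why the filesystem superblock is expected at the start of the data area of the first member. Swapping two members on the --create command line moves every chunk they hold to a different logical position.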

Now the array looks like this;
freeport:~# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Fri Aug 10 22:03:45 2012
     Raid Level : raid5
     Array Size : 3560198400 (3395.27 GiB 3645.64 GB)
  Used Dev Size : 712039680 (679.05 GiB 729.13 GB)
   Raid Devices : 6
  Total Devices : 6
    Persistence : Superblock is persistent

    Update Time : Fri Aug 10 22:03:45 2012
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

           Name : freeport:1  (local to host freeport)
           UUID : cdbb881a:52ab1c71:72a71ce4:d8e4f1dc
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       8        3        1      active sync   /dev/sda3
       2       8       51        2      active sync   /dev/sdd3
       3       8       35        3      active sync   /dev/sdc3
       4       8       81        4      active sync   /dev/sdf1
       5       8       65        5      active sync   /dev/sde1

which is the same as before the incident.
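
The sizes in that output can be cross-checked with a line of arithmetic: RAID 5 spends one member's worth of space on parity, so the usable size should be the number of members minus one, times the Used Dev Size. A small sketch (the function name is made up):

```shell
#!/bin/sh
# raid5_size_kib NDISKS USED_DEV_KIB: usable RAID-5 capacity in KiB.
raid5_size_kib() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_size_kib 6 712039680
```

This prints 3560198400, matching the Array Size line. Note that matching size and member order do not by themselves show that the data starts at the same offset on each member as it did in the original array.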

Unfortunately, no file system is detected on the device despite the modification. Block 0 on the md device shows;
00000000  49 4e 81 a4 02 02 00 00  00 00 03 e8 00 00 03 e8  |IN..............|
00000010  00 00 00 01 00 00 00 00  00 00 00 00 00 00 00 04  |................|
00000020  4b f5 19 64 01 30 31 a7  4b 11 70 f3 00 00 00 00  |K..d.01.K.p.....|
00000030  4c 0c d0 1c 05 b9 81 7f  00 00 00 00 00 00 0e c4  |L...............|
00000040  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 01  |................|
00000050  00 00 00 02 00 00 00 00  00 00 00 00 f5 97 70 f0  |..............p.|
00000060  ff ff ff ff 00 00 00 00  00 00 00 00 00 04 00 00  |................|
00000070  cb 80 00 01 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  49 4e 81 a4 02 02 00 00  00 00 03 e8 00 00 03 e8  |IN..............|
00000110  00 00 00 01 00 00 00 00  00 00 00 00 00 00 00 04  |................|
00000120  4b f5 19 63 35 e0 f7 c0  4b 11 70 f3 00 00 00 00  |K..c5...K.p.....|
00000130  4c 0c d0 1c 05 b9 81 7f  00 00 00 00 00 00 8f 13  |L...............|
00000140  00 00 00 00 00 00 00 09  00 00 00 00 00 00 00 01  |................|
00000150  00 00 00 02 00 00 00 00  00 00 00 00 f5 97 70 f0  |..............p.|
00000160  ff ff ff ff 00 00 00 00  00 00 00 00 00 04 20 88  |.............. .|
00000170  3a 40 00 09 00 00 00 00  00 00 00 00 00 00 00 00  |:@..............|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

while /dev/sdb3 still has the xfs header.

If I do a simple dd on the md device and pipe it through strings, I see text that belongs to documents I used to own, but I am not sure that actually means anything.

This is where I ran out of ideas. I have most likely done something wrong along the line. All I ask is that the collective wisdom that is this forum help me with either additional tests or ideas, or inform me that it is gone. Cut your losses and move on!

Many thanks in advance,

  Mats
Question by:Hallonstedt
11 Comments
 
LVL 88

Expert Comment

by:rindi
ID: 38300426
I'm afraid you have lost your data. The mistake you made was to try rebuilding the array: that causes a resync, which replaces the contents of the drive you tried to re-add. You might have had more luck if you had used the assemble command, which would have added the disk back to the array without rebuilding it.

I think the only small chance you would have would be to send the disks to a professional recovery agency, but that is probably prohibitively expensive.

At least you have probably learned from this experience that having RAID isn't a substitute for backups. Also, if your disks are large enough, I'd suggest another array type, like RAID 6 or RAID 10, which have more redundancy and can survive more disk failures.
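
The assemble route mentioned above could look roughly like the following. This is a cautious sketch, not a tested recovery procedure: the device names are taken from the question, the variable and function names are made up, and by default the script only prints the commands instead of running them. It also only applies while the original md superblocks are still on the disks, that is, before any --create has been run.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 only after reviewing the printed plan.
DRY_RUN=${DRY_RUN:-1}
MD=${MD:-/dev/md1}
MEMBERS=${MEMBERS:-"/dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde1 /dev/sdf1"}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan_assemble() {
    # Read-only: print each member's md superblock (role, event count, state).
    for dev in $MEMBERS; do
        run mdadm --examine "$dev"
    done
    # Reuse the existing superblocks instead of writing new ones.
    # --force accepts members whose metadata looks slightly out of date;
    # --readonly starts the array without allowing writes or a resync.
    run mdadm --assemble --force --readonly "$MD" $MEMBERS
}

plan_assemble
```

With --assemble the order of the devices on the command line does not matter, because each member's role is read from its own superblock. Saving the --examine output before doing anything else also leaves a record of the original order, chunk size and metadata version.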
 
LVL 7

Expert Comment

by:multimac
ID: 38300927
rindi has it right.
Also you can send your disks to http://www.krollontrack.com/
They once restored our raid5 array after some "genius" tried to increase the disk space without any backup and failed horribly.
 
LVL 4

Expert Comment

by:thegu99
ID: 38300958
yup rindi is right.
Even a pro recovery firm may not be able to get back your data as it appears that not only was a rebuild done, but the disk order has been changed and rebuilt again.  If a pro firm is able to reassemble your original RAID and recover some of your data, it will be quite expensive.
Sorry for your loss.  :(
 
LVL 56

Expert Comment

by:andyalder
ID: 38302185
I'm not at all sure Rindi is right; it really depends on how long the "mdadm --create" process took. A few seconds means it wrote some superblocks to tell the OS what the new disk layout was to be, but didn't write all over the disks. A few hours means it overwrote all the data. I recommend leaving the thread open until dlethe posts, although he'll probably tell you that even in case 1 he would charge an arm and a leg but could probably recover 99% of it.
 

Author Comment

by:Hallonstedt
ID: 38303920
The initial mdadm --create command never resulted in a sync. It immediately came online and mdadm -D informed me the array was clean but degraded. One of the two previously failed disks was set as spare.

I then tried to mount the file system before I removed the spare disk, but already at this stage there was no recognisable file system. This may, in retrospect, be because the disks were assembled in a different order. I never suspected that, as I was under the impression that mdadm kept a small database on the partitions to keep track of them rather than relying on a very unreliable device name.

After this, I removed the spare drive, thinking this was perhaps a standard configuration for mdadm when you have 6 disks, and manually re-added it. This is where the first rebuild happened. Whilst I realise this may be the cause of the state I am in, I fail to understand why mdadm tells me I have a clean array when it obviously is badly broken. What does Clean really mean?

Anyways, thanks for your comments so far.
 
LVL 47

Accepted Solution

by:
David earned 1500 total points
ID: 38329298
Hallonstedt - Sorry, but you're looking at a professional recovery.  You forced a resync with stale data.  Even a professional recovery isn't going to be able to get files that are larger than a few hundred KB (depending on the chunk size and fs parameters) for any stripe that was recalculated. Repairing the other disk and using the XOR parity from it can, however, help a professional extrapolate the data needed for reconstruction.

Your only prayer is if this resync ran for only a few seconds or minutes, rather than doing the whole disk.  I doubt that happened.

Write it off as a 100% data loss unless you have around $5000 to spend on a recovery.  That is the going rate you can expect to pay.

Ontrack is the only vendor I would trust with this kind of recovery. They do provide free estimates.  Maybe you live right and the data was sparse enough and the data from the repaired HDD can be used to recover.  Get the free estimate.  So sorry.
 

Author Comment

by:Hallonstedt
ID: 38330950
I understand.

The only thing I am not fully aware of is what really happened;
why did mdadm tell me I had a clean raid array with all disks in sync when they obviously were assembled in the wrong order and the data was corrupt? If they really were in sync, no harm would have been done when I added the disk it had decided to consider a spare.
 
LVL 47

Expert Comment

by:David
ID: 38331086
mdadm can deal with disks in wrong order.  It puts a signature on them.  The "order" can change all the time if the disks are on a SAN, so it doesn't care about such things.  In fact, there technically isn't a wrong or right order.  If there was, you could never have a hot-swap system.


As for bad blocks, this also can and will happen all the time. Disks have hundreds of thousands of spare sectors.  Manufacturers expect this to happen.
 
LVL 56

Expert Comment

by:andyalder
ID: 38331407
The only way you can be sure of what really happened is to deliberately do the same thing again and document what you did.
 
LVL 41

Expert Comment

by:noci
ID: 38332610
You create NEW arrays with --create, which writes new disk headers to the disks.
To use an existing array normally, you assemble it with --assemble.

The home block was most probably overwritten in the process, but NOT the entire disk.
[ Although parity information might have been modified. ]

You really need some professional help. [ If the order was the same before & after the create, the parity info will not overwrite any live data. If the order is different, well, all bets are off. ]

If you have an old boot log [ from the last time the system booted correctly ], that might help determine the right order. The mdadm mail might be helpful too.
And don't do anything to the disks yourself. A professional firm MIGHT be able to repair this, but only if the drives are as close as possible to the state they were in at the time of failure...
And anything other than reading the disks will most probably do more harm than good.
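
One way to stick to "reading only" is to mark the members read-only and copy them to image files before any further experiments. The following is a sketch under assumptions: `/mnt/backup` is a hypothetical destination with enough free space, the device names come from the question, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
IMG_DIR=${IMG_DIR:-/mnt/backup}   # hypothetical destination directory
MEMBERS=${MEMBERS:-"/dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde1 /dev/sdf1"}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan_images() {
    for dev in $MEMBERS; do
        # Ask the kernel to refuse writes to this block device.
        run blockdev --setro "$dev"
        # Raw copy; noerror,sync keeps going past unreadable blocks and
        # pads them so that offsets in the image stay aligned.
        run dd if="$dev" of="$IMG_DIR/$(basename "$dev").img" bs=1M conv=noerror,sync
    done
}

plan_images
```

Any further attempts at re-creating the array can then be made against copies of the images rather than the original disks.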
 
LVL 56

Expert Comment

by:andyalder
ID: 38334590
I'd have said noci's post was far more useful as an answer.