Data recovery from possibly corrupted RAID5 LVM2 ext4 filesystem

angeloio asked:
OK. First of all, let's set the scene:

In the Intel SS4200 NAS box, 4 drives (2 TB each) had been installed in a RAID5 configuration. It worked for a while as a Samba server, then hardware problems started, and we decided to replace the hardware completely.

So I built an Ubuntu 9.04 server on an Intel motherboard. I used one ATA drive for the root filesystem, and each of the 4 previous HDDs is connected to its own SATA port.
uname -a reports
Linux NAS 2.6.28-17-server #58-Ubuntu SMP Tue Dec 1 19:58:28 UTC 2009 i686 GNU/Linux
Also:
#cat lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.04
DISTRIB_CODENAME=jaunty
DISTRIB_DESCRIPTION="Ubuntu 9.04"

The RAID5 array was detected and rebuilt.
Now /sbin/mdadm --detail /dev/md0 reports:
root@NAS:/etc#  /sbin/mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Wed Jan 27 22:06:31 2010
     Raid Level : raid5
     Array Size : 5860535808 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Feb  1 00:11:26 2010
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 42dcb4dd:20227bfb:cced5de7:ca715931 (local to host NAS)
         Events : 0.44

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1

(NOTE: The physical order of the drives has changed, since the motherboard itself was changed. However, I assume that because the superblock is persistent this did NOT corrupt the data... Please correct me if I am wrong about this.)
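
For completeness, a read-only way to double-check that assumption would be to look at each member's RAID superblock; the device names below are just the ones from the --detail output above, and mdadm --examine only reads metadata:

for d in /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1; do
    echo "== $d =="
    mdadm --examine "$d" | egrep 'UUID|Raid Level|this'
done

If all four members report the same array UUID and each records its own slot in the device table, the assembly used the superblock information rather than the cable order.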

Now I discovered that the Intel SS4200 box had apparently created an LVM2 volume on top of the RAID:

root@NAS:/etc# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "md0Container" using metadata type lvm2

and

root@NAS:/etc# lvs
  LV        VG           Attr   LSize Origin Snap%  Move Log Copy%  Convert
  md0Region md0Container -wi-a- 5.46T

root@NAS:/etc# pvs
  PV         VG           Fmt  Attr PSize PFree
  /dev/md0   md0Container lvm2 a-   5.46T    0


(However:
root@NAS:/etc# fdisk -l /dev/md0

Disk /dev/md0: 6001.1 GB, 6001188667392 bytes
2 heads, 4 sectors/track, 1465133952 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table
...I don't know whether this constitutes a problem or not.)
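
A quick, read-only way to see what signatures actually sit on the md device and on the logical volume (nothing here modifies the disks; the device paths are the ones already shown above):

# the raw md device should carry an LVM2 physical-volume signature, not a partition table
blkid /dev/md0
file -s /dev/md0

# and the logical volume should (ideally) show an ext3/ext4 superblock signature
blkid /dev/md0Container/md0Region
file -s /dev/mapper/md0Container-md0Region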

Now I tried to mount the LVM2 volume:
root@NAS:/etc# mount /dev/md0Container/md0Region /mnt
mount: wrong fs type, bad option, bad superblock on /dev/mapper/md0Container-md0Region,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
dmesg reports
[97848.417567] EXT4-fs warning (device dm-0): ext4_fill_super: extents feature not enabled on this filesystem, use tune2fs.
[97848.417574]
[97848.417577] EXT4-fs: dm-0: couldn't mount because of unsupported optional features (2000000).

I tried
tune2fs -O ^extents /dev/md0Container/md0Region
(I don't know whether this is the correct command or not...)
But
root@NAS:/etc# tune2fs -l /dev/md0Container/md0Region
tune2fs 1.41.9 (22-Aug-2009)
tune2fs: Filesystem revision too high while trying to open /dev/md0Container/md0Region
Couldn't find valid filesystem superblock.

I tried almost everything. First, mke2fs with -n (which only prints what would be done; nothing is written) to get the backup superblock locations:
mke2fs -t ext4 -n /dev/md0Container/md0Region
mke2fs 1.41.9 (22-Aug-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
366288896 inodes, 1465131008 blocks
73256550 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
44713 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776, 644972544
and then:
root@NAS:/etc# e2fsck -b 98304 /dev/md0Container/md0Region
e2fsck 1.41.9 (22-Aug-2009)
e2fsck: Bad magic number in super-block while trying to open /dev/md0Container/md0Region

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

THIS HAPPENS WITH ALL OF THE BACKUP SUPERBLOCK LOCATIONS!
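
I am not sure whether this matters, but the block numbers printed by mke2fs -n are only valid for the 4096-byte block size it chose, so perhaps e2fsck needs to be told the block size explicitly; the ext magic bytes could also be looked for by hand:

# pass the block size along with the backup superblock location
e2fsck -B 4096 -b 32768 /dev/md0Container/md0Region

# the primary superblock starts 1024 bytes into the volume; its magic number
# is the byte pair "53 ef" at offset 0x38 -- if it isn't there, the superblock really is gone
dd if=/dev/md0Container/md0Region bs=1024 skip=1 count=1 2>/dev/null | hexdump -C | sed -n '4p'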

I TRIED EVERYTHING I KNOW:
root@NAS:/etc# dumpe2fs /dev/md0Container/md0Region
dumpe2fs 1.41.9 (22-Aug-2009)
dumpe2fs: Filesystem revision too high while trying to open /dev/md0Container/md0Region
Couldn't find valid filesystem superblock.

...and I don't know any other way to recover the volume.
What I would like to do is somehow recover the files that were written to the volume while the disks were still in the Intel SS4200 NAS box.
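
One thing I am considering before trying anything more invasive is to image the logical volume and experiment on the copy instead of the real thing; a rough sketch (the target path is made up, and it would obviously need about 5.5 TB of free space):

# raw image of the LV; conv=noerror,sync keeps going past read errors
dd if=/dev/md0Container/md0Region of=/some/big/disk/md0Region.img bs=1M conv=noerror,sync

# later attempts (e2fsck, testdisk, ...) can then be pointed at the image
# through a loop device instead of the real volume
losetup /dev/loop0 /some/big/disk/md0Region.img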

Any help would be greatly appreciated.

Many Thanks to all!

Commented:
The only thing I can think of is to try RAID Reconstructor. It's free to try, and if it can get at the data, then you need to buy it.
http://www.runtime.org/raid.htm

If you don't have a Windows system to hook the drives to, there is a link near the bottom of that page on how to make a boot CD.

Commented:
Testdisk may be worth a try:  http://www.cgsecurity.org/wiki/TestDisk
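
If you go the TestDisk route, it can be pointed straight at the logical volume (or at an image of it); roughly, and reusing the device path from the question:

# interactive search for lost filesystems / partitions on the LV
testdisk /dev/md0Container/md0Region

# photorec (ships with testdisk) can carve files out even without a valid superblock
photorec /dev/md0Container/md0Region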
noci, Software Engineer, Distinguished Expert 2018, commented:
An md device is treated as a single partition, so it has no partition table of its own; fdisk SHOULD fail on it.

The first poster probably missed that this is a Linux software RAID solution and not a Windows/BIOS-based one. I am not sure that RAID Reconstructor understands the disk layout.

There are two flavours of ext4: the plain ext4 and a 'compatibility' one called ext4dev. ext4dev is used by people who started with ext4 while it was still 'development' code.

Also, the mount complains about 'extents'...
Classic ext2/ext3 stores lists of blocks in a cascading tree of block numbers.
If extents are enabled, a list of regions on disk is kept instead (not a list of block numbers, but of block-number/block-count pairs).
Kernels that are too old simply don't know the new concept, and it is a very BASIC difference that makes the filesystem incompatible with older Linux versions.
And you can't just turn the feature off; you would need to replace every extent list with a block list...
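
A quick, read-only check along these lines (just a sketch; whether ext4dev is even built for the stock Jaunty kernel is an assumption):

# which ext* drivers does the running kernel know about?
grep ext /proc/filesystems

# if ext4dev is available, a read-only mount attempt costs nothing
modprobe ext4dev 2>/dev/null
mount -t ext4dev -o ro /dev/md0Container/md0Region /mnt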

Then:
After mke2fs I doubt you will see any data on the disk. mke2fs is a filesystem formatter, i.e. it creates an empty storage structure on your disk... effectively removing everything.
(Data that isn't overwritten is still there, but all control structures are set up empty.)

If you want to start out from scratch (empty disks):
dd if=/dev/zero of=/dev/sdxx bs=128K count=1
This clears the partition table on the sdxx device and should also hit the first block of the filesystems involved.
Then you should be able to construct a new RAID set and add all the disks again using mdadm.
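
For the record, rebuilding from scratch would look roughly like the sketch below; the device names, chunk size and layout are the ones from the --detail output earlier, and the LVM names simply reuse the ones the SS4200 chose, so treat it as a sketch rather than a recipe (and it is of course destructive):

# DESTRUCTIVE: only once the old data is no longer needed
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
      --chunk=64 --layout=left-symmetric \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# then LVM and a filesystem on top, as before
pvcreate /dev/md0
vgcreate md0Container /dev/md0
lvcreate -l 100%FREE -n md0Region md0Container
mkfs.ext4 /dev/md0Container/md0Region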
Author commented:
Dear noci,

Many, many thanks for your detailed answer!
One thing I did not understand, though: is it the Linux version (Ubuntu 9.04) that does not understand the 'extents' feature of the filesystem? In other words, should I upgrade a module or the kernel so that the extents feature is recognized? And, in your (vast, as it seems) experience, is this the reason I cannot mount the filesystem?
Please keep in mind that my priority here is to recover the data. After that I can do whatever I want with the volume/filesystem.

Many thanks again for your help.
Regards,
Ioannis
noci, Software Engineer, Distinguished Expert 2018, commented:
Yes, it's an advanced feature, so you would need a recent kernel; ext4 support arrived by default in 2.6.29 as a supported filesystem instead of a development one.
And as noted before, there is a small but significant difference between ext4 and ext4dev.

If you get a recent kernel (2.6.31, for example) that has all the options enabled, it should work. You can build a vanilla kernel from source if needed.
I just hope you didn't wreck it with the mke2fs.
(A newer kernel probably also has some wrinkles ironed out.)
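
If it does come to booting a newer kernel or live CD, mounting read-only first (and skipping journal replay) keeps the attempt harmless; a sketch, assuming the same device path:

# read-only, and don't replay the journal, so nothing gets written to the volume
mount -t ext4 -o ro,noload /dev/md0Container/md0Region /mnt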

Commented:
>> I am not sure that the raid reconstructor understands the disk layout.

FWIW, from their FAQ: http://www.runtime.org/raid-reconstructor-faq.htm#filesystem
"RAID Reconstructor can reconstruct any file system provided the array is not of a proprietary order and the start sectors are the same across all the drives in the array."

Author commented:
The filesystem was corrupted beyond all recovery, possibly due to multiple hot reboots while the SS4200 NAS was live. I did not know that a hot reboot could do this to a RAID system, but other forum posts agree that it can, even with journaling filesystems like ext3/ext4. The SS4200 NAS does NOT natively use ext4 but ext3; due to the corruption, Ubuntu detected it as ext4.

Thanks to all for the responses
