Solved

Problem with JFS superblocks - not in the standard location?

Posted on 2010-08-16
Last Modified: 2012-05-10
Brief setup info...

Ubuntu 9.10
/dev/md3 is a RAID 5 array of 5x 500GB disks. Total size is 1.8TB. It has a JFS filesystem on it.

I cannot get the JFS filesystem to be recognised by any of the tools: mount, fsck.jfs, jfs_debugfs, jfs_tune.

From the fsck.jfs output it seems like it cannot find the superblocks. If I view the disk in lde (Linux Disk Editor) I can find the superblocks manually, but they don't appear to be in the usual places.

Assuming the normal 4k blocksize...

The first superblock is at block 24 instead of the usual block 8 (byte offset 0x8000)
The second superblock is at block 31 instead of the usual block 15 (byte offset 0xF000)

But jfs_debugfs cannot find either of them, and is also showing an Aggregate Block Size: 256 for some reason.

I believe the data itself is intact, as I can find large chunks of it further up /dev/md3 via lde.
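
The manual search done in lde can also be scripted. Below is a minimal sketch, assuming each superblock begins with the 4-byte magic 'JFS1' on a 4 KiB block boundary. The function name find_jfs_magic and the 1 MiB search limit are made up for illustration, and the demonstration runs on an in-memory stand-in rather than on /dev/md3.

```python
import io

JFS_MAGIC = b"JFS1"   # magic string a healthy JFS superblock starts with
BLOCK_SIZE = 4096     # the "normal 4k blocksize" assumed in the question

def find_jfs_magic(dev, limit=1 << 20, step=BLOCK_SIZE):
    """Return byte offsets below `limit` where a block starts with 'JFS1'."""
    hits = []
    for offset in range(0, limit, step):
        dev.seek(offset)
        if dev.read(len(JFS_MAGIC)) == JFS_MAGIC:
            hits.append(offset)
    return hits

# In-memory stand-in for the array, with the magic planted at blocks 24 and 31.
image = bytearray(1 << 20)
image[24 * BLOCK_SIZE:24 * BLOCK_SIZE + 4] = JFS_MAGIC
image[31 * BLOCK_SIZE:31 * BLOCK_SIZE + 4] = JFS_MAGIC

hits = find_jfs_magic(io.BytesIO(bytes(image)))
for offset in hits:
    print(hex(offset), "block", offset // BLOCK_SIZE)
```

On a real system the same function could be pointed at the device opened read-only, for example open("/dev/md3", "rb") as root. It only reads, so it should be safe to run against a damaged array.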

Any help getting this filesystem mounted again would be greatly appreciated.
$ sudo fsck.jfs /dev/md3

fsck.jfs version 1.1.12, 24-Aug-2007
processing started: 8/16/2010 18.23.12
Using default parameter: -p
The current device is:  /dev/md3

The superblock does not describe a correct jfs file system.

If device /dev/md3 is valid and contains a jfs file system,
then both the primary and secondary superblocks are corrupt
and cannot be repaired, and fsck cannot continue.

Otherwise, make sure the entered device /dev/md3 is correct.

jfs_debugfs version 1.1.12, 24-Aug-2007

Aggregate Block Size: 256

> su p
[1] s_magic:            '    '          [15] s_ait2.addr1:      0xff
[2] s_version:          0               [16] s_ait2.addr2:      0xffffffff
[3] s_size:     0x0000000000008000           s_ait2.address:    1099511627775
[4] s_bsize:            256             [17] s_logdev:          0xffffffff
[5] s_l2bsize:          8               [18] s_logserial:       0xffffffff
[6] s_l2bfactor:        0               [19] s_logpxd.len:      16777215
[7] s_pbsize:           85              [20] s_logpxd.addr1:    0xff
[8] s_l2pbsize:         4               [21] s_logpxd.addr2:    0xffffffff
[9] pad:                Not Displayed        s_logpxd.address:  1099511627775
[10] s_agsize:          0xffffff05      [22] s_fsckpxd.len:     16777215
[11] s_flag:            0xffffffff      [23] s_fsckpxd.addr1:   0xff
                JFS_OS2 JFS_LINUX       [24] s_fsckpxd.addr2:   0xffffffff
        JFS_COMMIT      JFS_GROUPCOMMIT      s_fsckpxd.address: 1099511627775
        JFS_LAZYCOMMIT  JFS_INLINELOG   [25] s_time.tv_sec:     0xffffffff
        JFS_BAD_SAIT    JFS_SPARSE      [26] s_time.tv_nsec:    0xffffffff
        DASD_ENABLED    DASD_PRIME      [27] s_fpack:           '¦¦¦¦¦¦¦¦¦¦¦'
[12] s_state:           0xffffffff
        Unknown State
[13] s_compress:        -1
[14] s_ait2.len:        16777215

display_super: [m]odify or e[x]it: x
> su s
[1] s_magic:            '    '          [15] s_ait2.addr1:      0xff
[2] s_version:          0               [16] s_ait2.addr2:      0xffffffff
[3] s_size:     0x0000000000016000           s_ait2.address:    1099511627775
[4] s_bsize:            256             [17] s_logdev:          0xffffffff
[5] s_l2bsize:          8               [18] s_logserial:       0xffffffff
[6] s_l2bfactor:        0               [19] s_logpxd.len:      16777215
[7] s_pbsize:           85              [20] s_logpxd.addr1:    0xff
[8] s_l2pbsize:         4               [21] s_logpxd.addr2:    0xffffffff
[9] pad:                Not Displayed        s_logpxd.address:  1099511627775
[10] s_agsize:          0xffffff05      [22] s_fsckpxd.len:     16777215
[11] s_flag:            0xffffffff      [23] s_fsckpxd.addr1:   0xff
                JFS_OS2 JFS_LINUX       [24] s_fsckpxd.addr2:   0xffffffff
        JFS_COMMIT      JFS_GROUPCOMMIT      s_fsckpxd.address: 1099511627775
        JFS_LAZYCOMMIT  JFS_INLINELOG   [25] s_time.tv_sec:     0xffffffff
        JFS_BAD_SAIT    JFS_SPARSE      [26] s_time.tv_nsec:    0xffffffff
        DASD_ENABLED    DASD_PRIME      [27] s_fpack:           '¦¦¦¦¦¦¦¦¦¦¦'
[12] s_state:           0xffffffff
        Unknown State
[13] s_compress:        -1
[14] s_ait2.len:        16777215

display_super: [m]odify or e[x]it: x

superblocks.bmp
Question by:davepusey

Expert Comment

by:arnold
ID: 33448175
If you can identify the block where the superblock is, you can run
fsck.jfs -b <block_where_you_found_the_superblock> /dev/md3

Did you check whether the RAID 5 device is in a good state?
more /proc/mdstat
To see where the other possible copies of the superblock might be found, run mkfs in test mode.
Check first that the -n option is actually available to you; without a working test mode, mkfs would create a new filesystem over the old one.
mkfs.jfs  -n /dev/md3

The output will list all the locations where a copy of the superblock could be found.

Author Comment

by:davepusey
ID: 33448720
Yes, the RAID array is good.

Neither of those fsck.jfs options worked.

Any idea why jfs_debugfs gives "Aggregate Block Size: 256", which is clearly wrong? If you run it against a new JFS partition it gives the correct value of 4096.

Could this be the problem? Are the jfsutils using an incorrect block size of 256?
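
One plausible reading of the 256 is that the tool simply prints whatever bytes sit where it expects the superblock, so a wrong block size points to the superblock not being there rather than to a bug in jfsutils. The sketch below illustrates that idea. It assumes the first four fields are laid out little-endian in the order jfs_debugfs lists them (s_magic, s_version, s_size, s_bsize); the helper names and the zero-byte stand-in for the unreadable magic are hypothetical.

```python
import struct

# Assumed on-disk layout of the first four superblock fields (little-endian):
# s_magic[4], s_version (u32), s_size (u64), s_bsize (u32)
HEADER = struct.Struct("<4sIQI")

def decode_header(raw):
    """Decode the leading fields of a candidate superblock."""
    magic, version, size, bsize = HEADER.unpack_from(raw)
    return {"s_magic": magic, "s_version": version,
            "s_size": size, "s_bsize": bsize}

def looks_like_jfs(hdr):
    """Accept only the 'JFS1' magic with a power-of-two block size in 512..4096."""
    bsize = hdr["s_bsize"]
    power_of_two = bsize > 0 and bsize & (bsize - 1) == 0
    return hdr["s_magic"] == b"JFS1" and power_of_two and 512 <= bsize <= 4096

# Stand-in for what was read from /dev/md3: no magic, s_bsize of 256.
garbage = HEADER.pack(b"\x00\x00\x00\x00", 0, 0x8000, 256)
# Stand-in for a healthy superblock: 'JFS1' magic, block size 4096.
healthy = HEADER.pack(b"JFS1", 1, 0, 4096)

for label, raw in (("garbage", garbage), ("healthy", healthy)):
    hdr = decode_header(raw)
    print(label, hdr["s_bsize"], looks_like_jfs(hdr))
```

The block size by itself proves little; the missing magic is the stronger hint that something other than a superblock is being read.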
$ cat /proc/mdstat | grep -A 2 md3

md3 : active raid5 sdf1[4] sde1[3] sdd1[2] sdc3[1] sdb3[0]
      1894834176 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

$ sudo fsck.jfs -b 24 /dev/md3

fsck.jfs version 1.1.12, 24-Aug-2007
processing started: 8/16/2010 20.12.47
fsck.jfs: invalid option -- 'b'

$ sudo fsck.jfs -n /dev/md3

fsck.jfs version 1.1.12, 24-Aug-2007
processing started: 8/16/2010 20.13.15
The current device is:  /dev/md3

The superblock does not describe a correct jfs file system.

If device /dev/md3 is valid and contains a jfs file system,
then both the primary and secondary superblocks are corrupt
and cannot be repaired, and fsck cannot continue.

Otherwise, make sure the entered device /dev/md3 is correct.


Author Comment

by:davepusey
ID: 33448780
Hmm this is interesting...

Newly created jfs filesystem on a spare disk... superblocks start at 0x8000 and 0xF000

On /dev/md3... superblocks are at 0x18000 and 0x1F000

Where has that extra 1 come from?

Author Comment

by:davepusey
ID: 33448896
Just going to do a read-only run of fsck.jfs via /dev/loop0 by using...

sudo losetup -r -o 65536 /dev/loop0 /dev/md3

I've got jfs_debugfs to read the superblocks this way.
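
The 65536 passed to losetup -o is just the difference between where the superblocks were found and where a fresh filesystem puts them. A small sketch of that arithmetic, using the offsets quoted in the previous comment (variable names are made up for illustration):

```python
BLOCK_SIZE = 4096

expected = {"primary": 0x8000, "secondary": 0xF000}    # fresh JFS on the spare disk
found = {"primary": 0x18000, "secondary": 0x1F000}     # what lde shows on /dev/md3

# Shift of each superblock relative to its expected position, in bytes.
shifts = {name: found[name] - expected[name] for name in expected}

for name in expected:
    print(name,
          "expected block", expected[name] // BLOCK_SIZE,
          "found block", found[name] // BLOCK_SIZE,
          "shift", shifts[name], "bytes")
```

Both copies are shifted by the same 0x10000 bytes (64 KiB), which is where the "extra 1" in the hex offsets comes from.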

Author Comment

by:davepusey
ID: 33448948
Better but still not working...

$ sudo jfs_debugfs /dev/loop0

jfs_debugfs version 1.1.12, 24-Aug-2007

Aggregate Block Size: 4096

> su p
[1] s_magic:            'JFS1'          [15] s_ait2.addr1:      0x00
[2] s_version:          1               [16] s_ait2.addr2:      0x000043a6
[3] s_size:     0x00000000e1defaa0           s_ait2.address:    17318
[4] s_bsize:            4096            [17] s_logdev:          0x00000903
[5] s_l2bsize:          12              [18] s_logserial:       0x00000144
[6] s_l2bfactor:        3               [19] s_logpxd.len:      8192
[7] s_pbsize:           512             [20] s_logpxd.addr1:    0x00
[8] s_l2pbsize:         9               [21] s_logpxd.addr2:    0x1c3c1800
[9] pad:                Not Displayed        s_logpxd.address:  473700352
[10] s_agsize:          0x00400000      [22] s_fsckpxd.len:     14508
[11] s_flag:            0x10200900      [23] s_fsckpxd.addr1:   0x00
                        JFS_LINUX       [24] s_fsckpxd.addr2:   0x1c3bdf54
        JFS_COMMIT      JFS_GROUPCOMMIT      s_fsckpxd.address: 473685844
                        JFS_INLINELOG   [25] s_time.tv_sec:     0x49b434fb
                                        [26] s_time.tv_nsec:    0x00000000
                                        [27] s_fpack:           ''
[12] s_state:           0x00000001
             FM_MOUNT
[13] s_compress:        0
[14] s_ait2.len:        4

display_super: [m]odify or e[x]it: x
> su s
[1] s_magic:            'JFS1'          [15] s_ait2.addr1:      0x00
[2] s_version:          1               [16] s_ait2.addr2:      0x000043a6
[3] s_size:     0x00000000e1defaa0           s_ait2.address:    17318
[4] s_bsize:            4096            [17] s_logdev:          0x00000903
[5] s_l2bsize:          12              [18] s_logserial:       0x00000143
[6] s_l2bfactor:        3               [19] s_logpxd.len:      8192
[7] s_pbsize:           512             [20] s_logpxd.addr1:    0x00
[8] s_l2pbsize:         9               [21] s_logpxd.addr2:    0x1c3c1800
[9] pad:                Not Displayed        s_logpxd.address:  473700352
[10] s_agsize:          0x00400000      [22] s_fsckpxd.len:     14508
[11] s_flag:            0x10200900      [23] s_fsckpxd.addr1:   0x00
                        JFS_LINUX       [24] s_fsckpxd.addr2:   0x1c3bdf54
        JFS_COMMIT      JFS_GROUPCOMMIT      s_fsckpxd.address: 473685844
                        JFS_INLINELOG   [25] s_time.tv_sec:     0x49b434fb
                                        [26] s_time.tv_nsec:    0x00000000
                                        [27] s_fpack:           ''
[12] s_state:           0x00000000
             FM_CLEAN
[13] s_compress:        0
[14] s_ait2.len:        4

display_super: [m]odify or e[x]it: x
> quit

$ sudo fsck.jfs -n /dev/loop0

fsck.jfs version 1.1.12, 24-Aug-2007
processing started: 8/16/2010 20.36.23
The current device is:  /dev/loop0
Superblock is corrupt and cannot be repaired
since both primary and secondary copies are corrupt.

 CANNOT CONTINUE.


Author Comment

by:davepusey
ID: 33448995
The offset is the same as the chunk size of the array.

I've just finished rebuilding the array. Could one of the disks be out of sequence? How would I check this?
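
One way to reason about this is to work out which member slot each array offset lands on. The sketch below assumes "algorithm 2" in /proc/mdstat is md's left-symmetric RAID 5 layout, and uses the 64k chunk and 5 members reported there. The function name locate is hypothetical, and slots are counted from 0.

```python
CHUNK = 64 * 1024   # "64k chunk" from /proc/mdstat
DISKS = 5           # [5/5] members

def locate(offset, chunk=CHUNK, disks=DISKS):
    """Map an array byte offset to (stripe, parity_slot, data_slot)
    for a left-symmetric RAID 5 layout."""
    data_disks = disks - 1
    chunk_no = offset // chunk
    stripe, idx = divmod(chunk_no, data_disks)
    parity_slot = data_disks - (stripe % disks)
    data_slot = (parity_slot + 1 + idx) % disks
    return stripe, parity_slot, data_slot

# Where the primary superblock should be, and where it currently shows up.
print("expected 0x8000  ->", locate(0x8000))
print("found    0x18000 ->", locate(0x18000))
```

Under these assumptions the superblock belongs in the first chunk of slot 0 but is being read from slot 1, which is what you would expect if the disk that really holds the start of the filesystem is sitting second in the current order.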

Expert Comment

by:arnold
ID: 33449077
Do you use an external journaling device for the JFS filesystem?

You could pass -j journaling_device -f -p /dev/md3 to fsck.jfs and let it try to correct the issue.

http://linux.die.net/man/8/fsck.jfs

If you have a backup of the data, you are in good shape.
You could try and replace the existing superblock with an alternate created to see if you can salvage something:
http://www.opensubscriber.com/message/jfs-discussion@www-124.ibm.com/549300.html

Author Comment

by:davepusey
ID: 33449130
No external journaling device.

Good idea with the alternate, but that doesn't explain why the two existing superblocks have moved position by 64k.

Expert Comment

by:arnold
ID: 33449148
The thread referenced in my prior comment has earlier comments that use LVM as an overlay on top of the software RAID.
http://www.gagme.com/greg/linux/raid-lvm.php

Author Comment

by:davepusey
ID: 33449287
Not using LVM.

Expert Comment

by:arnold
ID: 33449673
I know. I was thinking that if you plan on changing the setup, you might consider using LVM as an overlay on the RAID device built from the raw devices.

Did you set up the filesystem originally? Maybe an offset was used to set it up?


Author Comment

by:davepusey
ID: 33449685
>> Did you set up the filesystem originally?

Yes

>> Maybe an offset was used to set it up?

Not that I'm aware of.

Author Comment

by:davepusey
ID: 33459787
My current working theory is that during the rebuild of the array, when I had to do a --create because the mdadm superblocks were screwed up, I may have got the disk sequence wrong, so it is reading the chunks in the wrong order.

This won't have affected the parity calculations, so the data is intact; just the chunks are out of sequence.

As we already know which disk has the JFS superblocks (the second one in the current order), this is obviously the correct first disk. The other 4 will have to be done by trial and error. Fortunately there are only 24 permutations (4! = 24).
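
The 24 candidate orders can be listed mechanically. A minimal sketch, assuming the member names from the /proc/mdstat output above, with /dev/sdc3 (currently second) fixed in first place:

```python
from itertools import permutations

first = "/dev/sdc3"                                           # holds the JFS superblocks
others = ["/dev/sdb3", "/dev/sdd1", "/dev/sde1", "/dev/sdf1"]  # order still unknown

# Every candidate member order: sdc3 first, then each arrangement of the rest.
orders = [(first, *rest) for rest in permutations(others)]

for n, order in enumerate(orders, 1):
    print(n, " ".join(order))
```

Each printed line is one device order to try with --create, followed by a read-only check such as fsck.jfs -n before anything is mounted or written.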

Expert Comment

by:arnold
ID: 33460393
What do you mean you did --create? Create is for a new array.
You needed to use -A (--assemble) to group previously configured drives into an array.
http://linux.die.net/man/8/mdadm

I think that is the issue: the --create grouped the drives into the array, but there is no filesystem on it. The superblock references that you found might be the ones from the old array, but there is no guarantee that you can reassemble the hard drives into the correct order while maintaining the data.

Good luck.

Accepted Solution

by:davepusey
ID: 33463312
mdadm --create will recreate the md superblocks from scratch, but is smart enough to detect components of an existing array.

I had to do this because the md superblocks became screwed up, with many faulty/spare entries that couldn't be removed any other way.

When I did --create I assumed that mdadm would detect the correct drive order. It didn't, and used the order I gave (alphabetical).

I've now rerun --create with /dev/sdc3 (the second disk) first. The JFS superblocks are now in the correct position and fsck.jfs -n will now run.

I'm still not certain of the order of the other four disks, but with only 24 permutations it won't take too long to figure it out.

Author Comment

by:davepusey
ID: 33463451
Woooooohooooooooooo!

Just ran fsck.jfs -n and it reported clean with no errors.

Mounted read-only and all files appear to be intact. Played an 11GB mpeg2 video and no corruption at all.