ryan80

EXT3 file system acting read only, listed as read write

Hello,

I have a server with a Fibre Channel mount that is having issues. When I came in this morning I found that it had switched to read-only. Looking through /var/log/messages, I see errors reported on the disk. I know that I will need to run fsck on it, but before those errors I see lines about the device mapper failing a path for mpath0, which is the device that is mounted.

I am wondering if someone can take a look and see whether this is just a drive issue or an issue with the Fibre Channel connection.



Mar 28 00:12:36 server kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Mar 28 00:12:36 server kernel: end_request: I/O error, dev sdb, sector 314570452
Mar 28 00:12:36 server kernel: device-mapper: multipath: Failing path 8:16.
Mar 28 00:12:36 server multipathd: 8:16: mark as failed
Mar 28 00:12:36 server multipathd: mpath0: remaining active paths: 0
Mar 28 00:12:36 server kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Mar 28 00:12:36 server kernel: end_request: I/O error, dev sdb, sector 314571476
Mar 28 00:12:36 server kernel: __ratelimit: 5060 callbacks suppressed
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 2687
Mar 28 00:12:36 server kernel: lost page write due to I/O error on dm-4
Mar 28 00:12:36 server kernel: Aborting journal on device dm-4.
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1545
Mar 28 00:12:36 server kernel: lost page write due to I/O error on dm-4
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4) in ext3_reserve_inode_write: Journal has aborted
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 0
Mar 28 00:12:36 server kernel: lost page write due to I/O error on dm-4
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4) in ext3_dirty_inode: Journal has aborted
Mar 28 00:12:36 server kernel: ------------[ cut here ]------------
Mar 28 00:12:36 server kernel: WARNING: at fs/buffer.c:1186 mark_buffer_dirty+0x2f/0x87() (Not tainted)
Mar 28 00:12:36 server kernel: Hardware name: IBM eServer BladeCenter HS20 -[884345Y]-
Mar 28 00:12:36 server kernel: Modules linked in: nls_utf8 cifs autofs4 nfsd lockd nfs_acl auth_rpcgss exportfs bridge stp bnep rfcomm l2cap bluetooth sunrpc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables dm_round_robin loop dm_multipath scsi_dh radeon drm ipv6 qla2xxx scsi_transport_fc pcspkr scsi_tgt sg tg3 ata_piix pata_acpi ata_generic libphy i2c_i801 i2c_core libata iTCO_wdt iTCO_vendor_support i6300esb e752x_edac edac_core dm_snapshot dm_zero dm_mirror dm_log dm_mod shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Mar 28 00:12:36 server kernel: Pid: 26205, comm: rsync Not tainted 2.6.27.25-78.2.56.fc9.x86_64 #1
Mar 28 00:12:36 server kernel:
Mar 28 00:12:36 server kernel: Call Trace:
Mar 28 00:12:36 server kernel: [<ffffffff8103ff10>] warn_on_slowpath+0x80/0xae
Mar 28 00:12:36 server kernel: [<ffffffff81010a37>] ? restore_args+0x0/0x30
Mar 28 00:12:36 server kernel: [<ffffffff81031103>] ? need_resched+0x1e/0x28
Mar 28 00:12:36 server kernel: [<ffffffff81031103>] ? need_resched+0x1e/0x28
Mar 28 00:12:36 server kernel: [<ffffffff810e0e9f>] mark_buffer_dirty+0x2f/0x87
Mar 28 00:12:36 server kernel: [<ffffffffa003d609>] ext3_commit_super+0x50/0x66 [ext3]
Mar 28 00:12:36 server kernel: [<ffffffffa003ee46>] ext3_handle_error+0x86/0xad [ext3]
Mar 28 00:12:36 server kernel: [<ffffffffa003eecd>] __ext3_std_error+0x60/0x68 [ext3]
Mar 28 00:12:36 server kernel: [<ffffffffa004055e>] __ext3_journal_stop+0x3f/0x4a [ext3]
Mar 28 00:12:36 server kernel: [<ffffffffa0036573>] ext3_dirty_inode+0x7b/0x83 [ext3]
Mar 28 00:12:36 server kernel: [<ffffffff810dc8f7>] __mark_inode_dirty+0x33/0x190
Mar 28 00:12:36 server kernel: [<ffffffff810caa56>] ? filldir+0x0/0xc5
Mar 28 00:12:36 server kernel: [<ffffffff810d1536>] touch_atime+0x112/0x11d
Mar 28 00:12:36 server kernel: [<ffffffff810cac56>] vfs_readdir+0x92/0xaf
Mar 28 00:12:36 server kernel: [<ffffffff810cadb2>] sys_getdents+0x7d/0xc4
Mar 28 00:12:36 server kernel: [<ffffffff8101027a>] system_call_fastpath+0x16/0x1b
Mar 28 00:12:36 server kernel:
Mar 28 00:12:36 server kernel: ---[ end trace 3308e6b8921baaf4 ]---
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 0
Mar 28 00:12:36 server kernel: lost page write due to I/O error on dm-4
Mar 28 00:12:36 server kernel: ext3_abort called.
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_journal_start_sb: Detected aborted journal
Mar 28 00:12:36 server kernel: Remounting filesystem read-only
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137455, block=16547872
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137454, block=16547872
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137496, block=16547875
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137653, block=16547885
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137632, block=16547883
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4137453, block=16547872
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5570592, block=22282243
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3196159, block=12779601
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3186971, block=12746771
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5005644, block=20021270
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3236933, block=12943430
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3432672, block=13729807
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4939782, block=19759106
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3416096, block=13664259
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704112, block=14811220
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704303, block=14811232
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704290, block=14811232
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704294, block=14811232
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5595680, block=22380579
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3776578, block=15106054
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5751322, block=23003171
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3342475, block=13369354
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5529606, block=22118402
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3260819, block=13041691
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4947991, block=19791875
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3276997, block=13107214
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3711379, block=14843931
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3212521, block=12845136
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3188793, block=12746885
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5014137, block=20054057
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5529605, block=22118402
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4940651, block=19759160
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3211302, block=12845060
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3711261, block=14843923
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3711226, block=14843921
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3711217, block=14843921
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3188794, block=12746885
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3188800, block=12746885
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3188804, block=12746886
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3180210, block=12714093
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3186972, block=12746771
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3188799, block=12746885
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3342831, block=13369376
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704798, block=14811263
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704785, block=14811263
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3704789, block=14811263
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3407982, block=13631496
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5570828, block=22282258
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3776586, block=15106054
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5685253, block=22740994
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3359022, block=13434900
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3236590, block=12943408
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5678935, block=22708343
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=5071545, block=20283437
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3236018, block=12943373
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3318284, block=13271074
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3254336, block=13009029
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3350690, block=13402124
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=4981491, block=19922993
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3276948, block=13107211
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3277578, block=13107250
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3702864, block=14811142
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3778333, block=15106163
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_get_inode_loc: unable to read inode block - inode=3778252, block=15106158
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1540
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4): ext3_readdir: directory #11 contains a hole at offset 0
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1541
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1542
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1543
Mar 28 00:12:36 server kernel: Buffer I/O error on device dm-4, logical block 1544
Mar 28 00:12:45 server kernel: EXT3-fs error (device dm-4): ext3_readdir: directory #11 contains a hole at offset 0
Mar 28 00:12:45 server multipathd: sdb: readsector0 checker reports path is up
Mar 28 00:12:45 server multipathd: 8:16: reinstated
Mar 28 00:12:45 server multipathd: mpath0: remaining active paths: 1
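
For anyone reading a log like this, it can help to count the transport-level messages separately from the filesystem ones. Below is a minimal sketch, not a definitive tool: `summarize_log` is a hypothetical helper name, and the patterns are copied from the messages quoted above, so other kernels or drivers may word them differently.

```shell
#!/bin/sh
# Count transport-level events versus filesystem-level errors in a syslog
# excerpt read from stdin. Patterns come from the log quoted in this thread.
summarize_log() {
    awk '
        /hostbyte=DID_/               { scsi++ }    # SCSI host/transport result
        /multipath: Failing path/     { failed++ }  # device-mapper failed a path
        /remaining active paths: 0$/  { zero++ }    # map left with no usable path
        /: reinstated$/               { back++ }    # multipathd brought a path back
        /EXT3-fs error/               { ext3++ }    # filesystem-level error
        END {
            printf "scsi_transport_errors=%d\n", scsi
            printf "paths_failed=%d\n", failed
            printf "times_with_zero_paths=%d\n", zero
            printf "paths_reinstated=%d\n", back
            printf "ext3_errors=%d\n", ext3
        }'
}

# Example on a few lines copied from the log above.
summarize_log <<'EOF'
Mar 28 00:12:36 server kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Mar 28 00:12:36 server kernel: device-mapper: multipath: Failing path 8:16.
Mar 28 00:12:36 server multipathd: mpath0: remaining active paths: 0
Mar 28 00:12:36 server kernel: EXT3-fs error (device dm-4) in ext3_dirty_inode: Journal has aborted
Mar 28 00:12:45 server multipathd: 8:16: reinstated
EOF
```

On the server itself this could be fed with `summarize_log < /var/log/messages`. In the excerpt above, the DID_BUS_BUSY result and the path failure come before the first EXT3 error, and the path is reinstated nine seconds later. That ordering is at least consistent with an interruption on the path rather than purely on-disk damage, although it does not prove it.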


ryan80

ASKER

#uname -a

Linux server 2.6.27.25-78.2.56.fc9.x86_64 #1 SMP Thu Jun 18 12:24:37 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
ASKER CERTIFIED SOLUTION
David
ryan80

ASKER

It is pretty unlikely that the SAN was reconfigured, so I am checking whether there was a failed drive. A cron job that uses rsync to copy files to a backup location is scheduled to run right at that time. I am guessing that it is triggering access to the failed sectors and causing this issue.

Does this sound reasonable?
ryan80

ASKER

I checked and there was no SAN reconfiguration. Also, no drives are marked as failed. Can this be caused just by corruption of the file system, so that it can't get the requested files? Or is this 100% hardware related, so that the connections for the SAN and HBA should be checked?
Could you please show:

# multipath -ll
ryan80

ASKER

multipath -ll
mpath0 (3600a0b80001f6dda00000ac84a819006) dm-2 IBM,1722-600
[size=300G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 4:0:0:0 sdb 8:16  [active][ready]
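
The output above lists a single path (4:0:0:0, sdb) under mpath0, which matches the "remaining active paths: 0" line in the log once 8:16 was failed. A rough way to count active paths from this output format is sketched below. `count_active_paths` is a hypothetical helper, and the pattern assumes the older `multipath -ll` layout shown here; newer versions print a different layout.

```shell
#!/bin/sh
# Count path lines marked [active] in `multipath -ll` output read from stdin.
# A path line carries an H:C:T:L address followed by an sd device name; the
# path-group line (round-robin ...) also says [active] but has no address.
count_active_paths() {
    awk '/[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ / && /\[active\]/ { n++ }
         END { print n + 0 }'
}

# Example using the output posted above.
count_active_paths <<'EOF'
mpath0 (3600a0b80001f6dda00000ac84a819006) dm-2 IBM,1722-600
[size=300G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 4:0:0:0 sdb 8:16  [active][ready]
EOF
```

On a live system the same function could read `multipath -ll` through a pipe. As far as I can tell, with one path and no queueing feature listed (features=0) there is nothing for multipath to fail over to, so an interruption on that path reaches the filesystem as I/O errors.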
The beginning of the dump confirms some unreadable blocks, and these will not cause an HDD failure. However, these bad blocks could very well contain important inode info, so the file system must be fsck'd. If the damage is bad enough, there will be a point where the kernel gives up.

I would unmount the volume and fsck it to clean up the errors.
ryan80

ASKER

Are there any particular options that I should use when running fsck?
Could you please show:

# df -Th

# cat /etc/fstab

# fdisk -l

# pvscan
ryan80

ASKER

Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
              ext3     66G   24G   39G  38% /
/dev/sda1     ext3    190M   17M  164M  10% /boot
tmpfs        tmpfs    1.5G   52K  1.5G   1% /dev/shm
/dev/mapper/mpath0p1
              ext3    148G   47G   94G  34% /Development
/dev/mapper/mpath0p2
              ext3    148G   31G  110G  22% /Production




/dev/VolGroup00/LogVol00 /                       ext3    defaults        1 1
UUID=b45b2fbd-ae8f-4831-b4a3-43599d4a1839 /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/VolGroup00/LogVol01 swap                    swap    defaults        0 0
#/dev/mapper/mpath0p1 /Development      ext3    defaults        1 2
#/dev/mapper/mpath0p2 /Production       ext3    defaults        1 2
(the fibre channel drives are listed in ldap, but this represents what they are)





Disk /dev/sda: 73.4 GB, 73406611456 bytes
255 heads, 63 sectors/track, 8924 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000021

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   83  Linux
/dev/sda2              26        8924    71481217+  8e  Linux LVM

Disk /dev/dm-0: 71.0 GB, 71068286976 bytes
255 heads, 63 sectors/track, 8640 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-0 doesn't contain a valid partition table

Disk /dev/dm-1: 2080 MB, 2080374784 bytes
255 heads, 63 sectors/track, 252 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x30307800

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/sdb: 322.1 GB, 322122547200 bytes
255 heads, 63 sectors/track, 39162 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000d5dd2

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       19580   157276318+  83  Linux
/dev/sdb2           19581       39162   157292415   83  Linux

Disk /dev/dm-2: 322.1 GB, 322122547200 bytes
255 heads, 63 sectors/track, 39162 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000d5dd2

     Device Boot      Start         End      Blocks   Id  System
/dev/dm-2p1               1       19580   157276318+  83  Linux
/dev/dm-2p2           19581       39162   157292415   83  Linux

Disk /dev/dm-3: 161.0 GB, 161050950144 bytes
255 heads, 63 sectors/track, 19579 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-3 doesn't contain a valid partition table

Disk /dev/dm-4: 161.0 GB, 161067432960 bytes
255 heads, 63 sectors/track, 19582 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-4 doesn't contain a valid partition table
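
The errors in the log are all against dm-4, and the sizes in the fdisk output above suggest which partition that is. fdisk reports partition sizes in 1 KiB blocks, with a trailing "+" for an odd extra sector, so the block counts for sdb1 and sdb2 can be converted to bytes and compared with the dm-3 and dm-4 sizes. This is only a sketch of the arithmetic; `blocks_to_bytes` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert an fdisk "Blocks" value (1 KiB units) to bytes.
# A trailing "+" means one extra 512-byte sector beyond the whole blocks.
blocks_to_bytes() {
    blocks=${1%+}                      # strip the "+" marker if present
    bytes=$((blocks * 1024))
    case $1 in
        *+) bytes=$((bytes + 512)) ;;  # add the odd half block
    esac
    echo "$bytes"
}

blocks_to_bytes '157276318+'   # sdb1 -> 161050950144, the size shown for dm-3
blocks_to_bytes '157292415'    # sdb2 -> 161067432960, the size shown for dm-4
```

If that reading is right, dm-4 is the second partition, which would be /dev/mapper/mpath0p2 mounted on /Production in the df output. This is an inference from the sizes only; `dmsetup ls` prints each mapped name with its major and minor numbers and would be one way to confirm it.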

Show us the output of:

# pvscan
ryan80

ASKER

PV /dev/sda2   VG VolGroup00   lvm2 [68.16 GB / 32.00 MB free]
  Total: 1 [68.16 GB] / in use: 1 [68.16 GB] / in no VG: 0 [0   ]
SOLUTION
ryan80

ASKER

Is e2fsck the same as fsck.ext3?
Both commands check/fix a Linux ext2/ext3 file system.
ryan80

ASKER

ok, and the -f option will scan the entire drive then, even though it is marked as good?
-f forces a check even if the file system is marked clean.
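
A possible order of operations for the offline check is sketched below. It only prints the commands so the plan can be reviewed before anything is run. `plan_fsck` is a hypothetical helper, and the example arguments assume that dm-4 is /dev/mapper/mpath0p2 mounted on /Production, which should be confirmed on the server first.

```shell
#!/bin/sh
# Print, rather than run, the commands for an offline ext3 check.
# $1 = mount point, $2 = block device holding the file system.
plan_fsck() {
    mountpoint=$1
    device=$2
    # The file system must not be mounted while e2fsck repairs it.
    echo "umount $mountpoint"
    # -n opens the file system read-only and answers "no" to every question,
    # so this pass only reports problems; -f forces it even if marked clean.
    echo "e2fsck -fn $device"
    # Forced full check that asks before each repair.
    echo "e2fsck -f $device"
    # Remount by hand afterwards (assumption: the normal mount is handled
    # elsewhere, since the fstab entries above are commented out).
    echo "mount $device $mountpoint"
}

plan_fsck /Production /dev/mapper/mpath0p2
```

The read-only pass first is a cautious choice rather than a requirement: it shows how much damage e2fsck sees before it is allowed to change anything.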
ryan80

ASKER

Thanks,

I will run a full check of the device this weekend. I think this should fix it.