valleytech
EXT3 maximal mount count reached (CentOS)
Hi
I have one CentOS server (2.6 kernel) running the cPanel web-hosting software.
It has been running well for the last couple of months. Recently, however, I have been getting reports that a filesystem suddenly locks down and becomes read-only. As a result, file uploads, website session handling, etc. stop working until I reboot the server, after which the filesystem is normal again.
Eventually, though, it goes back into read-only mode.
Is there any way to fix this?
I forced the filesystem to check itself at boot and also ran fsck remotely. Though the result came back clean, it still returns to read-only eventually.
Any insight would be great!
I checked /var/log/dmesg and found this:
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-0, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sda1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on sdc1, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 2244609
EXT3-fs: sdc1: 1 orphan inode deleted
---------------------
mount table
root@cpanel [/var/log]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw,usrquota)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sdc1 on /home type ext3 (rw,usrquota)
/dev/sdb1 on /home2 type ext3 (rw,usrquota)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/usr/tmpDSK on /tmp type ext3 (rw,noexec,nosuid,loop=/dev/loop0)
/tmp on /var/tmp type none (rw,noexec,nosuid,bind)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
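A note on the mount table above: every entry shows rw, which does not rule out a read-only remount. On systems of this era the mount command reads /etc/mtab, and that file may not be updated when the kernel remounts a filesystem read-only after an error; /proc/mounts shows the kernel's own view. A minimal sketch for listing mount points currently flagged ro (the function name ro_mounts is made up for illustration):

```shell
#!/bin/sh
# Print the mount point of every entry whose option list contains "ro".
# Input is a file in /proc/mounts format:
#   device mountpoint fstype options dump pass
# Defaults to /proc/mounts when no file is given.
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

Calling ro_mounts with no argument while the problem is occurring should show which filesystem the kernel actually switched to read-only.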
ASKER CERTIFIED SOLUTION
From the bottom of your log:
EXT3 FS on sdc1, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 2244609
EXT3-fs: sdc1: 1 orphan inode deleted
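The "maximal mount count reached" warning in the same log is only a reminder that the filesystem has been mounted more times than its configured limit without a check. The counters can be read with tune2fs -l. A small sketch that pulls the two relevant fields out of that output (mount_counts is a hypothetical helper name, and the field labels are assumed to match what tune2fs prints on this system):

```shell
#!/bin/sh
# Read `tune2fs -l <device>` output on stdin and print
# "<current mount count>/<maximum mount count>".
mount_counts() {
    awk -F': *' '
        $1 == "Mount count"         { cur = $2 }
        $1 == "Maximum mount count" { max = $2 }
        END { print cur "/" max }
    '
}

# Usage (as root):
#   tune2fs -l /dev/sdc1 | mount_counts
```

If the first number is at or above the second, the warning is expected, and a forced e2fsck on the unmounted device resets the counter.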
ASKER
Ah! I totally didn't see that part; I was focusing on /dev/sda1.
I'm running fsck /dev/sdc1 -y
Along the way, I got a list of messages like:
Multiply-claimed block(s) in inode 459888: 943551 943552 943553 943554 943555 943556 943557
Illegal block number passed to ext2fs_test_block_bitmap #33554432 for multiply claimed block map
It has been sitting there for 15 minutes already.
Does that sound like a bad sign? A bad disk in this case?
Many thanks!
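One caution on the fsck run above: repairing an ext3 filesystem while it is still mounted can cause further damage, so it is worth confirming the device is unmounted first. A rough sketch, assuming /proc/mounts is available (is_mounted and safe_fsck are illustrative names, not existing tools):

```shell
#!/bin/sh
# Succeed if device $1 appears as a mount source in the mount table $2
# (default: /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Refuse to check a mounted device; otherwise force a full check and
# answer "yes" to every repair prompt.
safe_fsck() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it first (umount $1)" >&2
        return 1
    fi
    e2fsck -f -y "$1"
}

# Usage (as root):
#   safe_fsck /dev/sdc1
```

The -y flag accepts every proposed fix, so on a disk that may be failing it is safer to copy the data off before running it.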
Could be on its way out... possibly one leg out the door, if you know what I mean...
If you haven't got a backup, you should get one ASAP.
If you have smartctl installed, run the smartctl command I posted above.
I just did a search on the error and found this: http://www.linuxquestions.org/questions/linux-newbie-8/filesystem-errors-617286/
ASKER
Agree!
Luckily I have an NFS partition to copy stuff over to.
I also ran smartctl -t long /dev/sdc
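For the copy to the NFS share, something along these lines might work. This is only a sketch: backup_dir is a made-up helper, and the destination path in the usage line is a placeholder for wherever the share is mounted.

```shell
#!/bin/sh
# Copy the contents of directory $1 into directory $2 with rsync,
# preserving permissions, ownership, timestamps and hard links.
# Returns 2 without copying if either directory is missing.
backup_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || {
        echo "source $src is not a directory" >&2
        return 2
    }
    [ -d "$dest" ] || {
        echo "destination $dest is not a directory (is the NFS share mounted?)" >&2
        return 2
    }
    rsync -aH --numeric-ids "$src"/ "$dest"/
}

# Usage (placeholder destination):
#   backup_dir /home /mnt/nfs_backup/home
```

Checking the destination first avoids silently filling the local root filesystem if the share turns out not to be mounted.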
root@cpanel [~]# smartctl -a /dev/sdc
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: IBM IC35L073UCDY10-0 Version: S27F
Serial number: E6VSWWBC
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Mon Nov 16 15:46:28 2009 PST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 35 C
Drive Trip Temperature: 85 C
Manufactured in week 39 of year 2003
Recommended maximum start stop count: 10000 times
Current start stop count: 199 times
Elements in grown defect list: 0
Error counter log:
          Errors Corrected by          Total   Correction     Gigabytes    Total
              ECC         rereads/    errors   algorithm      processed    uncorrected
          fast | delayed  rewrites  corrected  invocations   [10^9 bytes]  errors
read:        0        0         0         0          0      17608.736           0
write:       0        0         0         9          9      12313.029           0
verify:      0        0         0         0          0          0.002           0
Non-medium error count: 0
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -    41899                 - [-   -    -]
# 2  Background long   Completed                   -        0                 - [-   -    -]
# 3  Background short  Completed                   -        0                 - [-   -    -]
Long (extended) Self Test duration: 4700 seconds [78.3 minutes]
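For a SCSI disk like this one, the two lines in the output above that are quickest to watch are the health status and the grown defect list. A small sketch that extracts them from smartctl -a output (smart_summary is a hypothetical name, and the labels are taken from the output pasted above, so other drives or smartctl versions may word them differently):

```shell
#!/bin/sh
# Read `smartctl -a <device>` output for a SCSI disk on stdin and print
# a one-line summary of the health status and grown defect count.
smart_summary() {
    awk -F': *' '
        $1 == "SMART Health Status"           { health = $2 }
        $1 == "Elements in grown defect list" { defects = $2 }
        END { printf "health=%s grown_defects=%s\n", health, defects }
    '
}

# Usage (as root):
#   smartctl -a /dev/sdc | smart_summary
```

A clean summary does not prove the disk is healthy; filesystem corruption like that reported by fsck can appear before SMART flags anything.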
OK, I was hoping to see a bit more information (see the example below), but I guess not all disks report the same info.
Anyway, I guess you're now looking at the right disk. Once you've copied all the data off, you could delete the partition table and re-partition/reformat the drive to see if that clears the problem. If it were me, I'd be ordering a replacement.
SMART Attributes Data Structure revision number: 32
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
3 Spin_Up_Time 0x0027 192 191 063 Pre-fail Always - 12358
4 Start_Stop_Count 0x0032 253 253 000 Old_age Always - 51
5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0
8 Seek_Time_Performance 0x0027 245 219 187 Pre-fail Always - 46238
9 Power_On_Hours 0x0032 178 178 000 Old_age Always - 26238
10 Spin_Retry_Count 0x002b 253 252 157 Pre-fail Always - 0
11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 253 253 000 Old_age Always - 82
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 052 044 000 Old_age Always - 48 (Lifetime Min/Max 21/49)
192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0
194 Temperature_Celsius 0x0032 043 253 000 Old_age Always - 48
195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 2936
196 Reallocated_Event_Count 0x0008 253 253 000 Old_age Offline - 0
197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline - 0
198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age Offline - 0
200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 1
202 TA_Increase_Count 0x000a 253 252 000 Old_age Always - 0
203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail Always - 0
204 Shock_Count_Write_Opern 0x000a 253 252 000 Old_age Always - 0
205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age Always - 0
207 Spin_High_Current 0x002a 253 252 000 Old_age Always - 0
208 Spin_Buzz 0x002a 253 252 000 Old_age Always - 0
210 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
211 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
212 Unknown_Attribute 0x0032 253 252 000 Old_age Always - 0
ASKER
What command are you using to get that information?
smartctl -t long?
Thanks!
In your case: smartctl -a /dev/sdc
At least you can see your disk is 6 years old... :-)
I'm afraid I have to go to my pit (it's a bit late). I hope you manage to back up the partition. I'll check back again later, but I'm sure someone else will pick up the Q while I'm offline.
ASKER
many thanks!!!
ASKER
I ended up replacing the disk ;)
Thanks again!
ASKER
I thought /dev/sda1 was the problematic one.