valleytech asked:

EXT3 maximal mount count reached (CentOS)

Hi,
I have one CentOS server (kernel 2.6) running cPanel web-hosting software.
It has been running pretty well for the last couple of months. Recently, however, I received reports that the filesystem suddenly locks down and becomes read-only. As a result, file uploads, website session control, etc. stop working. Once I reboot the server, the filesystem is normal again.

But eventually it just enters read-only mode again.
Is there any way to fix this?
I forced the filesystem to check itself at boot, and I also ran fsck remotely. Though the result turned out clean, it still goes back to read-only eventually.

Any insight would be great!



I checked /var/log/dmesg and found this:
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev sda1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on sdc1, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 2244609
EXT3-fs: sdc1: 1 orphan inode deleted
 
 
---------------------
mount table
 
root@cpanel [/var/log]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw,usrquota)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sdc1 on /home type ext3 (rw,usrquota)
/dev/sdb1 on /home2 type ext3 (rw,usrquota)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/usr/tmpDSK on /tmp type ext3 (rw,noexec,nosuid,loop=/dev/loop0)
/tmp on /var/tmp type none (rw,noexec,nosuid,bind)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)


ASKER CERTIFIED SOLUTION
Julian Parker
This solution is only available to members.
ASKER

Whoa... how can you tell it's /dev/sdc1 causing the problem?
I thought /dev/sda1 was the problematic one.

From the bottom of your log:
EXT3 FS on sdc1, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 2244609
EXT3-fs: sdc1: 1 orphan inode deleted
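
The reasoning here (find the device named in the orphan-cleanup messages, then look up where it is mounted) can be sketched in Python. This is only an illustration: the function names are made up for this example, the regular expressions assume the exact ext3 and mount message wording shown in the question, and the sample text is copied from the log and mount table posted above.

```python
import re

# Sample text copied from the dmesg and mount output in the question.
DMESG = """\
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on sdc1, internal journal
ext3_orphan_cleanup: deleting unreferenced inode 2244609
EXT3-fs: sdc1: 1 orphan inode deleted
"""

MOUNTS = """\
/dev/sda1 on /boot type ext3 (rw)
/dev/sdc1 on /home type ext3 (rw,usrquota)
/dev/sdb1 on /home2 type ext3 (rw,usrquota)
"""

# Assumed message format: "EXT3-fs: <dev>: <n> orphan inode(s) deleted".
ORPHAN_RE = re.compile(r"^EXT3-fs: (\w+): (\d+) orphan inodes? deleted", re.M)
# Assumed mount line format: "<device> on <mountpoint> type <fstype> (...)".
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+)", re.M)


def devices_with_orphans(dmesg_text):
    """Return {device: orphan inode count} for every cleanup summary line."""
    return {dev: int(n) for dev, n in ORPHAN_RE.findall(dmesg_text)}


def mount_points(mount_text):
    """Return {device basename: mount point}, e.g. {'sdc1': '/home'}."""
    return {dev.rsplit("/", 1)[-1]: mnt
            for dev, mnt, _fstype in MOUNT_RE.findall(mount_text)}


if __name__ == "__main__":
    mounts = mount_points(MOUNTS)
    for dev, count in devices_with_orphans(DMESG).items():
        print(f"{dev} ({mounts.get(dev, 'not mounted')}): {count} orphan inode(s)")
```

On the posted log this points at sdc1, which the mount table shows is /home, not at sda1 (/boot).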


Ah! I totally didn't see that part; I was focusing on /dev/sda1.
I'm running fsck /dev/sdc1 -y
Along the way, I received a list of messages like:
Multiply-claimed block(s) in inode 459888: 943551 943552 943553 943554 943555 943556 943557
Illegal block number passed to ext2fs_test_block_bitmap #33554432 for multiply claimed block map

It has been sitting there for 15 minutes already.
Does that sound like a bad sign? A bad disk in this case?

Many thanks!

It could be on its way out... possibly one leg out the door, if you know what I mean...

If you haven't got a backup, you should get one ASAP.

If you have smartctl installed, run the smartctl command I posted above.

I just did a search on the error and found this: http://www.linuxquestions.org/questions/linux-newbie-8/filesystem-errors-617286/

Agreed!
Luckily I have an NFS partition to copy stuff over to.
I also ran smartctl -t long /dev/sdc


root@cpanel [~]# smartctl -a /dev/sdc
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
 
Device: IBM      IC35L073UCDY10-0 Version: S27F
Serial number: E6VSWWBC
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Mon Nov 16 15:46:28 2009 PST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
 
Current Drive Temperature:     35 C
Drive Trip Temperature:        85 C
Manufactured in week 39 of year 2003
Recommended maximum start stop count:  10000 times
Current start stop count:      199 times
Elements in grown defect list: 0
 
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0      17608.736           0
write:         0        0         0         9          9      12313.029           0
verify:        0        0         0         0          0          0.002           0
 
Non-medium error count:        0
 
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -   41899                 - [-   -    -]
# 2  Background long   Completed                   -       0                 - [-   -    -]
# 3  Background short  Completed                   -       0                 - [-   -    -]
 
Long (extended) Self Test duration: 4700 seconds [78.3 minutes]
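
As a rough sketch, the headline fields of a SCSI-style report like the one above can be pulled out in Python. The function name and the choice of fields are assumptions made for this illustration and are not part of smartmontools; the sample text is an excerpt of the output above. Clean values here do not rule out a failing disk.

```python
import re

# Excerpt copied from the smartctl -a output above.
REPORT = """\
SMART Health Status: OK
Elements in grown defect list: 0
read:          0        0         0         0          0      17608.736           0
write:         0        0         0         9          9      12313.029           0
verify:        0        0         0         0          0          0.002           0
Non-medium error count:        0
"""


def summarize_scsi_smart(text):
    """Extract a few headline fields from SCSI-style smartctl -a output."""
    health = re.search(r"^SMART Health Status:\s*(.+)$", text, re.M)
    defects = re.search(r"^Elements in grown defect list:\s*(\d+)", text, re.M)
    uncorrected = {}
    for op, rest in re.findall(r"^(read|write|verify):\s+(.+)$", text, re.M):
        # The last column of each error-counter row is "Total uncorrected errors".
        uncorrected[op] = int(rest.split()[-1])
    return {
        "health": health.group(1).strip() if health else None,
        "grown_defects": int(defects.group(1)) if defects else None,
        "uncorrected": uncorrected,
    }


if __name__ == "__main__":
    print(summarize_scsi_smart(REPORT))
```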


OK, I was hoping to see a bit more information (see the example below), but I guess not all disks report the same info.

Anyway, I guess you're now looking at the right disk. Once you've copied all the data off, you could delete the partition table and re-partition and reformat the drive to see if that clears the problem. If it were me, I'd be ordering a replacement.
SMART Attributes Data Structure revision number: 32
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   192   191   063    Pre-fail  Always       -       12358
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       51
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   245   219   187    Pre-fail  Always       -       46238
  9 Power_On_Hours          0x0032   178   178   000    Old_age   Always       -       26238
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   253   253   000    Old_age   Always       -       82
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   052   044   000    Old_age   Always       -       48 (Lifetime Min/Max 21/49)
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   043   253   000    Old_age   Always       -       48
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       2936
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       1
202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
210 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
211 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
212 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
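
For an ATA-style attribute table like the example above, a small Python sketch can flag rows worth a closer look. The rule used here is an assumption for illustration: an attribute is flagged when its normalized VALUE is at or below a non-zero THRESH, or when one of a hand-picked set of sector counters (the `WATCHED` names, chosen for this example) has a non-zero raw value. The function names are hypothetical, and the sample rows are copied from the table above.

```python
# Three rows copied from the example attribute table above.
TABLE = """\
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0027   245   219   187    Pre-fail  Always       -       46238
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
"""

# Hand-picked counters for this sketch; adjust to taste.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_attributes(text):
    """Parse attribute rows into dicts; header and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        # Ten columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw (the raw column may itself contain spaces).
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "raw": parts[9],
        })
    return rows


def suspicious(rows):
    """Return names of attributes that trip either rule described above."""
    flagged = []
    for r in rows:
        below = r["thresh"] > 0 and r["value"] <= r["thresh"]
        raw_count = r["raw"].split()[0]
        nonzero = (r["name"] in WATCHED and raw_count.isdigit()
                   and int(raw_count) > 0)
        if below or nonzero:
            flagged.append(r["name"])
    return flagged


if __name__ == "__main__":
    print(suspicious(parse_attributes(TABLE)))
```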


What command are you using to get that information?
smartctl -t long?

Thanks!

In your case: smartctl -a /dev/sdc
At least you can see your disk is 6 years old... :-)
I'm afraid I have to go to my pit (it's a bit late). I hope you manage to back up the partition. I'll check back again later, but I'm sure someone else will pick up the Q while I'm offline.
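
The "6 years old" figure follows from the "Manufactured in week 39 of year 2003" line in the smartctl output and the 2009 report date. A quick Python sketch of that arithmetic, assuming the week number can be treated as an ISO week (the smartctl output does not say which week numbering the drive uses) and using hypothetical function names:

```python
import re
from datetime import date

# Both values are taken from the smartctl -a output posted earlier.
LINE = "Manufactured in week 39 of year 2003"
REPORT_DATE = date(2009, 11, 16)  # from "Local Time is: Mon Nov 16 ... 2009"


def manufacture_date(line):
    """Return the Monday of the manufacturing week named in the line."""
    m = re.search(r"Manufactured in week (\d+) of year (\d+)", line)
    if m is None:
        raise ValueError("no manufacture date found")
    week, year = int(m.group(1)), int(m.group(2))
    # Assumption: interpret the week number as an ISO week (Python 3.8+).
    return date.fromisocalendar(year, week, 1)


def age_in_years(made, today):
    """Approximate age in years, using an average year of 365.25 days."""
    return (today - made).days / 365.25


if __name__ == "__main__":
    made = manufacture_date(LINE)
    print(f"built around {made}, about {age_in_years(made, REPORT_DATE):.1f} years old")
```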
Many thanks!!!
I ended up replacing the disk ;)
Thanks again!