Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17


Proactive monitoring of ext3 file system

Posted on 2007-11-24
Medium Priority
Last Modified: 2008-02-01
I am admnistering an RHEL4 based linux cluster having 100 nodes and about a dozen NFS filservers (also RHEL4 boxes) with a total capacity of a few tens of TB. As usual, a few blocks keep going bad every now and then. What do I need to proactively monitor the storage system so that I could try recovery before it is too late and advise the right user, identifying files that might have been affected?
Question by:vinod
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
LVL 43

Expert Comment

ID: 20343738
> so that I could try recovery before it is too
What You mean - predicting the failure? Some "predictions" can be gathered from
smartctl -A /dev/sda

Author Comment

ID: 20345022
Our experience with SMART monitoring has not been very good. I noticed significant increase in file system corruptions while SMART service was enabled and we had to turn it off about a year ago. I could never understand how only a "monitoring" service could cause corruption. Even if it works fine, I believe SMART can give you only statistics of error counters which does not help in identifying the affected file(s).

My concern is slightly different. Disk blocks may go bad randomly. How will I find it unless user uses the file or I run fsck, both of which may not happen for a long time? Even at that time, the block may or may not be in correctable state. Is there something that I can do frequently, say every night, to check the file system consistency without umounting it and identifying files, not just bad blocks?
LVL 43

Accepted Solution

ravenpl earned 500 total points
ID: 20345089
You can run badblocks(RO mode) to get the list of bads - You already know.
No, there is no way(really) to match block to file (no such data in filesystem).
So what You can, is to get blocklist for given file.
So basically You can do it for each file/dir and check if one block is within bads set.
Unfortunatelly there is no tool for printing block list for given file.
filefrag -v # gives some information
is really for You, just modify the printed result
< $result .= $_ ? "*" : ".";
> $result .= "$_ ";
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

LVL 43

Expert Comment

ID: 20345671

Author Comment

ID: 20363287
fileblocks does not compile on my RHEL4 machine:

# gcc -o fileblocks fileblocks.c
/tmp/ccXzML8K.o(.text+0x121): In function `printOneFile':
: undefined reference to `_IO'
collect2: ld returned 1 exit status
LVL 43

Expert Comment

ID: 20366184
my bad, download again.

Author Comment

ID: 20496578
I think I will give my points to ravenpl for sending me useful utility although his solution won't be very practical for me. I was hoping that if a file gets corrupted due to a sector going bad randomly then OS should be able to catch it and I should get an alter rather than me skimming tens of tera byte myself. It seems the only way to do it will be to enable SMART which had its own problem when I tried it more than a year ago.
LVL 43

Expert Comment

ID: 20498148
> sector going bad randomly then OS should be able to catch it
But when OS have the chance spotting it? Only if the file is read. Reading block is not enough, cause there's no map from block to file. There's only the reverse map. At least for ext3.
Maybe one could investigate other filesystems internals (jfs, xfs, reiser*, zfs, etc.) - maybe they'r smarter than that.

Expert Comment

ID: 20545852
Forced accept.

EE Admin

Featured Post

Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

The business world is becoming increasingly integrated with tech. It’s not just for a select few anymore — but what about if you have a small business? It may be easier than you think to integrate technology into your small business, and it’s likely…
Compliance and data security require steps be taken to prevent unauthorized users from copying data.  Here's one method to prevent data theft via USB drives (and writable optical media).
This video teaches viewers how to encrypt an external drive that requires a password to read and edit the drive. All tasks are done in Disk Utility. Plug in the external drive you wish to encrypt: Make sure all previous data on the drive has been …
This Micro Tutorial will teach you how to reformat your flash drive. Sometimes your flash drive may have issues carrying files so this will completely restore it to manufacturing settings. Make sure to backup all files before reformatting. This w…

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question