How can I monitor file integrity of thousands of digital library resources?

I work the UC Berkeley in the Library Systems Office.  Our main purpose is to host digital collections and provide end users with the tools to create them and the storage to place the files.  After 20 years of host these types of projects and several server changes and file moves we have come to the realization that we need to be able to monitor all of these files to make sure that they don't become corrupt and if they do we need to know so we can replace the corrupted file with a previous version.

Does anyone know of an open-source solution to this problem?  Any and all help is appreciated!
rump0054Asked:
Who is Participating?

Improve company productivity with a Business Account.Sign Up

x
 
HapexamendiosConnect With a Mentor Commented:
Hi rump0054,

I sense Tripwire comes up because whilst what you're essentially looking at is File Integrity Monitoring for a support/service reason, whereas it's more commonly undertaken for security reasons.

Tripwire would do, as would OSSEC (a Host-based Intrusion Detection System), in that they contin the existing logic for performing checks against MD5 or SHA-1 checksums. Whilst they might ne OTT for your initial needs, consider that you can disable the functionality you don't need, leaving you with just FIM - and you (hopefully) know where all your content is, which is teh task most people find so difficult in setting up FIM for security reasons.

We elected to go for a commercial product for our needs, called LogRhythm - our need was security, and LR ticked a lot of the logging and other requirements we had - but for your case I'd say one of these might be a good bet.

Best of luck whichever way you go.
0
 
Dave BaldwinFixer of ProblemsCommented:
'md5sum' http://en.wikipedia.org/wiki/Md5sum is frequently used and as the article notes, it is part of many operating systems.  The MD5 sum is often found on web sites for files to be downloladed so you can verify your downloads.  There is also sha1sum http://en.wikipedia.org/wiki/Sha1sum .
0
 
rump0054Author Commented:
I did do a application planning phase for this project and I was planning on using MD5 so it's good to hear that others agree that MD5 is appropriate for this purpose.

My main request is if anyone knows of any open source solution, something like Tripwire for example, so I wouldn't have to actually create the application myself.

BTW: The reason I list Tripwire but planning on using it is that it seems to be overkill for what my simple needs are but if someone has experience with it that shows otherwise I could just give that another try as well...
0
 
HapexamendiosCommented:
Hi rump0054,

I sense Tripwire comes up because whilst what you're essentially looking at is File Integrity Monitoring for a support/service reason, whereas it's more commonly undertaken for security reasons.

Tripwire would do, as would OSSEC (a Host-based Intrusion Detection System), in that they contin the existing logic for performing checks against MD5 or SHA-1 checksums. Whilst they might ne OTT for your initial needs, consider that you can disable the functionality you don't need, leaving you with just FIM - and you (hopefully) know where all your content is, which is teh task most people find so difficult in setting up FIM for security reasons.

We elected to go for a commercial product for our needs, called LogRhythm - our need was security, and LR ticked a lot of the logging and other requirements we had - but for your case I'd say one of these might be a good bet.

Best of luck whichever way you go.
0
 
rump0054Author Commented:
I actually had stumbled across OSSEC in my research and it sounded like it would work.  Good to hear it from another source.  I'll take a look at it again.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.