Solved

script for checking md5 checksums

Posted on 2011-03-01
Last Modified: 2012-05-11
Hello,

I need to write a Perl script that compares md5sum checksums between two filesystems. The script should take the source and destination folders as command-line arguments and report whether the checksum of each file in the source folder matches that of the corresponding file in the destination folder. If a file in the source folder isn't present in the destination, it should report that as an error too. I came up with some logic and wrote the Perl code, but for some reason it isn't working. Can you look at it and tell me where I am going wrong, or give me new code?

It reports this error:

Syntax: # do_checksum.pl  <SOURCE folder>  <DESTINATION FOLDER>

#perl do_checksum.pl /smbnas/oralsb40/oralsb11 /smbnas/oralsb40/oralsb11_restore
awk: cmd. line:1: fatal: file `/smbnas/oralsb40/oralsb11_restore' is a directory



EXAMPLES OF MD5CHECKSUM OUTPUT:

[root@]# md5sum nassync.sh
cbc234736d28b3841a6013e968bd0706  nassync.sh
[root@]# md5sum nassync.sh | awk -F" " '{print $1}'
cbc234736d28b3841a6013e968bd0706
[root@oralsb11-new opt]#

cat do_checksum.pl
#!/usr/bin/perl
# Description: This script is for checking the Checksum values between files in 2 different filesystems


my $srcdir = $ARGV[1];
my $destdir = $ARGV[2];

system("ls -l | awk \-F\" \" \'\{print \$9\}\' $srcdir > SRCFILE");
system("ls -l | awk \-F\" \" \'\{print \$9\}\' $destdir > DESTFILE");

system("sort SRCFILE > SRCFILE_SORTED");
system("sort DESTFILE > DESTFILE_SORTED");

my @srcfile = `cat SRCFILE_SORTED`;
my @destfile = `cat DESTFILE_SORTED`;
print "@srcfile\n";

foreach $i(@srcfile)
{
 chomp($srcfile[$i]);
 chomp($destfile[$i]);
 if ( "$srcfile[$i]" eq "$destfile[$i]" )
 {
        my $md5src = `md5sum $srcfile[$i]| awk \-F\" \" \'\{print \$1\}\'`;
        my $md5dest = `md5sum $destfile[$i]| awk \-F\" \" \'\{print $1\}\'`;

        if ( "$md5src" eq "$md5dest" )
        { print "MD5 value for File: $i is same on Source and destination\n"; }
        else
        { print "MD5 value for File: $i is differs with Source and destination\n"; }
 }
else
 {
        print "Source file $srcfile[$i] is not present in destination folder\n";
 }

}

Question by:ashsysad
9 Comments
 
LVL 12

Expert Comment

by:mccracky
Does it have to be a Perl script? rsync already does essentially this. The -c option makes it compare files by checksum, and the "dry run" option (-n) reports what would change without actually transferring any files, so it should do what you need.
 

Author Comment

by:ashsysad
I'm fine with any solution. Please let me know how to do it. But since I started working in Perl, I guess it would be better to complete it.
 
LVL 76

Expert Comment

by:arnold
Do the two locations have an identical structure, such that a file in the source, if it exists, will be at the same place in the destination?

The problem is your use of foreach $i (@srcfile): $i is set to each element's value, not its index.
@srcfile = qw(a b c);
You are treating $i as though it holds the indexes 0, 1, 2, but with the example above foreach actually sets $i to "a", "b", "c" in turn.
You could use a while loop instead:
$i = 0;
while ($i <= $#srcfile) {
    # work with $srcfile[$i] here
    $i++;
}
 

Author Comment

by:ashsysad
@Arnold, I am still facing a problem. Please check my attached code.

Script result upon execution:
Please note that the source and destination file names aren't captured inside the while loop.

# ls -l test1 test2
test1:
total 0
-rw-r--r-- 1 root root 0 Mar  1 15:05 1
-rw-r--r-- 1 root root 0 Mar  1 15:05 2
-rw-r--r-- 1 root root 0 Mar  1 15:05 a
-rw-r--r-- 1 root root 0 Mar  1 15:05 b
-rw-r--r-- 1 root root 0 Mar  1 15:05 c
-rw-r--r-- 1 root root 0 Mar  1 15:05 d
-rw-r--r-- 1 root root 0 Mar  1 15:05 e
-rw-r--r-- 1 root root 0 Mar  1 15:05 f
-rw-r--r-- 1 root root 0 Mar  1 15:05 g

test2:
total 12
-rw-r--r-- 1 root root   0 Mar  1 15:05 1
-rw-r--r-- 1 root root   0 Mar  1 15:05 4
-rw-r--r-- 1 root root   0 Mar  1 15:05 a
-rw-r--r-- 1 root root   0 Mar  1 15:05 b
-rw-r--r-- 1 root root   0 Mar  1 15:05 d
-rw-r--r-- 1 root root 307 Mar  1 14:56 DESTFILE
-rw-r--r-- 1 root root 307 Mar  1 14:56 DESTFILE_SORTED
-rw-r--r-- 1 root root   0 Mar  1 15:05 e
-rw-r--r-- 1 root root   0 Mar  1 15:05 n
-rw------- 1 root root 211 Mar  1 14:56 nassync_ihot.log

# perl do_checksum.pl test1 test2

 1
 2
 a
 b
 c
 d
 e
 f
 g

 1
 4
 a
 b
 d
 DESTFILE
 DESTFILE_SORTED
 e
 n
 nassync_ihot.log
Source file is
Destination file is

#!/usr/bin/perl
# Description: This script is for checking the Checksum values between files in 2 different filesystems


my $srcdir = $ARGV[0];
my $destdir = $ARGV[1];

system("ls -l $srcdir | awk \-F\" \" \'\{print \$9\}\' > SRCFILE");
system("ls -l $destdir | awk \-F\" \" \'\{print \$9\}\' > DESTFILE");

system("sort SRCFILE > SRCFILE_SORTED");
system("sort DESTFILE > DESTFILE_SORTED");

my @srcfile = `cat SRCFILE_SORTED`;
my @destfile = `cat DESTFILE_SORTED`;
print "@srcfile";
print "@destfile";

my $j=0;
my $i=0;
my $k=0;

while($k<=$#srcfile)
{
 chomp($srcfile[$i]);
 chomp($destfile[$j]);
 print "Source file is $srcfile[$i]\n";
 print "Destination file is $destfile[$j] \n";
 if ( "$srcfile[$i]" eq "$destfile[$j]" )
 {
        my $md5src = `md5sum $srcfile[$i]| awk \-F\" \" \'\{print \$1\}\'`;
        my $md5dest = `md5sum $destfile[$j]| awk \-F\" \" \'\{print $1\}\'`;

        if ( "$md5src" eq "$md5dest" )
        { print "MD5 value for File: $srcfile[$i] is same on Source and destination\n"; }
        else
        { print "MD5 value for File: $srcfile[$i] is differs with Source and destination\n"; }
 }
else
 {
        print "Source file $srcfile[$i] is not present in destination folder\n";
 }
$j = $j + 1;
}


Author Comment

by:ashsysad
@mccracky, you mentioned a solution using rsync. Could you please explain it briefly?
 
LVL 12

Accepted Solution

by:mccracky (earned 500 total points)
It depends on what you want to do.

I see two solutions.

First, you can simply write a bash script. The basic steps would be:

1. cd $srcdir
2. find ./ -type f -exec md5sum {} \; > /tmp/srcsums.txt
3. cd $dstdir
4. md5sum -c < /tmp/srcsums.txt | grep -v " OK$" > /tmp/sums_not_equal.txt
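
The four steps above could be wrapped into one script along these lines. This is only a sketch: the function name compare_trees and the use of mktemp for the checksum list (instead of the fixed /tmp paths) are my own choices, and it assumes GNU md5sum and file names without newlines.

```shell
#!/bin/bash
# Sketch: report files whose MD5 differs between two trees, or that are
# missing from the destination. compare_trees is a hypothetical name.
compare_trees() {
    local srcdir=$1 dstdir=$2 sums
    sums=$(mktemp) || return 1

    # Steps 1-2: checksum every regular file under the source, with paths
    # recorded relative to the source directory.
    (cd "$srcdir" && find . -type f -exec md5sum {} + | sort -k 2) > "$sums"

    # Steps 3-4: verify the list from inside the destination and keep only
    # the lines that are not OK. A mismatch is reported as "FAILED" and a
    # missing file as "FAILED open or read". LC_ALL=C keeps the messages
    # untranslated; md5sum's warnings on stderr are discarded.
    (cd "$dstdir" && LC_ALL=C md5sum -c "$sums" 2>/dev/null | grep -v ': OK$')

    rm -f "$sums"
}

# Run only when both directories are given on the command line.
if [ "$#" -eq 2 ]; then
    compare_trees "$1" "$2"
fi
```

Saved under a name of your choosing, it would be run with the source and destination folders as its two arguments. Files that exist only in the destination are not reported, since the list is built from the source side.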

With rsync, the report is basically the output of the following command (again assuming that you want the whole tree):

rsync -rcnv $srcdir/ $destdir
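
If you only want the names of the affected files, without the header and summary lines that -v prints, something like the following may be easier to read. It is a sketch, not a tested recipe for your setup: rsync_report is a hypothetical function name, and it relies on rsync's --out-format option with the %n (file name) escape.

```shell
#!/bin/bash
# Sketch: list files that rsync would copy because they are missing from
# the destination or differ by checksum. Nothing is transferred (-n).
# rsync_report is a hypothetical name.
rsync_report() {
    local srcdir=$1 destdir=$2
    # -r recurse, -c compare by checksum, -n dry run;
    # --out-format='%n' prints one affected path per line.
    rsync -rcn --out-format='%n' "$srcdir"/ "$destdir"
}

# Run only when both directories are given on the command line.
if [ "$#" -eq 2 ]; then
    rsync_report "$1" "$2"
fi
```

An empty result means every file in the source has a copy with the same checksum in the destination.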



 
LVL 76

Expert Comment

by:arnold
First, you should not store the source/destination file lists in the same folder where you are collecting the data.

All the files in your example that are suitable for comparison are zero length, so their checksums are identical.

There is a Perl module, Digest::MD5, as well as a Perl implementation that performs the equivalent of md5sum.
cksum is an alternative.
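
As a rough illustration of the cksum route, the sketch below compares two files by their cksum output. The function name same_cksum is hypothetical. Keep in mind that cksum computes a CRC rather than an MD5 digest, so it catches accidental corruption but is not meant to resist deliberate tampering.

```shell
#!/bin/bash
# Sketch: succeed when two files have the same CRC and byte count
# according to cksum. same_cksum is a hypothetical helper name.
same_cksum() {
    local a b
    # Reading from stdin keeps the file name out of cksum's output,
    # so only the CRC and the size are compared.
    a=$(cksum < "$1") || return 2
    b=$(cksum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Run only when two file names are given on the command line.
if [ "$#" -eq 2 ]; then
    if same_cksum "$1" "$2"; then
        echo "cksum matches"
    else
        echo "cksum differs"
    fi
fi
```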

Are you familiar with hashes?

This is incomplete and untested as I have to run.
See if it helps you.
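use strict;
use warnings;

my ($sourcedir, $destinationdir) = @ARGV;

# Map each file name in the source folder to its md5sum output.
open (SRCFILE, "ls $sourcedir |") || die "Unable to list contents of $sourcedir: $!\n";
my %sourcefilehash;
while (<SRCFILE>) {
    chomp;
    next unless -f "$sourcedir/$_";    # skip subdirectories
    $sourcefilehash{$_} = `md5sum < "$sourcedir/$_"`;
}
close (SRCFILE);

# Do the same for the destination folder.
open (DSTFILE, "ls $destinationdir |") || die "Unable to list contents of $destinationdir: $!\n";
my %destinationfilehash;
while (<DSTFILE>) {
    chomp;
    next unless -f "$destinationdir/$_";    # skip subdirectories
    $destinationfilehash{$_} = `md5sum < "$destinationdir/$_"`;
}
close (DSTFILE);

foreach my $filename (sort keys %sourcefilehash) {
    if ( exists $destinationfilehash{$filename} ) {    # check if destination has this file
        if ( $sourcefilehash{$filename} eq $destinationfilehash{$filename} ) {    # compare the md5sum results for the file
            # do what you need: the two md5sums are the same
            print "MD5 value for file $filename is the same on source and destination\n";
        }
        else {
            # do what you need: the md5sum results do not match
            print "MD5 value for file $filename differs between source and destination\n";
        }
    }
    else {
        # the file does not exist at the destination
        print "Source file $filename is not present in destination folder\n";
    }
}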

 

Author Closing Comment

by:ashsysad
Your solution is simple and straightforward, and it worked for me. Thanks a lot!
 

Author Comment

by:ashsysad
@Arnold, thanks for taking the time to help me.
