bash script - Part 1a - mod to compare (diff) files in different folders

Ref: http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_28489050.html
From this previous question, I got the following code, which generally works well. However, when I run it on a BASE and a TEST folder containing one file each, and the two files differ in size, I get no results. I believe the same problem occurs with, say, two files in each folder when no two files have the same size. Could you tweak the script to handle this case?

The script also prints these messages:
grep: same1.txt: No such file or directory
grep: same2.txt: No such file or directory

Here is the script:
# set paths
BASE=/path/to/base
TEST=/path/to/test

# get 2 file lists (size name)
ls -lS $BASE | awk '{print $5 " " $9}' > base.txt
ls -lS $TEST | awk '{print $5 " " $9}' > test.txt

# loop through BASE, find same file sizes in TEST
cat base.txt | while read line
do
  s1=$(echo $line | awk '{print $1}')
  if grep -q $s1 test.txt
  then
    echo $line >> same1.txt
    grep $s1 test.txt >> same2.txt
  fi;
done;

# create files with different sizes
grep -v -f same1.txt base.txt > diff1.txt
grep -v -f same2.txt test.txt > diff2.txt

# do the diffs
echo "Diffing files with same size"
paste same1.txt same2.txt | while read line
do
s1=$(echo $line | awk '{print $2}')
s2=$(echo $line | awk '{print $4}')
diff $s1 $s2
done;

echo "Diffing files with different size"
paste diff1.txt diff2.txt | while read line
do
s1=$(echo $line | awk '{print $2}')
s2=$(echo $line | awk '{print $4}')
diff $s1 $s2
done;


phoffric asked:
 
Gerwin Jansen, EE MVE, Topic Advisor, commented:
If we check that same1.txt exists and is not empty (the -s test) before creating the diff files, you don't get the two grep errors:

# check if any files with same size
if [ -s same1.txt ]
then
      # create files with different sizes
      grep -v -f same1.txt base.txt > diff1.txt
      grep -v -f same2.txt test.txt > diff2.txt

      # do the diffs
      echo "Diffing files with same size"
      paste same1.txt same2.txt | while read line
      do
            s1=$(echo $line | awk '{print $2}')
            s2=$(echo $line | awk '{print $4}')
            diff $s1 $s2
      done;
fi

And if same1.txt is missing or empty (in which case there is no same2.txt either), we just pair up base.txt and test.txt directly:

# check which files contain different sized files
if [ ! -s same1.txt ]
then
      f1=base.txt
      f2=test.txt
else
      f1=diff1.txt
      f2=diff2.txt
fi

echo "Diffing files with different size"
paste $f1 $f2 | while read line
do
      s1=$(echo $line | awk '{print $2}')
      s2=$(echo $line | awk '{print $4}')
      diff $s1 $s2
done;
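
One detail the snippets above leave implicit: in the original script same1.txt and same2.txt are only ever appended to (>>), so lists left over from an earlier run could make the -s test succeed when the current run found no matching sizes. A minimal sketch of one way to guard against that, assuming the list files live in the current directory. The helper names reset_lists and pick_lists are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: remove list files left behind by an earlier run,
# because the matching loop only appends (>>) to same1.txt and same2.txt.
reset_lists() {
  rm -f same1.txt same2.txt diff1.txt diff2.txt
}

# Hypothetical helper: print the two list files to feed into the
# "different size" loop. Uses the same -s test as above, which is
# true only if the file exists and is not empty.
pick_lists() {
  if [ -s same1.txt ]; then
    echo "diff1.txt diff2.txt"
  else
    echo "base.txt test.txt"
  fi
}
```

The idea would be to call reset_lists before the size-matching loop, then read the result of pick_lists into f1 and f2 in place of the if/else block.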
 
Duncan Roe, Software Developer, commented:
Perhaps you need to re-jig the script so as to do something more general than work from 2 lists. I don't have time to code anything right now but my approach would be:
Have 2 lists of files, sorted by size (as now)
Work through files in one of the lists individually
If there's an equal size file in 2nd list, compare against it and you are done
OTHERWISE
locate next-smaller file and count # lines in diff
locate next-larger file and count # lines in diff
if the above 2 steps only find one file (e.g. no smaller file), report comparison against that file
Otherwise report the smaller diff (or maybe both, depending ...)
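
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function names list_sizes, candidates and best_match are hypothetical, file names are assumed to contain no whitespace (the same limitation as the original script), and "smaller diff" is taken to mean fewer lines of diff output. The awk scan does not rely on the list being sorted.

```shell
#!/bin/bash
# list_sizes DIR: print "size name" for each regular file in DIR.
list_sizes() {
  local f
  for f in "$1"/*; do
    if [ -f "$f" ]; then
      printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    fi
  done
}

# candidates SIZE LIST: LIST holds "size name" lines. Print the name of
# the first equal-size file if there is one; otherwise print the names
# of the next-smaller and next-larger files (whichever exist).
candidates() {
  awk -v s="$1" '
    $1 == s { print $2; found = 1; exit }
    $1 < s && (lo == "" || $1 > losize) { lo = $2; losize = $1 }
    $1 > s && (hi == "" || $1 < hisize) { hi = $2; hisize = $1 }
    END {
      if (!found) {
        if (lo != "") print lo
        if (hi != "") print hi
      }
    }
  ' "$2"
}

# best_match BASEFILE TESTDIR LIST: diff BASEFILE against each candidate
# in TESTDIR and print "difflines name" for the shortest diff.
best_match() {
  local size name
  size=$(wc -c < "$1" | tr -d ' ')
  candidates "$size" "$3" | while read -r name; do
    printf '%s %s\n' "$(diff "$1" "$2/$name" | wc -l | tr -d ' ')" "$name"
  done | sort -n | head -n 1
}
```

A caller would build the list once with list_sizes "$TEST" > test.lst and then run best_match "$BASE/$name" "$TEST" test.lst for each file in BASE. A result of 0 lines means the two files are identical.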
 
phoffric (Author) commented:
I reviewed your code, and I get the gist of it. I will try to implement it in the next week. Thanks again.
 
phoffric (Author) commented:
@Duncan Roe,
Sorry about the title confusion. The titles now put the Part number early for easier visibility. I believe your comment belongs in Part 2:
   http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_28490563.html

This Part 1a was intended to be just a relatively easy update to the original question, handling the specific case where all the file sizes in BASE and TEST are different.
 
Gerwin Jansen, EE MVE, Topic Advisor, commented:
>> Thanks again.
No problem. If you can post some (redacted) samples next week, we can do some testing for you.
 
Duncan Roe, Software Developer, commented:
OK - re-posted. Doesn't look so pretty though :-/
Phoffric, could you possibly tar up sample TEST & BASE directories and post as a file attachment? (assuming they're not confidential).
Thanks ... Duncan.
 
phoffric (Author) commented:
Yeah, unfortunately, I am not allowed to present actual files. I would have to generate by hand some look-alikes, which I will be more than pleased to do.
 
Duncan Roe, Software Developer, commented:
If you would be so kind as to spend the time to do so, that would be terrific. You could make them differ in a way that mirrors how your production files do, which the rest of us can only guess at.
 
phoffric (Author) commented:
Worked this weekend so no time. Will have free time soon.
 
Gerwin Jansen, EE MVE, Topic Advisor, commented:
No problem. Just open a new question when needed.
Question has a verified solution.
