Solved

Need a Shell script to compare files under the directories on AIX box.

Posted on 2012-03-26
Last Modified: 2012-08-13



The code contains two directories:
Java, which contains various subdirectories containing .java files, and
Oracle, which contains Oracle code in subdirectories, with files of type .sp, .spp, .pdc, .ty, .sql and .sf.

The above code is placed in dir1 on the AIX box at /path/to/dir1 (it contains no compiled code).
The old code on AIX is in dir2, which has the compiled files along with the .java files and the compiled Oracle code in the subfolders under both java and oracle.

The script should compare the .java files and Oracle code (.sp, .spp, .pdc, .ty, .sql, .sf) in the subdirectories under dir1 against dir2 (which contains many more java and oracle subdirectories).

The script should only show the files that are different and the differences in those files. The output can be written to a text file.

I tried the dircmp and diff commands with all their options, but they did not work.
Question by:raaj4354
14 Comments
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37766570
did you try dirdiff?
---

#!/bin/sh
cd /path/to/dir1/
for e in java sp spp pdc ty sql sf; do
    # note: this glob only matches files in the current directory, not in subdirectories
    for f in *.$e; do
          echo "=========== $f ">> /tmp/log
          diff "$f" "/path/to/dir2/$f" >> /tmp/log
    done
done
 

Author Comment

by:raaj4354
ID: 37767447
No, I have not tried dirdiff. I am not sure if it works on AIX. By dirdiff, did you mean the script here?

http://www.brunolinux.com/02-The_Terminal/DirDiff_Script.html

The above script is not working. It gives an error saying "diff: specify two file names."
 
LVL 19

Expert Comment

by:simon3270
ID: 37767747
I think you need a "find" in there.  For example:

#!/bin/sh
cd /path/to/dir1/
for e in java sp spp pdc ty sql sf; do
    find . -name "*.$e" | while read f; do
          echo "=========== $f ">> /tmp/log
          diff "$f" /path/to/dir2/"$f" >> /tmp/log
    done
done

 

Author Comment

by:raaj4354
ID: 37767881
simon3270: I tried the script. It did not throw any error, but the output file just had the contents below... :-(

-----------/path/to/dir1/
-----------.java
 
LVL 19

Expert Comment

by:simon3270
ID: 37768077
You need to change "/path/to/dir1" to the actual location of your "dir1"  and "/path/to/dir2" to your actual dir2.
 

Author Comment

by:raaj4354
ID: 37768168
I have changed the paths to the locations where the two directories are present, and changed the /tmp/log location to point to a text file, /path/to/textfile.txt.
Still nothing yet... am I missing something? Let me know if you need any more information.
 
LVL 19

Expert Comment

by:simon3270
ID: 37768216
Please cut and paste your actual script.

Also, go to dir1 and make sure that, for example, the following returns a list of files:
    find . -name "*.java"

 
LVL 51

Expert Comment

by:ahoffmann
ID: 37768377
find . -name '*.java' -o -name '*.sp' -o -name '*.spp' -o -name '*.pdc' -o -name '*.ty' -o -name '*.sql' -o -name '*.sf' -print0 |xargs -0 -n 1 diff /path/to/dir2/ >> /tmp/log
 
LVL 19

Expert Comment

by:simon3270
ID: 37768502
The one-liner is one way to do it, but it's harder to extend, and you are missing a couple of brackets and arguments.  Also you don't get a chance to print out the name of the file you are comparing.

find . \( -name '*.java' -o -name '*.sp' -o -name '*.spp' -o -name '*.pdc' -o -name '*.ty' -o -name '*.sql' -o -name '*.sf' \) -print0 |xargs -0 -I '{}' diff '{}' /path/to/dir2/'{}' >> /tmp/log
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37768577
Hmm, I don't know which version of find requires brackets for simple or'd arguments,
and I don't see any missing arguments. It's more a personal preference whether to use -n 1 or -I (which implies -L 1, but is more difficult to write/understand ;-)
Anyway, the result is the same.
 

Author Comment

by:raaj4354
ID: 37769493
#!/bin/sh
cd /path/to/dir1/
for e in java sp spp pdc ty sql sf; do
    find . -name "*.$e" | while read f; do
          echo "=========== $f ">> /tmp/log
          diff "$f" /path/to/dir2/"$f" >> /tmp/log
    done
done

I used the above script and it worked. Sorry, it was missing a quotation mark (my bad). But the output showed a long list of "MISSING NEW LINE AT THE END OF FILE" for all the java files under dir2. The log file was empty; nothing was written to it.

Then I took a matching java file from each of the folders and compared them in Notepad, and they matched. Then I used the cmp command to compare those two java files (on AIX), and it showed that there was a space in the second line of the file from dir2. I deleted the space and compared again on the AIX box using the cmp command, and then it showed they were the same.

Surprisingly, the diff and cmp commands behaved differently on the same set of files.

Is there a way to ignore white space and compare only the code in the two directories?
 
LVL 51

Expert Comment

by:ahoffmann
ID: 37769881
> Surprisingly the diff and cmp command worked differently on the same set of files.
No surprise, as diff is line/word based and cmp does a binary compare.

> Is there a way to ignore white spaces and compare only the code in two directories..?
man diff

see -w -W -B ...
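
A minimal sketch of the difference between the two tools, using two throwaway files that differ only in whitespace. The file names and the work directory are made up for illustration, and it assumes the local diff supports -w (check "man diff" on the AIX box):

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
work=${TMPDIR:-/tmp}/wsdemo.$$
mkdir -p "$work"
printf 'int x = 1;\n'   > "$work/a.txt"
printf 'int  x = 1; \n' > "$work/b.txt"   # extra spaces only

# cmp compares byte by byte, so the extra spaces count as a difference.
if cmp -s "$work/a.txt" "$work/b.txt"; then
    cmp_result=same
else
    cmp_result=different
fi

# diff -w ignores whitespace when comparing lines.
if diff -w "$work/a.txt" "$work/b.txt" > /dev/null; then
    diff_w_result=same
else
    diff_w_result=different
fi

echo "cmp: $cmp_result, diff -w: $diff_w_result"
rm -rf "$work"
```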
 
LVL 19

Expert Comment

by:simon3270
ID: 37770022
@ahoffmann - you need brackets in this particular "find" because of the "-print0" at the end. In "find", an "and" operator has higher precedence than "or", so almost any "find" which includes "and" and "or" operators will need brackets to make sure that the right parameters are linked together. The implied "and" between the last "or" expression (the -name "*.sf") and the -print0 means that the -print0 is only executed if the name matches *.sf. If it matches any other pattern in the list, the "find" does indeed end up with a "true" status for that file, but it doesn't apply its default "print" option because you have explicitly included a "print" argument.

As for "-n 1" and "-I '{}'", the "-n 1" will indeed run the command once for each input file, but it will simply add the filename to the end of the command. In your case, this will run "diff /path/to/dir2/ java/FoundFile.java", which does not name the matching file under dir2 (as far as I know, diff would then look for FoundFile.java directly in /path/to/dir2, dropping the subdirectory). I used "-I '{}'" because I needed to include the file name twice in the command, for example "diff java/FoundFile.java /path/to/dir2/java/FoundFile.java".

(Sorry this is so long - there are subtleties here which I couldn't describe more briefly!)
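
A small sketch of both points, run in a scratch directory. The directory and the file names A.java and B.sf are invented for the demonstration, "-print" stands in for "-print0" (the precedence rule is the same), and "echo" is placed in front of "diff" so the constructed command line is printed rather than run:

```shell
#!/bin/sh
# Hypothetical scratch directory with one file of each kind.
work=${TMPDIR:-/tmp}/finddemo.$$
mkdir -p "$work"
touch "$work/A.java" "$work/B.sf"

# Without brackets, -print is "and"-ed only with the last -name test.
no_brackets=$(cd "$work" && find . -name '*.java' -o -name '*.sf' -print | sort)

# With brackets, -print applies to the whole "or" group.
with_brackets=$(cd "$work" && find . \( -name '*.java' -o -name '*.sf' \) -print | sort)

# xargs -n 1 appends the file name once, at the end of the command.
appended=$(echo ./A.java | xargs -n 1 echo diff /path/to/dir2/)

# xargs -I substitutes the file name wherever the placeholder appears.
substituted=$(echo ./A.java | xargs -I '{}' echo diff '{}' /path/to/dir2/'{}')

printf 'no brackets:\n%s\nwith brackets:\n%s\n' "$no_brackets" "$with_brackets"
printf '%s\n%s\n' "$appended" "$substituted"
rm -rf "$work"
```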
 
LVL 19

Accepted Solution

by:
simon3270 earned 500 total points
ID: 37770093
It sounds as though the files have some differences between them, but you want to ignore those differences (whitespace, line endings and so on).

There are several things you can do.  As @ahoffmann says, options to "diff" will ignore whitespace. -b treats all whitespace sequences as being the same. -w seems to ignore all whitespace, so that if one file has whitespace between two words and the other has those words next to each other, -w would say they are the same. I don't think -W and -B are in AIX.

You might consider pre-formatting the files before you compare them.  For example, this will avoid your "missing newline" message:
    grep '^' "$f" > /tmp/file1
    grep '^' /path/to/dir2/"$f" > /tmp/file2
    diff -w /tmp/file1 /tmp/file2 >> /tmp/log
The missing newline is probably from a file edited in Windows. Windows editors seem happy to end a file at the last character on the last line, while UNIX/Linux editors will usually add a newline character to that last line. The "grep" command shown will find every line in the file, and has the side effect that it adds that missing newline.
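
A quick way to see that side effect, as a sketch with a made-up file that ends without a newline (the scratch directory and file names are hypothetical):

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/nldemo.$$
mkdir -p "$work"

# 17 bytes: two lines, the second one without a terminating newline.
printf 'line one\nline two' > "$work/no_newline.txt"

# grep '^' matches every line and writes each one out newline-terminated.
grep '^' "$work/no_newline.txt" > "$work/fixed.txt"

# Compare the byte counts before and after the filter.
before=$(wc -c < "$work/no_newline.txt" | tr -d ' ')
after=$(wc -c < "$work/fixed.txt" | tr -d ' ')
echo "bytes before: $before, bytes after: $after"
rm -rf "$work"
```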

I have often found that one set of files has UNIX line ends (line feed) while the others have DOS/Windows line ends (carriage return + line feed).  A "dos2unix" command will sort out the DOS files, or use "tr -d '\015\032'" as a filter added to the above grep commands. The \015 will remove carriage returns, and the \032 will remove the trailing Control-Z that DOS used to add to the end of text files.
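
As a sketch of that filter on a made-up DOS-style file (scratch directory and file names are hypothetical), the output can be checked byte for byte against the expected UNIX version:

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/crdemo.$$
mkdir -p "$work"

# A DOS-style file: CR+LF line ends plus a trailing Control-Z (octal 032).
printf 'first\r\nsecond\r\n\032' > "$work/dos.txt"

# Strip carriage returns (015) and Control-Z (032).
tr -d '\015\032' < "$work/dos.txt" > "$work/unix.txt"

# What the file should look like with plain LF line ends.
printf 'first\nsecond\n' > "$work/expected.txt"

if cmp -s "$work/unix.txt" "$work/expected.txt"; then
    tr_result=match
else
    tr_result=mismatch
fi
echo "tr filter result: $tr_result"
rm -rf "$work"
```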

I have also used this technique if I *know* there are differences between the files.  If I know that one set says "ERRORLOGFILE" and the other says "ERRORFILE", I would use "sed" to change the text in one file so that it matches the value in the other file in the /tmp file, so that the "diff" command would only find other differences, not the ones I know about.  If, for example, the dir1 files were DOS format with ERRORLOGFILE, and the dir2 ones were UNIX format with ERRORFILE, I might use:
    tr -d '\015\032' < "$f" | grep '^' | sed 's/ERRORLOGFILE/ERRORFILE/g' > /tmp/file1
    grep '^' /path/to/dir2/"$f" > /tmp/file2
    diff /tmp/file1 /tmp/file2 >>/tmp/log
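
A self-contained sketch of that masking step, with two invented files standing in for the dir1 and dir2 versions (all file names and contents here are hypothetical):

```shell
#!/bin/sh
work=${TMPDIR:-/tmp}/maskdemo.$$
mkdir -p "$work"

# "New" file: DOS line ends and the name ERRORLOGFILE.
printf 'open ERRORLOGFILE\r\nclose ERRORLOGFILE\r\n' > "$work/new.txt"
# "Old" file: UNIX line ends and the name ERRORFILE.
printf 'open ERRORFILE\nclose ERRORFILE\n' > "$work/old.txt"

# Normalise line ends, then rewrite the known difference before comparing.
tr -d '\015\032' < "$work/new.txt" | grep '^' |
    sed 's/ERRORLOGFILE/ERRORFILE/g' > "$work/file1"
grep '^' "$work/old.txt" > "$work/file2"

# With the known difference masked, diff should report nothing.
if diff "$work/file1" "$work/file2" > /dev/null; then
    masked_result=same
else
    masked_result=different
fi
echo "after masking: $masked_result"
rm -rf "$work"
```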

Unless you *know* that all files in dir1 have an equivalent in dir2, you might want to add a test.  This would be something like:

    find . -name "*.$e" | while read f
    do
        echo "======= $f" >> /tmp/log
        if [ -f /path/to/dir2/"$f" ]
        then
            : do diff as above
        else
            echo "ERROR: Cannot find file $f in /path/to/dir2" >> /tmp/log
        fi
    done

One last change would be to add diff's error messages to the log file - do this (assuming you have used the above "grep" commands) with:
    diff /tmp/file1 /tmp/file2 >> /tmp/log 2>&1
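
Putting the pieces together, one possible shape for the whole script is sketched below. It is not a tested AIX implementation: the function name compare_trees, its three arguments and the temporary file names are invented here, it assumes "diff -w" is available, and it expects absolute paths because it changes directory. Unlike the loop above, it writes the "===" header only for files that actually differ, which matches the original request to show only the different files.

```shell
#!/bin/sh
# compare_trees NEW_DIR OLD_DIR LOG  (hypothetical helper, absolute paths)
# Compares source files under NEW_DIR with the same relative paths under
# OLD_DIR, ignoring whitespace, CR and Control-Z, and logs only differences.
compare_trees() {
    new_dir=$1
    old_dir=$2
    log=$3
    tmp1=${TMPDIR:-/tmp}/cmp1.$$
    tmp2=${TMPDIR:-/tmp}/cmp2.$$
    out=${TMPDIR:-/tmp}/cmpout.$$
    : > "$log"                          # start with an empty log
    (
        cd "$new_dir" || exit 1
        for e in java sp spp pdc ty sql sf; do
            find . -type f -name "*.$e" | while read -r f; do
                if [ ! -f "$old_dir/$f" ]; then
                    echo "ERROR: Cannot find file $f in $old_dir" >> "$log"
                    continue
                fi
                # Normalise line ends and add any missing final newline.
                tr -d '\015\032' < "$f"          | grep '^' > "$tmp1"
                tr -d '\015\032' < "$old_dir/$f" | grep '^' > "$tmp2"
                # Log the header and the diff output only on a difference
                # (or a diff error, captured through 2>&1).
                diff -w "$tmp1" "$tmp2" > "$out" 2>&1 || {
                    echo "=========== $f" >> "$log"
                    cat "$out" >> "$log"
                }
            done
        done
    )
    rm -f "$tmp1" "$tmp2" "$out"
}
```

It would be called as, for example, compare_trees /path/to/dir1 /path/to/dir2 /tmp/log, with the paths replaced by the real locations.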