Solved

comparing two reports

Posted on 2009-07-14
Last Modified: 2012-05-07
I need to compare two reports. Both reports contain customer information, but in a different order. If I run the diff command, it reports almost the entire file as a difference because of the ordering. Can you please suggest a way to compare them?
Question by:saibsk
8 Comments
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 24850481
Sort the reports into temporary files and run diff against these.
You will at least see differing lines, although out of context.
Use sort [filename] > [temp_filename] to sort
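
The sort-then-diff approach can be sketched as follows. The file names and sample lines are made up for illustration, and the sketch assumes each customer occupies a single line (for multi-line records a plain line sort scatters the record).

```shell
# Hypothetical sample data: the same lines in a different order.
tmp=$(mktemp -d)
printf 'b\na\nc\n' > "$tmp/report1"
printf 'c\nb\na\n' > "$tmp/report2"

# Sort each report into a temporary file.
sort "$tmp/report1" > "$tmp/report1.sorted"
sort "$tmp/report2" > "$tmp/report2.sorted"

# diff exits with status 0 when the sorted files are identical.
if diff "$tmp/report1.sorted" "$tmp/report2.sorted" > "$tmp/diff.out"; then
  result="same"
else
  result="different"
fi
echo "$result"
```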
0
 
LVL 3

Expert Comment

by:glenthorne
ID: 24850482
Long story short, you will need to reformat the data from both reports so that they are (or could be) the same.  I like to use awk to rip out and format the data, then use the sort command to put the data in the right order for both files, and then you can use diff to see if things really are different.

For further help, it would be nice if there were a snippet of each file, along with which fields in each snippet are pertinent to the comparison.
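
A minimal sketch of that awk, sort, diff pipeline, assuming records made of labelled lines such as "Name:", "Account:" and "City:" (the layout of the sample posted further down this thread). The flatten function name, the file names and the "|" separator are illustrative choices, not part of the original answer.

```shell
# Hypothetical sample reports holding the same records in a different order.
tmp=$(mktemp -d)
cat > "$tmp/report1" <<'EOF'
Name: joe Smith
Account: 777333
City: New Jersey

Name: jack williams
Account: 777343
City: New Jersey
EOF
cat > "$tmp/report2" <<'EOF'
Name: jack williams
Account: 777343
City: New Jersey

Name: joe Smith
Account: 777333
City: New Jersey
EOF

# Reduce each record to one "account|name|city" line, then sort the lines.
flatten() {
  awk '
    /^Name:/    { sub(/^Name:[ \t]*/, "");    name = $0; next }
    /^Account:/ { sub(/^Account:[ \t]*/, ""); acct = $0; next }
    /^City:/    { sub(/^City:[ \t]*/, "");    print acct "|" name "|" $0 }
  ' "$1" | sort
}

flatten "$tmp/report1" > "$tmp/flat1"
flatten "$tmp/report2" > "$tmp/flat2"
diff "$tmp/flat1" "$tmp/flat2" && echo "reports match"
```

Putting the account number first means the sort order follows the account, which is usually more stable than the name.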
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24850902
Depending on how the files are organized, you might be able to write a Perl script that reads both files and compares them based on a unique key, such as the customer ID.  If you are interested in this, post a sample of the files.
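
The keyed comparison could look roughly like this. It is sketched in awk rather than Perl, and it assumes each record has already been reduced to one "key|rest" line with the customer ID in the first field. The compare_by_key name and the sample data are hypothetical.

```shell
# Hypothetical one-line-per-customer files, keyed on the first "|" field.
tmp=$(mktemp -d)
printf '1|joe|NJ\n2|jack|NJ\n3|ann|NY\n' > "$tmp/first"
printf '2|jack|NY\n1|joe|NJ\n4|bob|CA\n' > "$tmp/second"

compare_by_key() {
  awk -F'|' '
    NR == FNR        { first[$1] = $0; next }              # load file 1 by key
    !($1 in first)   { print "only in second: " $0; next }
    first[$1] != $0  { print "changed: " first[$1] " -> " $0 }
                     { seen[$1] = 1 }                      # key exists in both
    END {
      for (k in first)
        if (!(k in seen)) print "only in first: " first[k]
    }
  ' "$1" "$2"
}

compare_by_key "$tmp/first" "$tmp/second" > "$tmp/keyed.out"
cat "$tmp/keyed.out"
```

Because records are matched by key, the order of lines in either file does not matter.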
0
 

Author Comment

by:saibsk
ID: 24852027
################Checking Data###################
Name: joe Smith
Account: 777333
City: New Jersey

07/13/2009         07/14/2009        
   MONDAY             TUESDAY        
 Withdrawal:      Deposit:
  $0.00           $0.00                                

Name: jack williams
Account: 777343
City: New Jersey

 07/13/2009         07/14/2009        
 MONDAY             TUESDAY            
 Withdrawal:      Deposit:                        
$1110.00           $2220.00



If one report has the data like this, the other report has the same data but possibly in a different order, e.g. jack williams would be first and joe Smith second.
When I sort the data, it does not come out correctly. Please advise.
0
 
LVL 84

Accepted Solution

by:
ozo earned 400 total points
ID: 24852147
You can sort like
 perl -e '$/="Name: ";chomp(@r=<>);s/\n+$// for @r;print "$/$_\n\n" for sort @r' <report >sortedreport
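
For readers without Perl, the same idea (treat everything from one "Name:" line up to the next as a single record, then sort whole records) can be sketched with awk, sort and tr. This is a rough equivalent, not ozo's code: it assumes the byte \001 never occurs in the reports, and it drops any header lines that come before the first "Name:" line. The sort_records name and the sample file are illustrative.

```shell
tmp=$(mktemp -d)
cat > "$tmp/report" <<'EOF'
################Checking Data###################
Name: joe Smith
Account: 777333

  $0.00           $0.00

Name: jack williams
Account: 777343

$1110.00           $2220.00
EOF

sort_records() {
  awk '
    BEGIN      { SEP = "\001" }                 # stands in for a newline
    /^Name:/   { if (rec != "") print rec SEP   # flush the previous record
                 rec = $0; pending = ""; next }
    rec == ""  { next }                         # skip header lines
    /^[ \t]*$/ { pending = pending SEP; next }  # hold back blank lines
               { rec = rec pending SEP $0; pending = "" }
    END        { if (rec != "") print rec SEP }
  ' "$1" | LC_ALL=C sort | tr '\001' '\n'
}

sort_records "$tmp/report" > "$tmp/sortedreport"
cat "$tmp/sortedreport"
```

Blank lines at the end of a record are discarded and replaced by exactly one blank line, so the last record in a file compares equal to the same record in the middle of another file.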
0
 
LVL 40

Expert Comment

by:omarfarid
ID: 24852150
With this type of data, it is better to load the data into a database and then compare the records from both files.
0
 

Author Comment

by:saibsk
ID: 24852856
Hi ozo, your solution works for me, but I have two questions:
What about other lines in the file, like the headers?
Additionally, suppose one file contains data like this:
 MONDAY             TUESDAY    

and the other file contains the same info, but shifted, something like:

     MONDAY             TUESDAY    

It still reports the line as a difference between the two files.

As long as the data in the lines matches, I don't want a difference to show because of spaces or a change in alignment. Please advise.
0
 
LVL 39

Assisted Solution

by:Adam314
Adam314 earned 100 total points
ID: 24853941
The code ozo gave will sort the file, keeping records (delimited by "Name:") together.  
If you want to ignore whitespace when comparing, that would be a function of your differencing tool.  If your tool does not support this, you could replace each run of consecutive spaces or tabs with a single space before comparing (-p makes perl loop over the lines and print them, the g flag replaces every run on a line, and the original is kept as filename.txt.bak).

Unix: perl -i.bak -pe 's/[ \t]+/ /g' filename.txt
Windows: perl -i.bak -pe "s/[ \t]+/ /g" filename.txt
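
Both routes can be sketched in shell. The first relies on the -w option, which GNU and BSD diff provide (check your own diff's manual). The second normalises each file with sed before comparing. The normalise name and the sample lines are illustrative.

```shell
# Hypothetical sample lines that differ only in spacing.
tmp=$(mktemp -d)
printf ' MONDAY             TUESDAY\n' > "$tmp/a"
printf '     MONDAY      TUESDAY    \n' > "$tmp/b"

# Option 1: ask diff to ignore all whitespace.
if diff -w "$tmp/a" "$tmp/b" > /dev/null; then
  w_result="same"
else
  w_result="different"
fi
echo "diff -w: $w_result"

# Option 2: trim both ends of every line and squeeze inner runs of
# whitespace to a single space, then compare the normalised copies.
normalise() {
  sed -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e 's/[[:space:]][[:space:]]*/ /g' "$1"
}
normalise "$tmp/a" > "$tmp/a.norm"
normalise "$tmp/b" > "$tmp/b.norm"
diff "$tmp/a.norm" "$tmp/b.norm" && echo "normalised files match"
```

Trimming the line ends as well as squeezing inner runs matters when one report indents a line and the other does not.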

0
