Solved

comparing two reports

Posted on 2009-07-14
Last Modified: 2012-05-07
I need to compare two reports. Both reports contain customer information, but in a different order, so running the diff command reports almost the entire file as a difference. Can you please suggest a way to compare them?
Question by:saibsk
8 Comments
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 24850481
Sort the reports into temporary files and run diff against these.
You will at least see differing lines, although out of context.
Use sort [filename] > [temp_filename] to sort
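A minimal sketch of that sort-then-diff step, assuming a shell with mktemp available. The function name compare_sorted is made up for illustration. It only gives useful results when each customer record sits on a single line, because sort reorders individual lines.

```shell
# compare_sorted FILE1 FILE2  (hypothetical helper name)
# Sorts both files into temporary copies, then diffs the copies.
compare_sorted() {
    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }
    sort "$1" > "$tmp1"
    sort "$2" > "$tmp2"
    status=0
    diff "$tmp1" "$tmp2" || status=$?   # 0 = same lines, 1 = differences
    rm -f "$tmp1" "$tmp2"
    return "$status"
}

# Usage: compare_sorted report1.txt report2.txt
```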
 
LVL 3

Expert Comment

by:glenthorne
ID: 24850482
Long story short, you will need to reformat the data from both reports so that they are (or could be) the same.  I like to use awk to rip out and format the data, then use the sort command to put the data in the right order for both files, and then you can use diff to see if things really are different.

For further help, it would be nice if there were a snippet of each file, as well as which fields in each snippet are pertinent to the comparison.
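One way the awk step could look, as a sketch only. It assumes every record starts with a line beginning "Name:", as in the sample posted later in this thread; the function name flatten_records and the " | " separator are made up for illustration. Each record is squeezed onto one line with its blanks collapsed, so that sort and diff can then work line by line.

```shell
# flatten_records FILE  (hypothetical helper name)
# Prints one line per record; a record starts at each line beginning "Name:".
flatten_records() {
    awk '
        /^Name:/ { if (rec != "") print rec; rec = "" }   # a new record starts
        {
            gsub(/[ \t]+/, " ")              # collapse runs of blanks
            sub(/^ /, ""); sub(/ $/, "")     # trim both ends
            if ($0 == "") next               # skip empty lines
            if (rec == "" && $0 !~ /^Name:/) next   # skip header text before the first record
            rec = (rec == "" ? $0 : rec " | " $0)
        }
        END { if (rec != "") print rec }
    ' "$1"
}

# Usage:
#   flatten_records report1.txt | sort > report1.flat
#   flatten_records report2.txt | sort > report2.flat
#   diff report1.flat report2.flat
```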
 
LVL 39

Expert Comment

by:Adam314
ID: 24850902
Depending on how the files are organized, you might be able to write a perl script that would read both files, and compare based on a unique key, like customer id.  If you are interested in this, post a sample of the files.
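A rough sketch of a key-based comparison. It uses awk rather than the Perl script suggested above, purely to stay with standard shell tools. The assumptions: records start with a "Name:" line, the unique key is the second field of an "Account:" line (as in the sample posted later in this thread), and both files are non-empty. The function name compare_by_account and the output wording are made up for illustration.

```shell
# compare_by_account FILE1 FILE2  (hypothetical helper name)
# Reports accounts that are missing from one file or whose record text differs.
compare_by_account() {
    awk '
        function store() {
            if (acct != "") { rec[file, acct] = text; ids[acct] = 1 }
            text = ""; acct = ""
        }
        FNR == 1    { store(); file++ }      # first line of each input file
        /^Name:/    { store() }              # a new record begins
        /^Account:/ { acct = $2 }            # the unique key
        {
            line = $0
            gsub(/[ \t]+/, " ", line)        # ignore alignment changes
            sub(/^ /, "", line); sub(/ $/, "", line)
            if (line != "") text = text line "|"
        }
        END {
            store()
            for (id in ids) {
                if (!((1, id) in rec))             print "only in second: " id
                else if (!((2, id) in rec))        print "only in first: " id
                else if (rec[1, id] != rec[2, id]) print "differs: " id
            }
        }
    ' "$1" "$2" | sort
}

# Usage: compare_by_account report1.txt report2.txt
```

Unlike a plain diff, this names the affected accounts directly and prints nothing when the two reports hold the same records.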

Author Comment

by:saibsk
ID: 24852027
################Checking Data###################
Name: joe Smith
Account: 777333
City: New Jersey

07/13/2009         07/14/2009        
   MONDAY             TUESDAY        
 Withdrawal:      Deposit:
  $0.00           $0.00                                

Name: jack williams
Account: 777343
City: New Jersey

 07/13/2009         07/14/2009        
 MONDAY             TUESDAY            
 Withdrawal:      Deposit:                        
$1110.00           $2220.00



If one report has data like this, the other report has the same data, but it could be in a different order, e.g. jack williams would be first and joe Smith second.
When I sort the data, it does not come out correctly. Please advise.
 
LVL 84

Accepted Solution

by:
ozo earned 400 total points
ID: 24852147
You can sort the records like this:
 perl -e '$/="Name: ";chomp(@r=<>);s/\n+$// for @r;print "$/$_\n\n" for sort @r' <report >sortedreport
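The same record-wise sort can also be sketched with awk, sort, and tr. This is a variant, not the one-liner above: it copies any header lines before the first "Name:" line through unchanged instead of sorting them with the records. The function name sort_records is made up, and it assumes the byte \001 never occurs in the reports.

```shell
# sort_records FILE  (hypothetical helper name)
# Prints the header unchanged, then the records sorted by their "Name:" line.
sort_records() {
    # Header: every line before the first "Name:" line, copied unchanged.
    awk '/^Name:/ { exit } { print }' "$1"
    # Records: fold each record onto one line (newline becomes \001),
    # sort the folded lines, then unfold them again.
    awk '
        /^Name:/   { if (rec != "") print rec "\001"; rec = $0; gap = ""; next }
        rec == ""  { next }                      # still inside the header
        /^[ \t]*$/ { gap = gap "\001"; next }    # hold blank lines back
                   { rec = rec gap "\001" $0; gap = "" }
        END        { if (rec != "") print rec "\001" }
    ' "$1" | LC_ALL=C sort | tr '\001' '\n'
}

# Usage:
#   sort_records report1.txt > sorted1.txt
#   sort_records report2.txt > sorted2.txt
#   diff sorted1.txt sorted2.txt
```

Blank lines at the end of a record are dropped and each record is followed by exactly one blank line, so differing amounts of padding between records do not show up in the diff.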
 
LVL 40

Expert Comment

by:omarfarid
ID: 24852150
With this type of data, it is better to load the data into a database and then compare the records of both files.
 

Author Comment

by:saibsk
ID: 24852856
Hi ozo, your solution works for me, but I have two questions:
What about other lines in the file, like the headers?
Additionally, suppose one file contains data like this:
 MONDAY             TUESDAY    

and the other file contains the same info, but the data is shifted, something like:

     MONDAY             TUESDAY    
It still reports the line as a difference between the two files.

As long as the data in the lines matches, I don't want a difference to show because of spaces or a change in alignment. Please advise.
 
LVL 39

Assisted Solution

by:Adam314
Adam314 earned 100 total points
ID: 24853941
The code ozo gave will sort the file, keeping records (delimited by "Name:") together.
If you want to ignore whitespace when comparing, that would be a function of your differencing tool.  If your tool does not support this, you could replace each run of consecutive spaces and tabs with a single space:

Unix: perl -i.bak -pe 's/[ \t]+/ /g' filename.txt
Windows: perl -i.bak -pe "s/[ \t]+/ /g" filename.txt
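With diff itself, the -b option ignores changes in the amount of whitespace, and -w (available in GNU and BSD diff) ignores whitespace altogether. Where neither fits, a sketch along these lines normalizes temporary copies instead of editing the reports in place. The function names normalize_blanks and diff_normalized are made up for illustration, and mktemp is assumed to be available.

```shell
# normalize_blanks FILE  (hypothetical helper name)
# Collapses each run of whitespace to one space and trims both ends of every line.
normalize_blanks() {
    sed -e 's/[[:space:]]\{1,\}/ /g' -e 's/^ //' -e 's/ $//' "$1"
}

# diff_normalized FILE1 FILE2  (hypothetical helper name)
# Diffs whitespace-normalized temporary copies; the originals are not modified.
diff_normalized() {
    left=$(mktemp) || return 2
    right=$(mktemp) || { rm -f "$left"; return 2; }
    normalize_blanks "$1" > "$left"
    normalize_blanks "$2" > "$right"
    status=0
    diff "$left" "$right" || status=$?   # 0 = same, 1 = differences
    rm -f "$left" "$right"
    return "$status"
}

# Usage: diff_normalized sortedreport1 sortedreport2
```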
