Solved

Perl script to compare two csv files.

Posted on 2007-08-01
Last Modified: 2010-06-12
Subject: Perl script to compare two csv files.
I have two csv files named test1.csv and test2.csv
The data in the two files is in the following format.
string1              string2        string3
name(1)='abc'        n(1)=ab        num(1)=-20.123
name(2)='def'        n(2)=cd        num(2)=-30.134

...
name(999)='ghi'      n(999)=jk      num(999)=-100.23

The output file should contain only the records that differ between the two files,
in the following format:
test1.csv                                     test2.csv
string1        string2         string3        string1       string2       string3


Thanks in advance.
Begunix
Question by:Begunix
 
LVL 39

Expert Comment

by:Adam314
ID: 19612223
Are these supposed to be the lines of the file?  This doesn't look like csv data.
name(1)='abc'        n(1)=ab        num(1)=-20.123
name(2)='def'        n(2)=cd        num(2)=-30.134

Is it tab separated?  Are records identified by their number (e.g., the number in parens)?  Or by entire lines?
 
LVL 1

Expert Comment

by:khota001
ID: 19613917
I am assuming you want to compare test1 and test2 line by line, find the differences, and print the lines that differ (comparing string1 to string1, and so on).

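# Read both files into arrays
open(T1, "<", "test1.csv") or die "test1.csv: $!";
open(T2, "<", "test2.csv") or die "test2.csv: $!";
@test1 = <T1>;
@test2 = <T2>;
close(T1);
close(T2);

# Compare only the lines that exist in both files
$count = scalar(@test1) < scalar(@test2) ? scalar(@test1) : scalar(@test2);
for ($i = 0; $i < $count; $i++)
{
    # Split each line on whitespace into string1, string2, string3
    @line1 = split(' ', $test1[$i]);
    @line2 = split(' ', $test2[$i]);
    if (($line1[0] ne $line2[0]) || ($line1[1] ne $line2[1]) || ($line1[2] ne $line2[2]))
    {
        # Each line still ends with its own newline
        print $test1[$i], $test2[$i];
    }
}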

This should work...
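
The original question asked for the differing records side by side, with the test1.csv columns on the left and the test2.csv columns on the right. A minimal sketch of that layout follows, assuming whitespace-separated rows with exactly three fields each and a positional (line-by-line) comparison. The helper name differing_rows, the column widths, and the changed value 'xyz' in the second sample file are made up for illustration; only the first sample file comes from the question.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: pairs rows by position and returns the pairs whose
# whitespace-separated fields differ, each as [ \@fields1, \@fields2 ].
sub differing_rows {
    my ($rows1, $rows2) = @_;
    my $count = min(scalar @$rows1, scalar @$rows2);
    my @diffs;
    for my $i (0 .. $count - 1) {
        my @f1 = split ' ', $rows1->[$i];
        my @f2 = split ' ', $rows2->[$i];
        # Joining on a tab lets one string comparison cover every field.
        push @diffs, [ \@f1, \@f2 ] if join("\t", @f1) ne join("\t", @f2);
    }
    return @diffs;
}

my @file1 = ("name(1)='abc'  n(1)=ab  num(1)=-20.123",
             "name(2)='def'  n(2)=cd  num(2)=-30.134");
my @file2 = ("name(1)='abc'  n(1)=ab  num(1)=-20.123",
             "name(2)='xyz'  n(2)=cd  num(2)=-30.134");   # made-up difference

for my $pair (differing_rows(\@file1, \@file2)) {
    # test1.csv fields first, then test2.csv fields (three fields assumed each)
    printf "%-15s %-10s %-16s %-15s %-10s %s\n",
        @{ $pair->[0] }, @{ $pair->[1] };
}
```

With real files, the two arrays would be filled from the file handles instead of literal strings.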
 

Author Comment

by:Begunix
ID: 19670899
I apologize for the delay in replying.


I have two files test1.csv and test2.csv
test1.csv has data as follows:
test1_num(1)='a'
test1_num(4)=' '
test1_num(23)='bc'
test1_num(100)='def'



test2.csv has data as follows:
mtest2_num(1)='abc'
mtest2_num(2)='bc'
mtest2_num(3)='c'
mtest2_num(4)='def'
...mtest2_num(23)='jk'
...mtest2_num(50)='lm'
...mtest2_num(90)='pr'
...mtest2_num(100)='st'


On comparing test1.csv with test2.csv,

the output should be
mtest2_num(1)='abc'
mtest2_num(4)='def'
mtest2_num(23)='jk'
mtest2_num(100)='st'


Thanks in advance
Begunix
 
LVL 39

Expert Comment

by:Adam314
ID: 19671796
How do you determine the output?
 
LVL 85

Accepted Solution

by:
ozo earned 2000 total points
ID: 19672040
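# Remember the value for each (N) index found in test1.csv
open F1,"<test1.csv" or die $!;
while( <F1> ){
    $n{$1}=$2 if /\((\d+)\)=(\S+)/
}
close F1;
# Print each test2.csv line whose index is in test1.csv with a different value
open F2,"<test2.csv" or die $!;
while( <F2> ){
    print if /\((\d+)\)=(\S+)/ && exists $n{$1} && $n{$1} ne $2;
}
close F2;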
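
As a quick check of the approach above, here is a minimal sketch that applies the same regex-and-hash logic to the sample records from the thread, held in memory rather than read from files. The helper name changed_lines is hypothetical, and the "..." markers from the sample listing are left out of the data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of @$new whose (N) index also
# appears in @$old, but with a different value after the "=".
sub changed_lines {
    my ($old, $new) = @_;
    my %seen;
    for my $line (@$old) {
        $seen{$1} = $2 if $line =~ /\((\d+)\)=(\S+)/;
    }
    my @changed;
    for my $line (@$new) {
        push @changed, $line
            if $line =~ /\((\d+)\)=(\S+)/
            && exists $seen{$1}
            && $seen{$1} ne $2;
    }
    return @changed;
}

# Sample records from the thread
my @test1 = ("test1_num(1)='a'",   "test1_num(4)=' '",
             "test1_num(23)='bc'", "test1_num(100)='def'");
my @test2 = ("mtest2_num(1)='abc'", "mtest2_num(2)='bc'",
             "mtest2_num(3)='c'",   "mtest2_num(4)='def'",
             "mtest2_num(23)='jk'", "mtest2_num(50)='lm'",
             "mtest2_num(90)='pr'", "mtest2_num(100)='st'");

# Prints the records for indices 1, 4, 23 and 100
print "$_\n" for changed_lines(\@test1, \@test2);
```

Note that (\S+) stops at the first whitespace character, so a value such as ' ' is captured only as its opening quote. That is enough for a same-or-different test on this data, but values containing spaces would need a wider pattern.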
 

Author Comment

by:Begunix
ID: 19672138
The output is determined by taking the patterns
num(1)
num(4)
num(23)
num(100)
from test1.csv and matching them against the data in test2.csv.

Thanks
Begunix
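
Read literally, this rule selects test2.csv lines by index alone, without comparing values. A minimal sketch of that reading follows, assuming the index is the number in parentheses. The helper name select_by_index is hypothetical. Unlike a value comparison, this version would also print a test2.csv line whose value happens to equal the one in test1.csv.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps the lines of @$data whose (N) index matches an
# index found in @$patterns, regardless of the values.
sub select_by_index {
    my ($patterns, $data) = @_;
    my %wanted;
    for my $line (@$patterns) {
        $wanted{$1} = 1 if $line =~ /\((\d+)\)/;
    }
    my @kept;
    for my $line (@$data) {
        push @kept, $line if $line =~ /\((\d+)\)/ && $wanted{$1};
    }
    return @kept;
}

my @patterns = ("num(1)", "num(4)", "num(23)", "num(100)");
my @data     = ("mtest2_num(1)='abc'", "mtest2_num(2)='bc'",
                "mtest2_num(3)='c'",   "mtest2_num(4)='def'",
                "mtest2_num(23)='jk'", "mtest2_num(50)='lm'",
                "mtest2_num(90)='pr'", "mtest2_num(100)='st'");

# Prints the records for indices 1, 4, 23 and 100
print "$_\n" for select_by_index(\@patterns, \@data);
```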