Jason_Sutiono
asked on
Comparing 2 files and display the difference
Hi Experts,
I'm trying to compare 2 sets of files to get the difference in another file using perl.
I have written the code but it takes forever to run as the input file is quite large.
Being new to perl, what I have written is structured inefficiently.
The code is attached in the file.
Here are my input files:
stkturnsalesv3.csv (file1)
11-APR-2012|002J011|3223|1|001036|W035|S
26-MAR-2012|0020L36|3264|1|0020L36|W007|S
10-APR-2012|0020L36|3264|1|0020L36|W007|S
02-APR-2012|002J011|3223|1|002J011|W007|S
stkturnsohv3.csv (file2)
TEL|002N2D0|S|3544|1|0|0|72|002N2D0|W007| |S
TPN|002N2D0|S|3430|6|0|0|83.3333|002N2D0|W007| |S
TRH|002N2D0|S|3528|0|9|0|72|002N2D0|W007| |S
TWG|002N2D0|S|3732|0|7|0|72|002N2D0|W007| |S
Basically, I am trying to find out the difference between file 1 and file 2 based on column 2 and column 3 of file 1.
In other words, if column 2 and column 3 in file 1 do not match column 2 and column 4 in file 2, the difference should be written to an output file in the following format:
Final Output
--------------------------------------------------
0020L36|3264|0|0|2|0|0
002J011|3223|0|0|2|0|0
--------------------------------------------------
(Col2 from file 1)|(Col3 from file 1)|Default0|Default0|Quantity Added|Default0|Default0
Note that only data from file 1 is displayed.
Also, if several lines in file 1 share the same column 2 and column 3, I need to add up the quantity (column 4) instead of repeating the lines. For example:
26-MAR-2012|0020L36|3264|1|0020L36|W007|S
10-APR-2012|0020L36|3264|1|0020L36|W007|S
Hence, the output would be:
0020L36|3264|0|0|2|0|0
Instead of:
0020L36|3264|0|0|1|0|0
0020L36|3264|0|0|1|0|0
Thank you in advance!!
Looking forward to the responses.
This is not homework, btw.
Regards,
Jason
#!/usr/bin/perl -s
$f1 = 'stkturnsalesv3.csv';
open FILE1, "<", $f1 or die "Could not open $f1: $!\n";
$f2 = 'stkturnsohv3.csv';
open FILE2, "<", $f2 or die "Could not open $f2: $!\n";
$outfile = 'test3.csv';
my @outlines;
my @line;
my %a;
my %qty;
my @temparray;
open(INFO, ">", $outfile) or die "$outfile $!";
# For every line of file 1, rescan file 2 from the top looking for a match.
foreach (<FILE1>) {
    my @col = split /\|/;
    $y = 0;
    $outer_text = $col[1].$col[2];
    $qty{$col[1]}{$col[2]} += $col[3];
    seek(FILE2,0,0);
    foreach (<FILE2>) {
        my @colb = split /\|/;
        $inner_text = $colb[1].$colb[3];
        if($outer_text eq $inner_text) {
            $y = 1;
            last;
        }
    }
    # No match in file 2: remember this line of file 1.
    if($y != 1) {
        push(@temparray, "$outer_text|$col[1]|$col[2]|$col[3]\n");
    }
}
# Add up the quantities of the unmatched lines per stock code and warehouse.
my %temparrayqty;
for (@temparray){
    my($col1e,$col2e,$col3e,$col4e)=split(/\|/);
    $temparrayqty{$col2e}{$col3e} += $col4e;
}
foreach $stockcode (sort keys %temparrayqty) {
    foreach $warehouse (keys %{$temparrayqty{$stockcode}})
    {
        my $c = "0|0|";
        my $d = "0|0";
        print INFO "$stockcode|";
        print INFO "$warehouse|";
        print INFO "$c";
        print INFO "$temparrayqty{$stockcode}{$warehouse}|";
        print INFO "$d", "\n";
    }
}
close INFO;
close FILE1;
close FILE2;
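The script above rereads file 2 for every line of file 1, which is why it slows down so badly on large inputs. One common way around that is to read file 2 once into a hash of keys and then make a single pass over file 1. The following is only a minimal sketch of that idea, not any poster's code: the sub names (load_stock_keys, sum_unmatched, format_rows) are hypothetical, and it assumes both files are pipe-delimited with the columns described in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read file 2 once and remember every "column 2|column 4" pair it contains.
sub load_stock_keys {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        my @col = split /\|/, $line;
        next unless @col >= 4;          # skip short or blank lines
        $seen{"$col[1]|$col[3]"} = 1;
    }
    return \%seen;
}

# Single pass over file 1: sum column 4 for every "column 2|column 3"
# pair that was not seen in file 2.
sub sum_unmatched {
    my ($fh, $seen) = @_;
    my %total;
    while (my $line = <$fh>) {
        chomp $line;
        my @col = split /\|/, $line;
        next unless @col >= 4;
        my $key = "$col[1]|$col[2]";
        next if $seen->{$key};          # present in file 2, so not a difference
        $total{$key} += $col[3];
    }
    return \%total;
}

# Format each total as stockcode|warehouse|0|0|quantity|0|0, sorted by key.
sub format_rows {
    my ($total) = @_;
    return map { "$_|0|0|$total->{$_}|0|0" } sort keys %$total;
}

# Hypothetical driver: runs only when two file names are given (file1 file2).
if (@ARGV == 2) {
    open my $stock, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
    my $seen = load_stock_keys($stock);
    close $stock;

    open my $sales, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $total = sum_unmatched($sales, $seen);
    close $sales;

    print "$_\n" for format_rows($total);
}
```

Saved under a name such as filediff.pl (also just an assumption), it could be run as `perl filediff.pl stkturnsalesv3.csv stkturnsohv3.csv > test3.csv`. With the sample data from the question, where none of the file 1 keys occur in file 2, it should print the two lines shown under Final Output. Each file is read once, so the running time grows with the combined size of the two files rather than with their product.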
NOTE: If you replace the words 'FILE1' and 'FILE2' in lines 5 and 8 in the above batch file code with '%~1' and '%~2' respectively, then you can start the batch file passing both filenames as parameters like this:
FILEDIFF file1 file2
(You will have to put double quotes around file1 and file2 if the filenames contain spaces).
See the modified code below (I've done it for you so don't worry):
@echo off
setlocal enabledelayedexpansion
rem Clear any diff[...] variables left over from a previous run.
for /f "tokens=1 delims==" %%A in ('2^>nul set diff[') do set "%%A="
rem %%A, %%B and %%C are columns 2, 3 and 4 (quantity) of the first file.
for /f "tokens=2,3,4 usebackq delims=|" %%A in ("%~1") do (
    set "flag="
    if not defined diff[%%A][%%B] set diff[%%A][%%B]=0
    for /f "tokens=2,4 usebackq delims=|" %%a in ("%~2") do if "%%A" equ "%%a" if "%%B" equ "%%b" set flag=1
    rem Add the quantity so that repeated keys are summed.
    if not defined flag set /a diff[%%A][%%B]+=%%C
    if !diff[%%A][%%B]! equ 0 set "diff[%%A][%%B]="
)
echo --------------------------------------------------
rem Print the summed quantity (%%C) instead of a fixed value.
for /f "tokens=2,3,4 delims=[]=" %%A in ('set diff[') do echo %%A^|%%B^|0^|0^|%%C^|0^|0
echo --------------------------------------------------
ASKER
Thanks ozo. You're the legend! That did the trick!!
ASKER
Thanks Paultomasi. I do use batch files occasionally, so this will come in handy someday. I can't seem to give points to assisted solutions now :(
Please confirm whether this is of any use to you and whether it returns the results you expect it to:
Copy and paste the code into Notepad and save it as say, 'FILEDIFF.BAT' in the same folder where your two files are. Then fire up a DOS session, navigate to where your files are and start the batch file like this:
FILEDIFF
NOTE: Don't forget to change 'FILE1' and 'FILE2' in lines 5 and 8 to the names of your own files.