egoselfaxis
Need help updating PHP script that counts duplicate emails in 2 uploaded CSV files
I've been asked to revise a custom PHP script that I developed for someone (about a year ago) that counts duplicate email addresses that are in 2 uploaded CSV files.
Right now, if there are duplicates in list #1, it reports those dupes. What I need to do is update it so that it only reports the duplicates that are BETWEEN the two lists - and NOT the duplicates that are within either list.
I thought that it'd be quick and easy, but I can't seem to figure it out. I'm thinking there might be some way to filter the dupes out of each of the CSV files individually before I merge them, perhaps?
My PHP code is below, and some sample CSV files are attached. Any help would be appreciated.
Thanks!
- Yvan
<?php
$csv1filename = $_FILES['csv1']['tmp_name'];
$ext1 = strtoupper(pathinfo($_FILES['csv1']['name'], PATHINFO_EXTENSION));
$csv2filename = $_FILES['csv2']['tmp_name'];
$ext2 = strtoupper(pathinfo($_FILES['csv2']['name'], PATHINFO_EXTENSION));
if ($ext1 == 'CSV' && $ext2 == 'CSV') {
    $csv1 = '/home/expert/public_html/dupes/csv1.csv';
    move_uploaded_file($csv1filename, $csv1);
    $csv2 = '/home/expert/public_html/dupes/csv2.csv';
    move_uploaded_file($csv2filename, $csv2);
    // CREATE THE EMPTY ARRAY
    $raw_array = array();
    $csv1 = file($csv1, FILE_IGNORE_NEW_LINES + FILE_SKIP_EMPTY_LINES);
    $csv2 = file($csv2, FILE_IGNORE_NEW_LINES + FILE_SKIP_EMPTY_LINES);
    $raw_array = array_merge($csv1, $csv2);
    // FUNCTION TO COUNT DUPLICATE EMAILS IN THE ARRAY
    function array_not_unique($raw) {
        $new = array_count_values($raw);
        foreach ($new as $key => $val) {
            if ($val < 2) unset($new[$key]);
        }
        return $new;
    }
    $common = array_not_unique($raw_array);
    // DELETE EXPORT CSV FILE IF IT ALREADY EXISTS
    if ( file_exists("/home/expert/public_html/dupes/duplicates.csv") ) {
        unlink ("/home/expert/public_html/dupes/duplicates.csv");
    }
    // OPEN FILE FOR WRITING AND ADD COLUMN HEADERS
    $fd = fopen("/home/expert/public_html/dupes/duplicates.csv", "a");
    fwrite($fd, "EMAIL\n");
    // DISPLAY THE NUMBER OF DUPES FOUND IN THE ARRAY
    $total = 0;
    echo "<pre style='text-align:left;line-height:45px;'>";
    foreach ($common as $x => $n) {
        $total++;
        // LOOP THROUGH DATA AND APPEND DUPLICATE EMAILS TO CSV FILE
        fwrite($fd, $x . "\n");
    }
    // CLOSE THE FILE
    fclose($fd);
    echo "</pre>";
    echo "<br />A Total of <strong style='text-align:center;background-color:#FBCB45;padding:3px;'>$total</strong> duplicate email addresses were found.<br /><br /><br />";
    echo "Click <strong><a href=\"duplicates.csv\" target=\"_blank\">here</a></strong> to download a CSV file which contains<br />the duplicate email addresses<br /><br /><br />";
    echo '<a href="/dupes/" style="font-size:18px;text-decoration:none;font-weight:bold;color:blue;"><< Back</a>';
    unlink('/home/expert/public_html/dupes/csv1.csv');
    unlink('/home/expert/public_html/dupes/csv2.csv');
} else {
    echo 'Improper file type uploaded.';
}
?>
Attachments: sample1.csv, sample2.csv
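The idea raised in the question (filter the dupes out of each list before merging) can be sketched roughly as follows. This is only an illustration and not the accepted answer from the thread: the helper name cross_list_duplicates and the sample addresses are made up, and it keeps the original script's behaviour of treating each whole line as an email and comparing lines exactly (no trimming or case folding).

```php
<?php
// Hypothetical helper (not part of the original script): returns the emails
// that appear in BOTH lists, ignoring repeats inside either list.
function cross_list_duplicates(array $list1, array $list2): array
{
    // Drop duplicates within each list first, as suggested in the question.
    $unique1 = array_unique($list1);
    $unique2 = array_unique($list2);

    // After per-list de-duplication, any value counted more than once in the
    // merged array must have come from both lists.
    $counts = array_count_values(array_merge($unique1, $unique2));

    // Keep only the emails seen more than once and return them as a plain list.
    return array_keys(array_filter($counts, function (int $n): bool {
        return $n > 1;
    }));
}

// Made-up sample data, for illustration only.
$list1 = ['a@example.com', 'a@example.com', 'b@example.com'];
$list2 = ['b@example.com', 'c@example.com', 'c@example.com'];

print_r(cross_list_duplicates($list1, $list2));
```

If something like this were dropped into the script above, it would presumably take the place of the array_merge() and array_not_unique() calls, e.g. $common = cross_list_duplicates($csv1, $csv2);. Note that it returns the emails as array values rather than keys, so the export loop would need to write the value instead of the key. array_values(array_intersect(array_unique($csv1), $csv2)) should give an equivalent result if a one-liner is preferred.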
Got a Neglected Question Alert on this one. Do you still want a hand or shall we close it out? Please let us know! Thanks, ~Ray
ASKER
Sorry guys -- my client went AWOL on this one. I'm just going to close this out and award some points.
Cheers,
- Yvan
ASKER
(My solution worked)