egoselfaxis

Need help updating PHP script that counts duplicate emails in 2 uploaded CSV files

I've been asked to revise a custom PHP script that I developed for someone (about a year ago) that counts duplicate email addresses in 2 uploaded CSV files.

Right now, if there are duplicates in list #1, it reports those dupes.  What I need to do is update it so that it only reports the duplicates that are BETWEEN the two lists - and NOT the duplicates that are within either list.  

I thought that it'd be quick and easy, but I can't seem to figure it out. I'm thinking there might be some way to filter the dupes out of each of the CSV files individually before I merge them, perhaps?
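The filtering idea above could look something like the following minimal sketch: collapse the repeats inside each list with array_unique(), then merge and count. Any address that still shows up twice after that has to be present in both lists. The helper name cross_list_dupes() and the sample addresses are made up for illustration, and the trim()/strtolower() normalization is an assumption about how addresses should be compared, not something the existing script does.

```php
<?php
// Hypothetical helper: returns the addresses that appear in BOTH lists,
// ignoring repeats inside either list.
function cross_list_dupes(array $list1, array $list2)
{
    // Assumption: addresses match regardless of case and surrounding whitespace.
    $normalize = function ($email) {
        return strtolower(trim($email));
    };

    // Collapse duplicates within each list so each address counts once per list.
    $unique1 = array_unique(array_map($normalize, $list1));
    $unique2 = array_unique(array_map($normalize, $list2));

    // After de-duplication, a count above 1 means "present in both lists".
    $counts = array_count_values(array_merge($unique1, $unique2));

    return array_keys(array_filter($counts, function ($n) {
        return $n > 1;
    }));
}

// Made-up sample data: "a" repeats only in list 1, "c" only in list 2.
$list1 = ['a@example.com', 'a@example.com', 'b@example.com'];
$list2 = ['B@example.com', 'c@example.com', 'c@example.com'];

print_r(cross_list_dupes($list1, $list2));
```

With this sample data only b@example.com is reported, since the repeated a@ and c@ addresses each live in a single list.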

My PHP code is below, and some sample CSV files are attached.  Any help would be appreciated.

Thanks!
- Yvan


<?php		
	
	$csv1filename = $_FILES['csv1']['tmp_name'];

	$ext1 = strtoupper(pathinfo($_FILES['csv1']['name'], PATHINFO_EXTENSION));							
	
	$csv2filename = $_FILES['csv2']['tmp_name'];
	
	$ext2 = strtoupper(pathinfo($_FILES['csv2']['name'], PATHINFO_EXTENSION));		

	if ($ext1 == 'CSV' && $ext2 == 'CSV')	{	

		$csv1 = '/home/expert/public_html/dupes/csv1.csv';					

		move_uploaded_file($csv1filename, $csv1);

		$csv2 = '/home/expert/public_html/dupes/csv2.csv';	

		move_uploaded_file($csv2filename, $csv2);					

		// CREATE THE EMPTY ARRAY
		
		$raw_array = array();				
		
		$csv1 = file($csv1, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
		
		$csv2 = file($csv2, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

		$raw_array = array_merge($csv1, $csv2); 					
		
		// FUNCTION TO COUNT DUPLICATE EMAILS IN THE ARRAY
		
		function array_not_unique($raw) {
			$new = array_count_values($raw);				
			foreach ($new as $key => $val) {
			   if ($val < 2) unset($new[$key]);
			}				
			return $new;				
		}		

		$common = array_not_unique($raw_array);

		// DELETE EXPORT CSV FILE IF IT ALREADY EXISTS	

		if ( file_exists("/home/expert/public_html/dupes/duplicates.csv") ) {
			unlink ("/home/expert/public_html/dupes/duplicates.csv");
		}	
		
		// OPEN FILE FOR WRITING AND ADD COLUMN HEADERS
		
		$fd = fopen("/home/expert/public_html/dupes/duplicates.csv", "a");
		
		fwrite($fd, "EMAIL\n");				

		// DISPLAY THE NUMBER OF DUPES FOUND IN THE ARRAY
		
		$total = 0;
		
		echo "<pre style='text-align:left;line-height:45px;'>";					
		
		foreach ($common as $x => $n) {
			
			$total++;							
			
			// LOOP THROUGH DATA AND APPEND DUPLICATE EMAILS TO CSV FILE					

			fwrite($fd, $x . "\n");	
			
		}	
		
		// CLOSE THE FILE
		
		fclose($fd);
		
		echo "</pre>";
		
		echo "<br />A Total of <strong style='text-align:center;background-color:#FBCB45;padding:3px;'>$total</strong> duplicate email addresses were found.<br /><br /><br />";
		
		echo "Click <strong><a href=\"duplicates.csv\" target=\"_blank\">here</a></strong> to download a CSV file which contains<br />the duplicate email addresses<br /><br /><br />";	
						
		echo '<a href="/dupes/" style="font-size:18px;text-decoration:none;font-weight:bold;color:blue;">&lt;&lt; Back</a>';
		
		unlink('/home/expert/public_html/dupes/csv1.csv');
		
		unlink('/home/expert/public_html/dupes/csv2.csv');							

	} else {

		echo 'Improper file type uploaded.';
		
	}	
	
?>	
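
An alternative to merging, sketched here under the assumption that the two lists are compared exactly as read (no case or whitespace normalization): keep the arrays separate and intersect them. The hard-coded arrays are hypothetical stand-ins for what file() returns in the script above, and $between is a made-up variable name.

```php
<?php
// Hypothetical stand-ins for the arrays returned by file() in the script above.
$csv1 = ['a@example.com', 'a@example.com', 'b@example.com', 'd@example.com'];
$csv2 = ['b@example.com', 'c@example.com', 'd@example.com', 'd@example.com'];

// array_intersect() keeps every value of $csv1 that also occurs in $csv2;
// array_unique() then drops any repeats carried over from $csv1, and
// array_values() re-indexes the result from 0.
$between = array_values(array_unique(array_intersect($csv1, $csv2)));

// Number of addresses shared by the two lists.
$total = count($between);

print_r($between);
echo "Total: $total\n";
```

Here the repeated a@ address (list 1 only) and c@ address (list 2 only) are ignored, while b@ and d@ are each reported once. The resulting array could then be looped over in place of $common when writing the export file.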


sample1.csv
sample2.csv
ASKER CERTIFIED SOLUTION
egoselfaxis

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Got a Neglected Question Alert on this one.  Do you still want a hand or shall we close it out?  Please let us know!  Thanks, ~Ray
egoselfaxis

ASKER

Sorry guys -- my client went AWOL on this one.  I'm just going to close this out and award some points.

Cheers,
- Yvan
(My solution worked)