Yuri Boyz asked:

Delete Duplicate records from Database in PHP

I am reading image files from a folder and saving the results in a database. The folder contains multiple duplicate images (the same image under different filenames).
I compute the MD5 hash of each file so that I can mark it as a duplicate.

When saving the records, how can I skip the duplicate files so that they are not inserted into the database?

This is my code:

<?php

require_once 'db.php';

$dir = '../../../../../sites/default/files/';


$exts = array('gif', 'jpg', 'jpeg', 'png');

if ( is_dir($dir) ):

    // Set up the Iterator
    
	$directory = new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS); // skip the "." and ".." directory entries
    $iterator = new RecursiveIteratorIterator($directory);


    // Clear the table
    $conn->query("TRUNCATE TABLE searchindex_images");
    // Prepare the INSERT query   
	$insert = $conn->prepare("INSERT INTO searchindex_images (dir, img_names, lang,file_date) VALUES (? , ? , ? , ?)");
   
	$insert->bind_param("ssss", $path, $img, $language,$myhash);   	
	
	
	
$foldersToSkip = ['css','default_images','docs'];

foreach ($iterator as $file):
    $folders = explode( DIRECTORY_SEPARATOR, $file->getPath() ); // split the path into an array.	

    if ( ! empty( array_intersect($foldersToSkip, $folders) ) ) { // see if any folder is matched with $foldersToSkip
        // let's skip this folder
		print "<br>Folder skipped";
        continue;
    }
	else{
		
			$path = $file->getPath().'/'; // Get the directory path
			$img = $file->getFilename(); // Get the filename
			$file_date = $file->getMTime(); // Get the last modification time
			$file_date_c = $file->getCTime(); // Get the inode change time
			$myhash = md5_file($path.$img); // MD5 of the file contents, used to detect duplicates

			//// This code only works on Linux OS. It checks for Arabic characters in filenames /////
			// $img must be set before this check; otherwise the test runs against the previous file's name
			$match = preg_match( "/\p{Arabic}/u", $img ); // see if we have Arabic characters
			$language = $match ? "Arabic" : "English"; // if $match is true, set $language to Arabic, otherwise set it to English
									////// CODE ENDS /////////

            $insert->execute(); // Run the query
            echo "<br>{$path}"; // Output the path
			echo "<br>{$img}<br>"; // Output the filename					
			
			echo "<br>FILE HASH = ".$myhash;



			echo "<br>===================================";	
	}

    


endforeach;   

else:

    echo "Directory not found!";

endif;				
				
?>


Martyn Spencer

Why not store the MD5 of the file in the database as well? Then, before inserting, you can compare the MD5 of the new record against the existing records. You can also ask MySQL not to insert a record if a duplicate already exists; see this question: https://stackoverflow.com/questions/1361340/how-to-insert-if-not-exists-in-mysql
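
Since the script truncates the table at the start of each run, every existing record comes from the current run, so the comparison can also be done in PHP by remembering the hashes already seen. The following is only a minimal sketch under that assumption; `uniqueByHash()` is a hypothetical helper name and is not part of the original script.

```php
<?php
// Sketch only: keep the first file found for each distinct MD5 hash.
// uniqueByHash() is a hypothetical helper, not part of the original script.

/**
 * @param iterable<string> $paths Full paths of the files to index.
 * @return array<string, string>  Map of MD5 hash => first path found with that hash.
 */
function uniqueByHash(iterable $paths): array
{
    $seen = [];
    foreach ($paths as $path) {
        $hash = md5_file($path);
        if ($hash === false) {
            continue; // file could not be read, so there is nothing to index
        }
        if (isset($seen[$hash])) {
            continue; // same content under another filename: skip it
        }
        $seen[$hash] = $path; // first occurrence of this content
    }
    return $seen;
}
```

In the original loop the same check would amount to skipping `$insert->execute()` whenever `$myhash` is already a key of a `$seen` array, and recording it otherwise. For the MySQL-side variant, a UNIQUE index on the column that holds the hash, combined with `INSERT IGNORE`, makes the server skip duplicate rows; which column to index is an assumption that depends on the actual table schema.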