• Status: Solved
  • Priority: Medium
  • Security: Public

Find missing files using PHP

I have a folder "e:\my documents\myfolder" with about 400000 files.

The folder has only 2 types of file extensions:
1. *.txt
2. *.key

Each .txt file must have a corresponding .key file

e.g. abc.txt should also have a file called abc.key in the same folder

But I know there are some .txt files that do not have a .key file.

I need to find those .txt files and move them to a folder called "NoLink".

Please help me with a PHP script that can handle this for such a large file base.
Asked by: nainil
3 Solutions
 
GhostScripter commented:
Morning!

I assume the following directory structure for the script below:

+ base dir where the script is
++ pairs => here are all *.txt and *.key files
++ nolink => here should all *.txt files without a *.key partner go

<?php

$_ORIG_DIR = 'pairs';
$_NOLINK_DIR = 'nolink';

// walk over all *.txt files in the pairs dir
foreach(glob($_ORIG_DIR.'/*.txt') as $file) {
	// extract only the filename without directory and extension
	// (basename also copes with names that contain extra dots)
	$filename = basename($file, '.txt');

	// check if the *.txt file has a *.key partner
	// if no partner is found the file is moved to the nolink dir
	if(file_exists($_ORIG_DIR.'/'.$filename.'.key')) {
		echo "Pair found<br />";
	} else {
		rename($file, $_NOLINK_DIR.'/'.$filename.'.txt');
		echo "File <b>$filename.txt</b> moved<br />";
	}
}
?>


This script is just a quick shot. It worked for me on my LAMP system.

Greets,
GhostScripter
 
Marco Gasi (Freelancer) commented:
You can try this untested code:
<?php
define('DS', DIRECTORY_SEPARATOR);
$filelist = array();
$fromDir = 'your_original_dir';
$toDir = 'your_original_dir/NoLink';

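if ($handle = opendir($fromDir)) {
    while (($file = readdir($handle)) !== false) {
        // skip hidden entries and subdirectories (paths are relative to $fromDir)
        if (substr($file, 0, 1) != "." && !is_dir($fromDir . DS . $file)) {
            $info = pathinfo($file);
            // a .txt file is an orphan when no .key file with the same name exists
            if (isset($info['extension']) && $info['extension'] == 'txt'
                && !file_exists($fromDir . DS . $info['filename'] . '.key')) {
                $filelist[] = $file;
            }
        }
    }
    closedir($handle);
}
if (!is_dir($toDir)) mkdir($toDir);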
foreach ($filelist as $file){
	copy($fromDir . DS . $file, $toDir . DS . $file);
	unlink($fromDir . DS . $file);
}


If you want to test it first, comment out the unlink statement and echo the results to check that it works, like this:
<?php
define('DS', DIRECTORY_SEPARATOR);
$filelist = array();
$fromDir = 'your_original_dir';
$toDir = 'your_original_dir/NoLink';

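if ($handle = opendir($fromDir)) {
    while (($file = readdir($handle)) !== false) {
        // skip hidden entries and subdirectories (paths are relative to $fromDir)
        if (substr($file, 0, 1) != "." && !is_dir($fromDir . DS . $file)) {
            $info = pathinfo($file);
            // a .txt file is an orphan when no .key file with the same name exists
            if (isset($info['extension']) && $info['extension'] == 'txt'
                && !file_exists($fromDir . DS . $info['filename'] . '.key')) {
                $filelist[] = $file;
            }
        }
    }
    closedir($handle);
}
echo "<pre>";
var_dump($filelist);
echo "</pre>";

if (!is_dir($toDir)) mkdir($toDir);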
foreach ($filelist as $file){
	copy($fromDir . DS . $file, $toDir . DS . $file);
//	unlink($fromDir . DS . $file);
}


Unfortunately, I can't test it right now.

Cheers
 
Insoftservice commented:
Please try this one:
<?php



$path   = "/var/tmp/";
$nolink ="/var/tmp/NoLink/";

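$data = getDirList($path);
$notavail = array();
if(is_array($data))
{
	foreach($data as $val)
	{
	  // only look at names that end in .txt
	  if(substr($val, -4) === ".txt")
	  {
	    // swap the extension to get the expected partner name
	    $replace = substr($val, 0, -4).'.key';
		if(!file_exists($path.$replace))
		{
		  // remember the orphan, then move it to the NoLink folder
		  $notavail[] = $val;
		  echo $path.$val." => ".$nolink.$val."<br />";
		  if(copy($path.$val,$nolink.$val))
		  {
		    unlink($path.$val);
		  }
		}
	  }
	}
}
echo "<pre>";print_r($data);echo "</pre>";
echo "<pre>";print_r($notavail);echo "</pre>";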



	// returns a sorted list of the plain files in $dirpath, or false on failure
	function getDirList($dirpath)
		{
			$tmp_arr_dirlist = array();
			if (is_dir($dirpath) && ($dh = opendir($dirpath)))
			{
				while (($file = readdir($dh)) !== false)
				{
					// skip the dot entries and any subdirectories
					if ($file != "." && $file != ".." && !is_dir($dirpath.'/'.$file))
					{
						$tmp_arr_dirlist[] = $file;
					}
				}
				closedir($dh);
				sort($tmp_arr_dirlist);
				return $tmp_arr_dirlist;
			}
			return false;
		}

?>

 
käµfm³d 👽 commented:
Yet another  : )

<?php

    function compareFilenames ($file1, $file2) { return strcmp(pathinfo($file1, PATHINFO_FILENAME), pathinfo($file2, PATHINFO_FILENAME)); }

    $txts = glob('test/*.txt');
    $keys = glob('test/*.key');

    $missing_key = array_udiff($txts, $keys, "compareFilenames");

    foreach ($missing_key as $txt)
    {
        $fileParts = pathinfo($txt);
        $newFilename = 'NoLink/' . $fileParts['basename'];
        echo "<li>" . $txt . "</li>";
        copy($txt, $newFilename);
        unlink($txt);
    }

?>

 
Ray Paseur commented:
"with about 400000 files ... such a large file base"
That is not really a very large number of files, but just to be on the safe side, consider calling set_time_limit(1) inside the loop. Each call restarts the timeout counter, which may help you avoid a timeout in your script.
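
A minimal sketch of that suggestion, not a tested production script. The function name moveOrphans and its $secondsPerFile parameter are hypothetical names chosen for illustration; the sketch assumes the target folder can be created by the script and that every file can be handled within the per-file limit.

```php
<?php
// Hypothetical helper (not from any post above): moves every .txt file in
// $fromDir that has no .key partner into $toDir and returns the moved names.
function moveOrphans($fromDir, $toDir, $secondsPerFile = 1)
{
    $moved = array();
    $entries = scandir($fromDir);
    if ($entries === false) {
        return $moved; // directory could not be read
    }
    if (!is_dir($toDir)) {
        mkdir($toDir, 0777, true);
    }
    foreach ($entries as $entry) {
        // Restart the timeout counter on every iteration, so the limit
        // applies to a single file rather than to the whole run.
        set_time_limit($secondsPerFile);

        // Only plain files whose name ends in .txt are candidates.
        if (substr($entry, -4) !== '.txt' || !is_file($fromDir . '/' . $entry)) {
            continue;
        }
        // The partner has the same name with a .key extension.
        $key = substr($entry, 0, -4) . '.key';
        if (!file_exists($fromDir . '/' . $key)
            && rename($fromDir . '/' . $entry, $toDir . '/' . $entry)) {
            $moved[] = $entry;
        }
    }
    return $moved;
}
```

With the folder from the question this could be called as moveOrphans('e:/my documents/myfolder', 'e:/my documents/myfolder/NoLink'). If a single move might take longer than a second, for example on a slow network drive, a larger $secondsPerFile is probably the safer choice.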

Best to all, ~Ray
 
Marco Gasi (Freelancer) commented:
I just tested my code and this is the working one:

<?php

define('DS', DIRECTORY_SEPARATOR);
$filelist = array();
$fromDir = 'actualDir';
$toDir = 'actualDir' . DS . 'NoLink';

if ($handle = opendir($fromDir)) {
  while (($file = readdir($handle)) !== false) {
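    // resolve paths against $fromDir, not the script's working directory
    if ($file != "." && $file != ".." && !is_dir($fromDir . DS . $file)) {
      $info = pathinfo($file);
      echo $file . "<br>";
      if (isset($info['extension']) && $info['extension'] == 'txt' && !file_exists($fromDir . DS . $info['filename'] . '.key')) {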
        $filelist[] = $file;
        echo " only txt file is $file <br>";
      }
    }
  }
  closedir($handle);
}
echo "<pre>";
var_dump($filelist);
echo "</pre>";
if (!file_exists($toDir)) mkdir($toDir);
foreach ($filelist as $file) {
  copy($fromDir . DS . $file, $toDir . DS . $file);
//  unlink($fromDir . DS . $file);
}


Uncomment the unlink line to delete the orphan .txt files from the original folder once the copies look right.
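
The same "check first, then delete" idea can also be sketched with a flag instead of a commented-out line. This is only an illustration under assumptions: the function name processOrphans and its $dryRun parameter are hypothetical, and the sketch indexes the .key names in an array first so that each .txt file is checked with an array lookup rather than a separate file_exists call.

```php
<?php
// Hypothetical helper (not from any post above): returns the .txt files in
// $fromDir that have no .key partner. Files are only moved into $toDir
// when $dryRun is false.
function processOrphans($fromDir, $toDir, $dryRun = true)
{
    $names = scandir($fromDir);
    if ($names === false) {
        return array(); // directory could not be read
    }

    // Index the base names of all .key files for fast lookups.
    $keys = array();
    foreach ($names as $name) {
        if (pathinfo($name, PATHINFO_EXTENSION) === 'key') {
            $keys[pathinfo($name, PATHINFO_FILENAME)] = true;
        }
    }

    // Collect every plain .txt file whose base name is not in the index.
    $orphans = array();
    foreach ($names as $name) {
        if (pathinfo($name, PATHINFO_EXTENSION) === 'txt'
            && is_file($fromDir . '/' . $name)
            && !isset($keys[pathinfo($name, PATHINFO_FILENAME)])) {
            $orphans[] = $name;
        }
    }

    if (!$dryRun) {
        if (!is_dir($toDir)) {
            mkdir($toDir, 0777, true);
        }
        foreach ($orphans as $name) {
            rename($fromDir . '/' . $name, $toDir . '/' . $name);
        }
    }
    return $orphans;
}
```

Calling processOrphans($fromDir, $toDir) only reports the orphans; passing false as the third argument moves them as well.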

Cheers