  • Status: Solved
  • Priority: Medium
  • Security: Public

Search file names in specific directories and list them by relevance (most keyword matches first).

How can I search for file names inside some specified directories and list them so that the names containing the most search keywords come first? Here "file" means .htm and .php files only (folder names excluded).

Here is an example. Suppose I have several files in three different directories (say A, B and C). The files are: 1) My name is Don.htm, 2) Sweet Berry.php, 3) Unnamed Prisoner.htm, 4) Change File Names.php, 5) Don the Dog.htm, 7) Name of the Sweet Dog.php

Now someone searches with the keywords: SWEET DOG.

The result should appear something like this (With most keyword match first):

7) Name of the Sweet Dog.php,
2) Sweet Berry.php
5) Don the Dog.htm

Limiting the number of results per page (paginating the results) would be appreciated.

The shorter the code the better :) for a beginner like me. I have already tried the Bn_Search code from Planet Source Code (title: Meta Search Magic Pro!) to get the concept, but it did not work for me.

Waiting for your expert comments.

Regards


Asked by: softpro2k

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
Can there be subdirectories?
I assume the search should be case-insensitive?

softpro2k (Author) commented:
Yes, there can be subdirectories, and the search is case-insensitive.

I can do the file name / keyword matching myself. What I need is the order in which the results should be displayed.

I hope to receive your comments. Regards.


Guy Hengel [angelIII / a3] (Billing Engineer) commented:
Please check out the script below. It uses a recursive function that returns more data than is actually needed here, but I use that sort of function for other things as well (keeping it reusable).

The code below the function then takes the files from that array, compares each one against the keyword list, and fills the $arr_matches array, which is indexed by match count. The ksort() function sorts that array ascending by match count, and the final loop builds the listing in reverse order so the highest match counts come first.

Have fun.

<?php
 
 function scan_directory_recursively($directory, $filter=FALSE)
 {
     $directory_tree = array();
 
     // if the path has a slash at the end we remove it here
     if(substr($directory,-1) == '/')
     {
         $directory = substr($directory,0,-1);
     }
  
     // if the path is not valid or is not a directory ...
     if(!file_exists($directory) || !is_dir($directory))
     {
         // ... we return false and exit the function
         return FALSE;
  
     // ... else if the path is readable
     }elseif(is_readable($directory))
     {
         // we open the directory
         $directory_list = opendir($directory);
  
         // and scan through the items inside
         while (FALSE !== ($file = readdir($directory_list)))
         {
             // if the filepointer is not the current directory
             // or the parent directory
             if($file != '.' && $file != '..')
             {
                 // we build the new path to scan
                 $path = $directory.'/'.$file;
  
                 // if the path is readable
                 if(is_readable($path))
                 {
                     // we split the new path by directories
                     $subdirectories = explode('/',$path);
  
                     // if the new path is a directory
                     if(is_dir($path))
                     {
                         // add the directory details to the file list
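                         // array_merge() is used because "+" would drop entries
                         // whose numeric keys already exist in $directory_tree
                         $tmp = scan_directory_recursively($path, $filter);
                         if ($tmp !== FALSE) { $directory_tree = array_merge($directory_tree, $tmp); }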
                     // if the new path is a file
                     }elseif(is_file($path))
                     {
                         // get the file extension (the part after the last dot);
                         // pathinfo() avoids passing the result of explode() to end(), which expects a variable
                         $extension = pathinfo($path, PATHINFO_EXTENSION);
  
                         // if there is no filter set or the filter is set and matches
                         if($filter === FALSE || $filter == $extension)
                         {
                             // add the file details to the file list
                             $directory_tree[] = array(
                                 'path'      => $path,
                                 'name'      => end($subdirectories),
                                 'extension' => $extension,
                                 'size'      => filesize($path),
                                 'kind'      => 'file');
                         }
                     }
                 }
             }
         }
         // close the directory
         closedir($directory_list); 
  
         // return file list
         return $directory_tree;
  
     // if the path is not readable ...
     }else{
         // ... we return false
        return FALSE;    
    }
 }
 // ------------------------------------------------------------
 
 
  $res = scan_directory_recursively(".");  
  $words = explode(" ", "shadow photo _M _J" );
 
  $arr_matches = array();
  
 
  foreach ( $res as $f )
  {
    print "{$f['path']}:<br>";
 
    $count_matches = 0;
    foreach($words as $keyword)
    {
      if (strpos(strtolower($f['path']), strtolower($keyword)) !== FALSE)
      { $count_matches ++; }
    }
 
    if ($count_matches>0)
    {
      if (!isset($arr_matches[$count_matches])) $arr_matches[$count_matches] = array();
      $arr_matches[$count_matches][] = $f;
    }
  } 
 
  ksort($arr_matches);
 
 
  print "<br><hr><br>";
  foreach ($words as $keyword)
  {
    print "$keyword<br>";
  }
  
  print "<br><hr><br>";
 
  $listing = "";  
  foreach($arr_matches as $count => $results )
  {
    $listing_tmp = "$count matches:<br>";
 
    foreach($results as $file)
    {
      $listing_tmp .= "{$file['path']}<br>";
 
    }
 
    $listing = "$listing_tmp<hr>$listing";
  }
 
  print $listing;
 
  print "<br><hr><br>";
  print_r($arr_matches);
 
  print "<br><hr><br>";
  print_r($res);
 
?>
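
The ranking step can also be tried on its own, without touching the file system. What follows is only a minimal sketch: it assumes the file names from the question are held in a plain array, and the function name rank_by_keyword_matches is hypothetical, not part of the script above. It uses krsort() so the highest match count comes first, instead of ksort() plus building the listing in reverse.

```php
<?php
// Hypothetical helper: groups names by how many keywords they contain,
// highest count first. Names without any match are dropped.
function rank_by_keyword_matches(array $names, array $words)
{
    $buckets = array();
    foreach ($names as $name) {
        $count = 0;
        foreach ($words as $keyword) {
            // stripos() is the case-insensitive variant of strpos()
            if ($keyword !== '' && stripos($name, $keyword) !== false) {
                $count++;
            }
        }
        if ($count > 0) {
            $buckets[$count][] = $name;
        }
    }
    krsort($buckets); // sort by match count, descending
    return $buckets;
}

// File names taken from the example in the question.
$names = array(
    'My name is Don.htm',
    'Sweet Berry.php',
    'Change File Names.php',
    'Don the Dog.htm',
    'Name of the Sweet Dog.php',
);

$ranked = rank_by_keyword_matches($names, explode(' ', 'SWEET DOG'));
foreach ($ranked as $count => $group) {
    print "$count matches:\n";
    foreach ($group as $name) {
        print "  $name\n";
    }
}
```

With the keywords SWEET DOG, "Name of the Sweet Dog.php" lands in the 2-match group, and "Sweet Berry.php" and "Don the Dog.htm" in the 1-match group.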

 
softpro2k (Author) commented:
angelIII,

I have tested your code. It works, and the result listing is very close to what I wanted.

As I am a beginner, may I ask you to modify the code so that it:

1) Displays the exact matches first, then the most keyword matches (which your code already does).
2) Displays only the file name (without folder paths).
3) Does not display the file extension.
4) Searches only specific kinds of files (like .txt, .dat, .php3, etc.).
5) Splits the results into several pages (pagination), based on the number of results I want to show per page.

Maybe I am asking for a lot. But ..... you see.... :)

Have a merry Christmas.

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
2) It displays only the file name (without folder paths)
3) It DONT display file extension.
change:
      $listing_tmp .= "{$file['path']}<br>";
into:
      $listing_tmp .= pathinfo($file['filename'], PATHINFO_FILENAME) . "<br>";

and:
$directory_tree[] = array(
                                 'path'      => $path,
                                 'name'      => end($subdirectories),
                                 'extension' => $extension,
                                 'size'      => filesize($path),
                                 'kind'      => 'file');
into:
$directory_tree[] = array(
                                 'path'      => $path,
                                 'filename'      => $file,
                                 'name'      => end($subdirectories),
                                 'extension' => $extension,
                                 'size'      => filesize($path),
                                 'kind'      => 'file');


4) It search in some specific kind of files (Like .txt, .dat, .php3, etc)
Only for one extension? Then just pass it (without the dot, e.g. 'php') as the second parameter to the scan_directory_recursively function.
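
For several extensions at once (the request mentions .txt, .dat and .php3), one possible approach is to compare against a list instead of a single value. This is only a sketch under assumptions: the helper name has_allowed_extension is hypothetical, and the extensions are given without the leading dot.

```php
<?php
// Hypothetical helper: true when $file_name ends in one of the allowed
// extensions (given without the dot), compared case-insensitively.
function has_allowed_extension($file_name, array $allowed)
{
    $extension = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    return in_array($extension, array_map('strtolower', $allowed), true);
}

$allowed = array('txt', 'dat', 'php3');
var_dump(has_allowed_extension('notes.TXT', $allowed));       // allowed extension
var_dump(has_allowed_extension('Sweet Berry.php', $allowed)); // "php" is not in the list
```

Inside scan_directory_recursively, the test "$filter == $extension" could then be swapped for a call to this helper, with $filter holding the array of extensions.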



softpro2k (Author) commented:
Thanks. Everything is covered except the first part of point 1 (part a) and point 5.


1) a) It displays the exact matches first (the code does not do this), THEN b) the most keyword matches (which your code does).

5) Split the results into several pages (pagination), based on the number of results I want to show.

I hope to receive your suggestion.

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
1) How do EXACT matches differ from most keyword matches?
Can you explain that, please?

5) Let's do that after the other stuff. Not everything at once...

softpro2k (Author) commented:
Here is an example of exact match. Suppose the Keywords are: SWEET DOG.          

The result should appear something like this

7) Name of the Sweet Dog.php, // This is exact match Sweet+Dog (Full sentence Match)
2) Sweet Berry.php // One Keyword found
5) Don the Dog.htm // One Keyword found

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
So, what does not work with my suggestion?
For 7) the match count is 2; for 2) and 5) it is 1. So 7) is (or should be) displayed before 2) and 5), and as you say, that is the case.
So perhaps you used the wrong example to show what goes wrong?
Because so far, I don't see what goes wrong...

softpro2k (Author) commented:
Yes, I used the wrong examples. Here are three file names and the expected order of the results:

7) Name of the Sweet Dog.php, // This is Full sentence Match (Sweet+Dog). Two Keyword found.
2) My Dog don't like Sweet Berry.php // Here also Two Keyword found
5) Don the Dog.htm // One Keyword found

7 should be shown before 2: both contain two keyword matches, but 7 contains the full phrase. I think for this we need one extra search on the whole phrase, without using explode.

Maybe like this:
$words = "Sweet Dog"; // first search for the occurrence of the whole phrase

and after this search:
$words = explode(" ", "Sweet Dog" );

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
change:
 foreach($words as $keyword)
    {
      if (strpos(strtolower($f['path']), strtolower($keyword)) !== FALSE)
      { $count_matches ++; }
    }

into:

 foreach($words as $keyword)
    {
      if (strpos(strtolower($f['path']), strtolower($keyword)) !== FALSE)
      { $count_matches ++; }
    }
  if (strpos(strtolower($f['path']), strtolower($words)) !== FALSE)
  { $count_matches ++; }
 

softpro2k (Author) commented:
Thanks, I will let you know after testing.

softpro2k (Author) commented:
angelIII,

I have tested the code, but I am getting an error saying something like "Array to String Conversion Error".

I am a beginner, so if possible, please post the fresh code after updating the required chunks. (You can remove the code that displays extra information I don't need.)

One more thing: can you please add comments to your code, especially in the second part given below, so that I can understand it:






  $res = scan_directory_recursively(".");  
  $words = explode(" ", "shadow photo _M _J" );
  $arr_matches = array();
  
 foreach ( $res as $f )  {
    print "{$f['path']}:<br>";
    $count_matches = 0;
    foreach($words as $keyword)
    {if (strpos(strtolower($f['path']), strtolower($keyword)) !== FALSE)
      { $count_matches ++; }  }
 
    if ($count_matches>0)  {
      if (!isset($arr_matches[$count_matches]))             $arr_matches[$count_matches] = array();
      $arr_matches[$count_matches][] = $f;     }  } 
  ksort($arr_matches);
 
  print "<br><hr><br>";
  foreach ($words as $keyword)
  {   print "$keyword<br>";  }
  
  print "<br><hr><br>";
 
  $listing = "";  
  foreach($arr_matches as $count => $results )
  {   $listing_tmp = "$count matches:<br>";
     foreach($results as $file)
    {   $listing_tmp .= "{$file['path']}<br>";  }
 
    $listing = "$listing_tmp<hr>$listing";
  }
  print $listing;
  print "<br><hr><br>";
  print_r($arr_matches);
  print "<br><hr><br>";
  print_r($res);
 ?>

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
>I have tested the code. But i am getting an error saying something like this "Array to String Conversion Error"

Fixed below. The error was in the following line:
    if (strpos(strtolower($f['path']), strtolower($words)) !== FALSE)

because $words is actually an array, and strtolower() expects a string.
  // fetch the list of files:
  $res = scan_directory_recursively(".");  
 
  // define the array of "words" to check in the file names
  $words = explode(" ", "shadow photo _M _J" );
 
  // prepare the array for the matches
  $arr_matches = array();
  
  // for each file found in the folder, do:
 foreach ( $res as $f )   
 {
    //print "{$f['path']}:<br>";
    //set the match counter to 0: 
    //the counter of words that this file name matches
    $count_matches = 0;
    // for each of the keywords, do:
    foreach($words as $keyword)
    {
      // if the filename contains the keyword, increase the match counter
      if (strpos(strtolower($f['filename']), strtolower($keyword)) !== FALSE)
      { $count_matches ++; }
    }
    // if the whole keyword phrase also appears in the filename, add 1 more to the match counter;
    // that ensures such filenames rank above files that merely contain all the words
    if (strpos(strtolower($f['filename']), strtolower(join(' ', $words))) !== FALSE)
    { $count_matches ++; }
  
    // if the number of matches is higher than 0, add this file to the matches array
    if ($count_matches>0)  
    {
      // as we store the matches array indexed by the match counters
      // we have to first ensure that that index position is already initialized
      if (!isset($arr_matches[$count_matches]))             
        $arr_matches[$count_matches] = array();
 
      // add the file object (which is an array with the info) to the match found array
      $arr_matches[$count_matches][] = $f;   
     }
  } 
 
  // sort the array of the matches by the key, which is the match count actually
  ksort($arr_matches);
 
  print "<br><hr><br>";
  foreach ($words as $keyword)
  {   print "$keyword<br>";  }
  
  print "<br><hr><br>";
 
  $listing = "";  
  foreach($arr_matches as $count => $results )
  {   $listing_tmp = "$count matches:<br>";
     foreach($results as $file)
    {   $listing_tmp .= "{$file['path']}<br>";  }
 
    $listing = "$listing_tmp<hr>$listing";
  }
  print $listing;
  print "<br><hr><br>";
  print_r($arr_matches);
  print "<br><hr><br>";
  print_r($res);
 ?>
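
Point 5 (pagination) is left open in the replies above. A minimal sketch of one way to approach it, assuming the ranked matches have already been flattened into a single list (best match first) and that the page number arrives as ?page=N in the query string; the function name paginate and the sample list are hypothetical.

```php
<?php
// Hypothetical helper: returns the slice of $items for a 1-based page number.
function paginate(array $items, $page, $per_page)
{
    $total_pages = max(1, (int) ceil(count($items) / $per_page));
    // clamp the requested page into the valid range
    $page = min(max(1, (int) $page), $total_pages);
    return array(
        'page'        => $page,
        'total_pages' => $total_pages,
        'items'       => array_slice($items, ($page - 1) * $per_page, $per_page),
    );
}

// Stand-in for the ranked, flattened result list (best match first).
$flat = array('Name of the Sweet Dog', "My Dog don't like Sweet Berry", 'Don the Dog');

// Assumption: the requested page comes from the query string, defaulting to 1.
$requested = isset($_GET['page']) ? $_GET['page'] : 1;
$result = paginate($flat, $requested, 2);

print "Page {$result['page']} of {$result['total_pages']}\n";
foreach ($result['items'] as $name) {
    print "$name\n";
}
```

array_slice() does the actual cutting; the clamping keeps an out-of-range page number from producing an empty page.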

softpro2k (Author) commented:
AngelIII,

Thank you very much for your time in explaining the code. The comments are useful to me.

But the results are not displayed in the desired order. Here is the example again. The desired order is:

7) Name of the Sweet Dog.php, // This is Full sentence Match (Sweet+Dog). Two Keyword found.
2) My Dog don't like Sweet Berry.php // Here also Two Keyword found
5) Don the Dog.htm // One Keyword found

7 should be shown before 2: both contain two keyword matches, but 7 contains the exact phrase "SWEET DOG". Please have a look and test the code.

Regards.

Guy Hengel [angelIII / a3] (Billing Engineer) commented:
>But the result is not being displayed as desired. I am giving here the example again.
>The desired order of showing result is like :

well, I created the 3 files you mention, and the output for me is:

3 matches:
./check/Name of the Sweet Dog.php
2 matches:
./check/My Dog don't like Sweet Berry.php
1 matches:
./check/Don the Dog.htm

so, that is what you are requesting?
ie, what is wrong?

softpro2k (Author) commented:
Hello angelIII,

Sorry for the late reply. I was out of Kolkata for some time.

Yes, I have also tested the code, and it worked great.

Thanks for your cooperation.

Best of Luck.