Create an array from text files in a directory using regex (PHP)

I have a directory on a server with some text files containing data. I need to check whether each file's text contains the word "planner_name", and build an array of the planner names found in the files, essentially linking each planner name to its text file.


So if text file p1944.1.txt contains the text

planner_name|s:12:"Matt Robbins";

and text file p2010.3.txt does not contain the text planner_name

the script should create an array like:

[0] => Array
        (
            [p1944.1.txt] => Matt Robbins
            [p2010.3.txt] =>
            [p2414.4.txt] => Other planner name
        )


Here is the code I have, but it relies on one massive string, which is not reliable if a text file does not contain "planner_name":
<?php

//need trailing slash
$pathToDirectory = "path/to/dir/";

if ($handle = opendir('path/to/dir')) {
    //echo "Directory handle: $handle\n";
    //echo "Files:\n";

    /* This is the correct way to loop over the directory. */
    while (false !== ($file = readdir($handle))) {
        //$file_array = array($file);

        $list_of_files .= $file;

        $file_array = array($list_of_files);
    }

    closedir($handle);
}

//get all filenames into uniform string
$files_as_string = implode("", $file_array);

//replacing new lines of the massive array in the right order
$remove_this   = ".txt";
$and_put_this_in = '.txt55';

// Processes first so all line returns are removed.
$fileText_almost_there = str_replace($remove_this, $and_put_this_in, $files_as_string);

$remove_this_from_fileText_almost_there   = "..";
$and_put_this_in_from_fileText_almost_there = '';

// Processes first so all line returns are removed.
$fileText_pass2 = str_replace($remove_this_from_fileText_almost_there, $and_put_this_in_from_fileText_almost_there, $fileText_almost_there);

$remove_this_from_pass2   = ".txt55.";
$and_put_this_in_from_pass2 = '.txt55';

// Processes first so all line returns are removed.
$fileText_pass3 = str_replace($remove_this_from_pass2, $and_put_this_in_from_pass2, $fileText_pass2);

//done w getting into string
$file_array_uniform = explode("55", $fileText_pass3);

//filter out empty values - remove empty elements
$file_array_uniform_filtered = array_filter($file_array_uniform);

foreach ($file_array_uniform_filtered as &$file_number_extracted) {
    echo " $file_number_extracted <br>";

    $each_file_in_directory  = "$pathToDirectory"."/"."$file_number_extracted";

    $string_of_all_files .= file_get_contents($each_file_in_directory);
}

//replacing new lines of the massive array in the right order
$newcflines   = "\r";
$replace_newcflines = '';

// Processes \r\n's first so all line returns are removed.
$string_of_all_files_almost_there = str_replace($newcflines, $replace_newcflines, $string_of_all_files);

$newlines   = "\n";
$replace_newlines = '';

$string_of_all_files_2 = str_replace($newlines, $replace_newlines, $string_of_all_files_almost_there);

$planner_name   = "planner_name|s:";
$planner_name_separator = 'planner_name_separator_f_f_f';

$string_of_all_files_3 = str_replace($planner_name, $planner_name_separator, $string_of_all_files_2);

$filenumber_search   = "filenumber_2|s:";
$filenumber_search_separator = 'filenumber_search_separator_f_f_f';

$string_of_all_files_4 = str_replace($filenumber_search, $filenumber_search_separator, $string_of_all_files_3);

$filenumber_search_regex = '#filenumber_search_separator_f_f_f#';
$find_planner_pattern = '#planner_name_separator_f_f_f#';

preg_match($find_planner_pattern, substr($string_of_all_files_4,20), $matches_of_planner_name_separator_f_f_f, PREG_OFFSET_CAPTURE);

//preg_match_all("$find_planner_pattern","$string_of_all_files_4",$matches_of_planner_name_separator_f_f_f,PREG_SET_ORDER);

echo $matches_of_planner_name_separator_f_f_f[0][0] . ", " . $matches_of_planner_name_separator_f_f_f[0][1] . "\n";

echo $matches_of_planner_name_separator_f_f_f[1][0] . ", " . $matches_of_planner_name_separator_f_f_f[1][1] . "\n";

echo "<br>array:<br>";
print_r($matches_of_planner_name_separator_f_f_f);

//$array_of_massive_string_split = split("s:", $string_of_all_files_2);

//$array_of_massive_string_split = split("planner_name", $string_of_all_files_2);

//$key_of_planner_name_in_array_of_massive_string_split = array_search('planner_name', $array); // $key = 2;

//echo "file STRING:<br>";
echo $string_of_all_files_4;

?>


FYI, the text files look like this:

----
....filenumber_2|s:7:"p1944.1";distr|s:7:"
 
 ";other_var|s:4:"Home";.....ator";planner_name|s:12:"Matt Robbins ";moretext.......

----

mattpiercey Asked:

Beverley Portlock Commented:
Try the code below, which produces an array of file objects, each of which can hold multiple planner names. You can then run a foreach over the object array and pull out only the records whose planner array count is non-zero.

You will need to set the variable $pathToDirectory to the correct folder and make sure it has a trailing slash.

<?php

class FileList {

     public $filename;
     public $plannerArray;


     function __construct( $f ) {
          $this->filename = $f;
          $this->plannerArray = array();
     }


     function add( $name ) {
          $this->plannerArray [] = $name;
     }

}





     $fileArray = array();

     $pathToDirectory = "/path/to/dir/with/trailing/slash/";
     $pattern    = '#planner_name\|[^"]+"([^"]+)#s';

     // Get list of files in directory
     //
     if ( $fileList = opendir( $pathToDirectory ) ) {
          while ( false !== ( $aFile = readdir( $fileList ) ) ) {

               // Skip directory entries - files only
               //
               if ( filetype( $pathToDirectory . $aFile ) != "file" )
                    continue;

               // Read the file into a string and create a fileObject to store the details
               //
               $fileObject = new FileList( $aFile );
               $fileText = file_get_contents( $pathToDirectory . $aFile );

               // Search for test string
               //
               if ( preg_match_all( $pattern, $fileText, $matches ) > 0 ) {
                    foreach( $matches[1] as $aPlannerName )
                         $fileObject->add( $aPlannerName );
               }

               // Put the file objects into the filearray
               //
               $fileArray [] = $fileObject;

          }

          closedir( $fileList );
     }
     else
          echo "Unable to open directory";



print_r( $fileArray );


// List only those with planner names
//
foreach( $fileArray as $aFile )
     if ( count( $aFile->plannerArray ) > 0 ) 
          echo "{$aFile->filename} has planners in it<br/>";
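
If the goal is the flat array from the question (file name => planner name, with an empty value when the file has no planner_name), the same pattern can feed a plain associative array. This is only a sketch under assumptions: the function name plannerNamesByFile is hypothetical, only the first match per file is kept, and scandir() replaces the opendir()/readdir() loop for brevity.

```php
<?php
// Hypothetical helper (not part of the answer above): maps each file name
// in $pathToDirectory to its first planner name, or '' when none is found.
function plannerNamesByFile($pathToDirectory)
{
    $pattern = '#planner_name\|[^"]+"([^"]+)#s';
    $result  = array();

    // scandir() returns false if the directory cannot be read.
    $entries = scandir($pathToDirectory);
    if ($entries === false) {
        return $result;
    }

    foreach ($entries as $entry) {
        $fullPath = rtrim($pathToDirectory, '/') . '/' . $entry;

        // Skips ".", ".." and subdirectories - files only.
        if (!is_file($fullPath)) {
            continue;
        }

        $text = (string) file_get_contents($fullPath);

        // Files without planner_name still get a key, with an empty value.
        $result[$entry] = preg_match($pattern, $text, $m) === 1 ? trim($m[1]) : '';
    }

    return $result;
}
```

Calling print_r( plannerNamesByFile( $pathToDirectory ) ) should then print one entry per file, including the files that lack a planner name.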

Ray Paseur Commented:
This sort of thing -- planner_name|s:12:"Matt Robbins"; -- appears to be a serialized array or something like that.  Is that the case?
Ray Paseur Commented:
This seemed to work OK for me.  Install it and try it on your system and let me know if you get what you expect.
<?php // RAY_temp_mattpiercey.php
error_reporting(E_ALL);
echo "<pre>";

// THE PATH TO OUR CURRENT WORKING DIRECTORY
$dir = getcwd();
$str = 'planner_name';

// AN OPTIONAL DIRECTORY NAME AND SEARCH STRING FROM THE URL ARGUMENT
if (isset($_GET["d"])) $dir = $_GET["d"];
if (isset($_GET["s"])) $str = $_GET["s"];

// APPEND A SLASH IF NEEDED
if ($dir[strlen($dir)-1] != DIRECTORY_SEPARATOR) $dir .= DIRECTORY_SEPARATOR;


// GET THE LIST OF FILES FROM THE DIRECTORY
if ($dh = opendir($dir))
{
    $filenames = array();
    while (($filename = readdir($dh)) !== FALSE)
    {
        $file_ext = strtoupper(pathinfo($filename, PATHINFO_EXTENSION));
        if ($file_ext != 'PHP') continue;
        $filenames[$filename] = $filename;
    }
    closedir($dh);
}
else
{
    die("UNABLE TO READ $dir");
}


// PROCESS THE FILES
foreach ($filenames as $filename)
{
    $txt = file_get_contents($dir . $filename);

    // IS THE STRING PRESENT?
    if (strpos($txt, $str) === FALSE)
    {
        $filenames[$filename] = NULL;
    }

    // TRY TO RECOVER THE NAME
    else
    {
        $arr = explode($str, $txt);
        $arr = explode('"', $arr[1]);
        $filenames[$filename] = $arr[1];
    }
}

// SHOW THE WORK PRODUCT
ksort($filenames);
print_r($filenames);


mattpiercey Author Commented:
Ray, to answer your question, the arrays in the text files are not serialized.
The text files did, however, get to where they are with session_encode:

fputs($sessionfile, session_encode( ) );
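
Since the files come from session_encode(), string values appear as name|s:LENGTH:"value"; (assuming the default session serialize handler, which matches the sample shown in the question). A rough sketch of reading one value by its declared length, so that quotes or semicolons inside the value do not cut it short; the function name sessionStringValue is hypothetical and only string ("s:") entries are handled:

```php
<?php
// Hypothetical helper: returns the string stored under $key in
// session_encode()-style text, or null when the key is not present.
function sessionStringValue($text, $key)
{
    // \b stops "planner_name" from matching inside e.g. "other_planner_name".
    $pattern = '/\b' . preg_quote($key, '/') . '\|s:(\d+):"/';

    if (preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }

    $start  = $m[0][1] + strlen($m[0][0]); // first byte after the opening quote
    $length = (int) $m[1][0];              // byte length declared in s:LENGTH:

    return substr($text, $start, $length);
}
```

A null return makes it easy to tell a file that lacks planner_name apart from one whose planner name is an empty string.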


Upon running your code (a PHP file named search_for_planner_in_fold.php) in the directory where the text files reside, I get this printed:

Array
(
    [search_for_planner_in_fold.php] => d
)

My concern with your proposed solution is the possibility of a file lacking "planner_name" being embedded into the array.
Beverley Portlock Commented:
"My concern with your proposed solution is the possibility of a file lacking "planner_name" being embedded into the array."

The solution I posted (at the top) will cope with files missing a 'planner name'.
Ray Paseur Commented:
It would appear that you put my script into the folder where you had the ***.txt files.  I had to test it on files named ***.php and neglected to correct the file extension.  Please make the appropriate change to line 24 (the check against 'PHP') and run it again, thanks.
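
As an alternative to comparing extensions by hand, glob() can select the *.txt files directly. A minimal sketch, assuming a hypothetical helper name listTextFiles and a directory path with no glob metacharacters (such as [ or *) in it; note that the match is case-sensitive on most Linux systems, so files ending in .TXT would be skipped.

```php
<?php
// Hypothetical helper: returns the base names of the *.txt files in $dir.
function listTextFiles($dir)
{
    // Normalise to exactly one trailing separator.
    $dir   = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR;
    $paths = glob($dir . '*.txt');

    // glob() returns false on error.
    if ($paths === false) {
        return array();
    }

    // Keep regular files only, then strip the directory part.
    return array_values(array_map('basename', array_filter($paths, 'is_file')));
}
```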
mattpiercey Author Commented:
Thanks bportlock, works great. And if I wanted to echo the name of the planner in your last line:

foreach( $fileArray as $aFile )
     if ( count( $aFile->plannerArray ) > 0 )
          echo "The File number {$aFile->filename} has the planner: ??? in it<br/>";

any suggestions?
Beverley Portlock Commented:
The planner's name is held in an array in case there is more than one of them, so one filename can have many planner names. Thus

echo "The File number {$aFile->filename} has the planner: ??? in it<br/>";

could return multiple results. This is where the class comes to the rescue. We can add a method that returns either one name or multiple ones. We extend the class like this (UNTESTED):


class FileList {

     public $filename;
     public $plannerArray;


     function __construct( $f ) {
          $this->filename = $f;
          $this->plannerArray = array();
     }


     function add( $name ) {
          $this->plannerArray [] = $name;
     }


     function getPlanners() {
          return implode(", ", $this->plannerArray );
     }

}



Then your last line becomes

echo "The File number {$aFile->filename} has the planner: {$aFile->getPlanners()} in it<br/>";
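
Put together, the pieces above can be exercised on their own. This sketch repeats the extended class so that it runs standalone; the describeFile helper and the "no planner" wording are assumptions added for illustration, not part of the suggestion above.

```php
<?php
class FileList {

    public $filename;
    public $plannerArray = array();

    function __construct( $f ) {
        $this->filename = $f;
    }

    function add( $name ) {
        $this->plannerArray [] = $name;
    }

    function getPlanners() {
        return implode( ", ", $this->plannerArray );
    }
}

// Hypothetical helper: builds the sentence for one file object.
function describeFile( FileList $aFile ) {
    if ( count( $aFile->plannerArray ) === 0 ) {
        return "The File number {$aFile->filename} has no planner in it";
    }

    // A method call is allowed inside {...} in a double-quoted string.
    return "The File number {$aFile->filename} has the planner: {$aFile->getPlanners()} in it";
}
```

Inside the existing foreach, echo describeFile( $aFile ) . "<br/>"; would then print one line per file.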
mattpiercey Author Commented:
Your suggestion worked perfectly, again. Thank you bportlock for the technical skill and speedy response.

- matt
Question has a verified solution.
