match 2 textfiles and output with gaps line per line

hi there

I have 2 text files. One is the master URL list with, say, 500 URLs; the second file has, say, X number of URLs.

Is there a way to compare the 2 and then output the results to a third file, in line with the master file?

for example

Master file

http://www.domain1.com/?=1234
http://domain2.com/?=11243
http://domain3.biz/?=123543
http://www.domain4.com/?=1234
http://www.domain5.com/?=11243
http://www.domain6.net/?=123543

Match file

http://www.domain1.com/?=435456
http://www.domain3.biz/?=45656423    <-------- notice the www: this one is not an exact match to the master, so ignore it on output too
http://www.domain4.com/?=3453454
http://www.domain6.net/?=3453453

Output file should look like this

http://www.domain1.com/?=435456
                                                      <------ because the match file has no URL that exactly matches the master URL up to the =, a gap is left instead

http://www.domain4.com/?=3453454

http://www.domain6.net/?=3453453

BUT you will notice that the match file has different endings, so match only up to the = sign and ignore the rest.

If you can do this you're tip top :0)

regards
playstat
 
Diablo84 Commented:
Try this:


<?php
$path = $_SERVER['DOCUMENT_ROOT']."/Playstat/";
$master_file = $path."master.txt";
$match_file = $path."match.txt";
$output_file = $path."output.txt";

// Split on \n or \r\n; trim() drops the trailing newline so no empty entry is parsed
$master_array = preg_split('/\r?\n/', trim(file_get_contents($master_file)));
$match_array = preg_split('/\r?\n/', trim(file_get_contents($match_file)));

$output = array();

foreach ($master_array as $master_item) {
 $parse = parse_url($master_item);
 // Anything that is not a full URL (e.g. a blank line) is copied through unchanged
 if (!isset($parse['scheme'], $parse['host'])) {
  $output[] = $master_item;
  continue;
 }
 $parse = $parse['scheme']."://".$parse['host']."/";
 $check = false;
 foreach($match_array as $match_item) {
  // The match line must start with the master's scheme and host
  if (strpos($match_item,$parse) === 0) {
   $output[] = $match_item;
   $check = true;
   break; // one output line per master line
  }
 }
 // No match found: fall back to the master URL
 if ($check == false) $output[] = $master_item;
}

$output = implode("\n",$output);

$handle = fopen($output_file,"w") or die("Cannot open file");
fwrite($handle, $output) or die("Cannot write to file");
fclose($handle);

echo "Analysis Complete!";
?>
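
The script above compares on scheme and host only. The question asks to match on everything up to the = sign, so here is a rough sketch of that variant, using a lookup table keyed on the prefix. The helper names prefix_key() and merge_by_prefix() are made up for illustration, and it assumes every URL of interest contains an "=" (lines without one are compared whole). It works on arrays so it can be tried with the sample data before wiring it up to files.

```php
<?php
// Hypothetical helper: the part of a line up to and including the first "=".
// Lines without "=" are returned whole.
function prefix_key($url) {
    $url = trim($url);
    $pos = strpos($url, '=');
    return ($pos === false) ? $url : substr($url, 0, $pos + 1);
}

// Hypothetical helper: returns one output line per master line.
// A master line is replaced by the match line sharing its prefix,
// otherwise the master line itself is kept.
function merge_by_prefix(array $master, array $match) {
    $lookup = array();
    foreach ($match as $line) {
        $line = trim($line);
        if ($line === '') continue;
        $key = prefix_key($line);
        if (!isset($lookup[$key])) $lookup[$key] = $line; // first occurrence wins
    }
    $out = array();
    foreach ($master as $line) {
        $line = trim($line);
        $key = prefix_key($line);
        $out[] = isset($lookup[$key]) ? $lookup[$key] : $line;
    }
    return $out;
}

// Sample data from the question
$master = array(
    'http://www.domain1.com/?=1234',
    'http://domain2.com/?=11243',
    'http://domain3.biz/?=123543',
    'http://www.domain4.com/?=1234',
    'http://www.domain5.com/?=11243',
    'http://www.domain6.net/?=123543',
);
$match = array(
    'http://www.domain1.com/?=435456',
    'http://www.domain3.biz/?=45656423',
    'http://www.domain4.com/?=3453454',
    'http://www.domain6.net/?=3453453',
);

echo implode("\n", merge_by_prefix($master, $match)), "\n";
```

Building the lookup once means each master line costs a single array access instead of a pass over the whole match file, which should not matter for 500 URLs but keeps the loop simple. The www.domain3.biz line is ignored because its prefix differs from the master's domain3.biz prefix.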
 
ThG Commented:
I hope they are in some sort of order. Anyway, the following isn't tested and the output is NOT exactly what you want, but you get the idea of the matching process:

<?php
$fd = fopen("master.txt", "r");
$fm = fopen("match.txt", "r"); // read mode: "w" would truncate match.txt

// Read one line, strip the line ending and everything from "=" onwards
function fetch($f) {
  $tmp = fgets($f);
  if ($tmp === FALSE) return  FALSE;
  $tmp = rtrim($tmp, "\r\n");
  $tmp = preg_replace('/=.*/', '', $tmp); // remove "=" and everything after it
  return $tmp;
}

$next = fetch($fm); // next match
while (($line = fetch($fd)) !== FALSE) {
  if ($line === $next) {
    print $line . "\n";
    $next = fetch($fm);
  }
  else
    print "\n"; // not found, so don't advance match.txt
}

fclose($fd);
fclose($fm);
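
One thing to watch with the loop above: a match line that never appears in the master (like the www.domain3.biz one in the sample) stops the match pointer from advancing. Here is a rough sketch of the same in-order idea on arrays that scans forward past such lines and keeps the full match line in the output. The names before_equals() and merge_in_order() are hypothetical, and it still assumes both files list their URLs in the same relative order.

```php
<?php
// Hypothetical helper: the text before the first "=" (whole line if there is none).
function before_equals($line) {
    $parts = explode('=', rtrim($line, "\r\n"), 2);
    return $parts[0];
}

// Hypothetical helper: one output line per master line.
// A matched master line yields the full match line, an unmatched one yields a gap.
function merge_in_order(array $master, array $match) {
    $out = array();
    $i = 0;                 // first match line not yet used
    $n = count($match);
    foreach ($master as $line) {
        $key = before_equals($line);
        // Scan forward from $i for the same prefix, skipping match lines
        // that have no counterpart in the master list.
        $j = $i;
        while ($j < $n && before_equals($match[$j]) !== $key) {
            $j++;
        }
        if ($j < $n) {
            $out[] = rtrim($match[$j], "\r\n");
            $i = $j + 1;
        } else {
            $out[] = '';    // gap: nothing in the match list for this master line
        }
    }
    return $out;
}

// Sample data from the question
$master = array(
    'http://www.domain1.com/?=1234',
    'http://domain2.com/?=11243',
    'http://domain3.biz/?=123543',
    'http://www.domain4.com/?=1234',
    'http://www.domain5.com/?=11243',
    'http://www.domain6.net/?=123543',
);
$match = array(
    'http://www.domain1.com/?=435456',
    'http://www.domain3.biz/?=45656423',
    'http://www.domain4.com/?=3453454',
    'http://www.domain6.net/?=3453453',
);

echo implode("\n", merge_in_order($master, $match)), "\n";
```

With the sample data this gives the domain1, domain4 and domain6 match lines on master lines 1, 4 and 6, and blank lines for the other three. If the two files are not in the same order, a lookup keyed on the prefix is the safer route.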
 
belcalan Commented:
Hi playstat,

Try this code:
<?php

$file1 = 'file1.txt';
$file2 = 'file2.txt';
$output_file = 'output.txt';

if (file_exists($file1) && file_exists($file2)) {
    $handle = fopen($file1, "rb");
    $f1_cont = '';
    while (!feof($handle)) {
      $f1_cont .= fread($handle, 8192);
    }
    fclose($handle);

    $handle = fopen($file2, "rb");
    $f2_cont = '';
    while (!feof($handle)) {
      $f2_cont .= fread($handle, 8192);
    }
    fclose($handle);

    // Split into lines, tolerating Windows (\r\n) line endings
    $f1_array = preg_split('/\r?\n/', $f1_cont);
    $f2_array = preg_split('/\r?\n/', $f2_cont);

    $output = '';
    $fho = fopen($output_file, 'w+');

    // Walk the master file only, so the output stays in line with it
    foreach ($f1_array as $line) {
        // Compare on everything before the "?"; use the whole line if there is none
        $length = strpos($line, '?');
        if ($length === false) {
            $length = strlen($line);
        }
        $string1 = substr($line, 0, $length);

        if (check_array_2($string1)) {
            $output .= $string1."\n";
            print $string1."\n";
        }
        else {
            $output .= "\n";
        }
    }
    fwrite($fho, $output);
    fclose($fho);
    fclose($fho);
}

function check_array_2($string) {
    global $f2_array;
   
    foreach ($f2_array as $row) {
        $row = rtrim($row);
        $length = strpos($row, '?');
        if ($length === false) { // no "?" in this row: compare the whole line
            $length = strlen($row);
        }
        $string2 = substr($row, 0, $length);                
       
        if ($string2 == $string) {
            return true;
        }
    }
    return false;
}
?>

 
playstat (Author) Commented:
belcalan, it does not work; all I get is a blank output file.

regards
 
playstat (Author) Commented:
Nope, my bad. All it does is chop off everything on and after the = sign.

Plus the output of the file is not line by line.
 
playstat (Author) Commented:
Plus there are no gaps in the output for the matches that are not found.
 
playstat (Author) Commented:
Tell you what, scratch the gaps: just insert the original master URLs in the output file if you can, please.
 
Diablo84 Commented:
Hi playstat,

I thought I would have a quick look at your question now, and I think this renders the output you are looking for; I tested it with the sample data you provided. You may need to configure the first four variables (hopefully they are self-explanatory).


<?php
$path = $_SERVER['DOCUMENT_ROOT']."/";
$master_file = $path."master.txt";
$match_file = $path."match.txt";
$output_file = $path."output.txt";

$master_array = explode("\n",file_get_contents($master_file));
$match_array = explode("\n",file_get_contents($match_file));

$output = array(); // initialise so implode() below always receives an array

foreach($match_array as $match_item) {
 $parse = parse_url($match_item);
 if (!isset($parse['scheme'], $parse['host'])) continue; // skip blank or malformed lines
 $parse = $parse['scheme']."://".$parse['host']."/";
 foreach ($master_array as $master_item) if (strstr($master_item,$parse)) $output[] = $match_item;
}

$output = implode("\n",$output);

$handle = fopen($output_file,"w") or die("Cannot open file");
fwrite($handle, $output) or die("Cannot write to file");
fclose($handle);

echo "Analysis Complete!";
?>


If there are any problems or the output isn't quite right, post back and I'll check in the morning.

Best Wishes

|)iablo
 
Diablo84 Commented:
Sorry, I forgot about the gaps. A little modification and...

<?php
$path = $_SERVER['DOCUMENT_ROOT']."/Playstat/";
$master_file = $path."master.txt";
$match_file = $path."match.txt";
$output_file = $path."output.txt";

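$master_array = explode("\n",file_get_contents($master_file));
$match_array = explode("\n",file_get_contents($match_file));

$output = array(); // initialise so implode() below always receives an array

foreach($match_array as $match_item) {
 $match_item = rtrim($match_item, "\r"); // tolerate Windows line endings
 $parse = parse_url($match_item);
 $check = false;
 if (isset($parse['scheme'], $parse['host'])) {
  $parse = $parse['scheme']."://".$parse['host']."/";
  foreach ($master_array as $master_item) {
   if (strstr($master_item,$parse)) {
    $output[] = $match_item;
    $check = true;
    break; // one output line per match line
   }
  }
 }
 // implode() adds the newline, so an empty string gives exactly one blank line
 if ($check == false) $output[] = "";
}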

$output = implode("\n",$output);

$handle = fopen($output_file,"w") or die("Cannot open file");
fwrite($handle, $output) or die("Cannot write to file");
fclose($handle);

echo "Analysis Complete!";
?>
 
playstat (Author) Commented:
Diablo, can you change it so that, if the URL in the match file is not an exact match up to the = in the master file, the master URL is inserted into the output file instead of a gap?

Master file

http://www.domain1.com/?=1234
http://domain2.com/?=11243
http://domain3.biz/?=123543
http://www.domain4.com/?=1234
http://www.domain5.com/?=11243
http://www.domain6.net/?=123543

Match file

http://www.domain1.com/?=435456
http://www.domain3.biz/?=45656423    <-------- notice the www: this one is not an exact match to the master, so ignore it on output too
http://www.domain4.com/?=3453454
http://www.domain6.net/?=3453453

Output file should look like this

http://www.domain1.com/?=435456       <-- matched to master successful
http://domain2.com/?=11243                  <---- master file url, as there was no exact match in the match file
http://domain3.biz/?=123543                  <----- master file url
http://www.domain4.com/?=3453454      <-- matched to master successful
http://www.domain5.com/?=11243         <------master file url
http://www.domain6.net/?=3453453       <-- matched to master successful


regards

Question has a verified solution.
