Reorder matched text file from master text file, part 2

hi there

http://www.experts-exchange.com/Web/Web_Languages/PHP/Q_21185604.html       

Answer Part B only if the Part A question works correctly.

Would it be possible to correct a URL if the second text file has a wrong URL?

Master file:

http://www.domain1.com/master?=345
http://domain2.com/master?=hdf
http://domain3.com/correct?=fds
http://domain4.com/?=rwe
http://domain5.com=234
http://www.domain6.com=342

Match file, with the right domain and the wrong URL format but the right ID at the end:

http://www.domain1.com/wrongtype?=234
http://domain3.com/correct?=365
http://domain4.com/oopps?=856
http://domain2.com/master?=465

CSV output, comma-separated:

Column A                              Column B

http://www.domain1.com/master?=345    http://www.domain1.com/master?=234
http://domain2.com/master?=hdf        http://domain2.com/master?=465
http://domain3.com/correct?=fds       http://domain3.com/correct?=365
http://domain4.com/?=rwe              http://www.domain4=856
http://domain5.com=234                http://domain5.com=234
http://www.domain6.com=342            http://www.domain6.com=342

thanks



playstat Asked:

Roonaan Commented:
The first example actually does the following:
It matches lines where the domain and the ID match; the URL path does not necessarily have to match:

<?php
$aMaster = file('master.txt');
$aSecond = file('second.txt');

$aMatch = array();

foreach($aSecond as $iIndex => $sUrl)
{
  //replace http://www.domain1.com/wrongtype?=234 into http://www.domain1.com=234
  $sUrl = trim($sUrl);
  if(($iEq = strpos($sUrl, "=")) !== false && ($iSlash=strpos($sUrl,'/',7)) !== false)
  {
    $sUrlShort = substr($sUrl,0,$iSlash).substr($sUrl,$iEq);
  }
  else
    $sUrlShort = $sUrl;
 
  $aSecond[$iIndex] = array('u' => trim($sUrl), 's' => trim($sUrlShort));
}

foreach($aMaster as $sUrl)
{
  $sUrl = trim($sUrl);
  if(($iEq = strpos($sUrl, "=")) !== false && ($iSlash=strpos($sUrl,'/',7)) !== false)
  {
    $sUrlShort = substr($sUrl,0,$iSlash).substr($sUrl,$iEq);
  }
  else
    $sUrlShort = $sUrl;

  foreach($aSecond as $iSIndex => $aUrl)
  {
    if(strpos($sUrlShort, $aUrl['s']) !== false)
    {
      $aMatch[] = array($sUrl, $aUrl['u']);
      unset($aSecond[$iSIndex]);
      break;
    }
  }
}

// Output matches

foreach($aMatch as $iIndex => $aRow)
{
  if($iIndex > 0) echo "\n";
  echo $aRow[0].','.$aRow[1];
}
?>
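
The shortening step is repeated in both loops above, so it could be pulled into one helper. A minimal sketch, assuming a hypothetical function name shortenUrl() (not part of the code above) and assuming every URL starts with "http://". Unlike the loops above, it also requires the slash to come before the "=", which is an added assumption:

```php
<?php
// Hypothetical helper (the name is an assumption): reduce a URL to
// "http://host" + "=ID", dropping the path in between.
function shortenUrl($sUrl)
{
    $sUrl = trim($sUrl);
    $iEq = strpos($sUrl, '=');
    // Skip the 7 characters of "http://" when looking for the first path slash.
    $iSlash = strlen($sUrl) > 7 ? strpos($sUrl, '/', 7) : false;
    if ($iEq !== false && $iSlash !== false && $iSlash < $iEq) {
        return substr($sUrl, 0, $iSlash) . substr($sUrl, $iEq);
    }
    return $sUrl; // no path or no ID: leave the URL unchanged
}

echo shortenUrl('http://www.domain1.com/wrongtype?=234'), "\n"; // http://www.domain1.com=234
echo shortenUrl('http://domain5.com=234'), "\n";                // http://domain5.com=234
```

Both loops could then call shortenUrl($sUrl) instead of repeating the strpos()/substr() block.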

This example matches on domain and URL but not on the ID (as your Column A / Column B example did). It assumes $aMaster, $aSecond and $aMatch are set up as in the first example:

foreach($aSecond as $iIndex => $sUrl)
{
  $sUrl = trim($sUrl);

  if(($iEq = strpos($sUrl, "=")) !== false)
  {
    $sUrlShort = substr($sUrl,0,$iEq);
  }
  else
    $sUrlShort = $sUrl;
 
  $aSecond[$iIndex] = array('u' => trim($sUrl), 's' => trim($sUrlShort));
}

foreach($aMaster as $sUrl)
{
  $sUrl = trim($sUrl);
  if(($iEq = strpos($sUrl, "=")) !== false)
  {
    $sUrlShort = substr($sUrl,0,$iEq);
  }
  else
    $sUrlShort = $sUrl;

  foreach($aSecond as $iSIndex =>  $aUrl)
  {
    if(strpos($sUrlShort, $aUrl['s']) !== false)
    {
      $aMatch[] = array($sUrl, $aUrl['u']);
      unset($aSecond[$iSIndex]);
      break;
    }
  }
}

foreach($aMatch as $iIndex => $aRow)
{
  if($iIndex > 0) echo "\n";
  echo $aRow[0].','.$aRow[1];
}
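
Neither example above rewrites the second file's URL. The correction asked for in Part B (master URL up to the "=", followed by the ID from the second file) might be sketched as below. The function names urlHost() and correctUrl() are hypothetical, and the sketch assumes that "www.domain1.com" and "domain1.com" count as different domains:

```php
<?php
// Hypothetical helpers (names are assumptions, not from the thread).

// Host part of a URL, ignoring the scheme, the path and the "=ID" tail.
// Cutting at "=" first also handles the "http://domain5.com=234" form.
function urlHost($sUrl)
{
    $sUrl = trim($sUrl);
    $iEq = strpos($sUrl, '=');
    $sBeforeId = ($iEq === false) ? $sUrl : substr($sUrl, 0, $iEq);
    $sNoScheme = preg_replace('#^https?://#i', '', $sBeforeId);
    $aParts = explode('/', $sNoScheme, 2);
    return strtolower($aParts[0]);
}

// Corrected URL: master text up to "=", then "=ID" from the second file.
// Returns null when the hosts differ or either URL has no "=".
function correctUrl($sMaster, $sSecond)
{
    $sMaster = trim($sMaster);
    $sSecond = trim($sSecond);
    $iMasterEq = strpos($sMaster, '=');
    $iSecondEq = strpos($sSecond, '=');
    if ($iMasterEq === false || $iSecondEq === false) {
        return null;
    }
    if (urlHost($sMaster) !== urlHost($sSecond)) {
        return null;
    }
    return substr($sMaster, 0, $iMasterEq) . substr($sSecond, $iSecondEq);
}

echo correctUrl('http://www.domain1.com/master?=345', 'http://www.domain1.com/wrongtype?=234'), "\n";
// http://www.domain1.com/master?=234
echo correctUrl('http://domain4.com/?=rwe', 'http://domain4.com/oopps?=856'), "\n";
// http://domain4.com/?=856
```

The first call reproduces the domain1 row of the Column B example in the question.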


Regards

-r-
playstat (Author) Commented:
Hi there, this is the faster version.

Would it be possible to add yes or no to column A depending on match/no match, and then put the URL correction (after the =) in column E? If there is no correction, insert the master URL instead.

<?php

$master_file = "list.txt";
$match_file = "match.txt";
$output_file = "file.csv";

$master_array = preg_split('/[\\r\\n]+/', trim(file_get_contents($master_file)));
$match_array = preg_split('/[\\r\\n]+/', trim(file_get_contents($match_file)));
$second_array = array();
$non_match = array(); // initialise so implode() below still works when every line matches

// apply trim() to remove blank chars
foreach ($master_array as $key=>$master_item) $master_array[$key] = trim($master_item);
foreach ($match_array as $key=>$match_item) $match_array[$key] = trim($match_item);

// analyse
foreach($master_array as $key=>$master_item) {
 $check = false;
 $master_parse = explode('=', $master_item);
 unset($second_item);
 foreach ($match_array as $match_item) {
  $match_parse = explode('=', $match_item);
  if ($master_parse[0] == $match_parse[0]) {
   $second_item = $match_item;
   $check = true;
   break;
  }
 }
 $second_array[$key] = isset($second_item) ? $second_item : $master_item;
 if ($check == false)
      $non_match[$key] = $master_item;
}

$output = array();
foreach($master_array as $key=>$master_item) {
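 // Column C is empty when the master line was matched
 $non_match_item = isset($non_match[$key]) ? $non_match[$key] : '';
 $output[] = "{$master_item},{$second_array[$key]},{$non_match_item},// call it nonmatch choice or something";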
}

// Leave the non-match column as is, but add another column at the end holding yes or no: insert yes on a match, no on a non-match. Put it in column D.

$output = implode("\n",$output);

$handle = fopen($output_file,"wb") or die("Cannot open file");
fwrite($handle, $output) or die("Cannot write to file");
fclose($handle);

$non_match = implode("\n",$non_match);
$handle = fopen("nomatches.txt","w+") or die("Cannot open file");
fwrite($handle, $non_match) or die("Cannot write to file");
fclose($handle);

echo "Analysis Complete!";
?>
playstat (Author) Commented:
Correction: add the yes/no match choice to column D in the CSV file.

$output[] = "{$master_item},{$second_array[$key]},{$non_match[$key]},// call it nonmatch choice or something";
$output[] = "A,B,C,D,corrected url from master if match file has same domain but different page ID";
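
One way the five columns could be filled is sketched below. It is a sketch under assumptions, not a drop-in replacement: it matches on the host only (the script above compares everything before the "="), the helper name host_of() is hypothetical, and it uses in-memory sample lines from the question in place of list.txt and match.txt.

```php
<?php
// Sample lines from the question; the script above reads these from files.
$master_array = array(
    'http://www.domain1.com/master?=345',
    'http://domain2.com/master?=hdf',
    'http://domain5.com=234',
);
$match_array = array(
    'http://www.domain1.com/wrongtype?=234',
    'http://domain2.com/master?=465',
);

// Hypothetical helper: host of a URL, ignoring scheme, path and "=ID" tail.
function host_of($url)
{
    $before_id = explode('=', trim($url), 2);
    $no_scheme = preg_replace('#^https?://#i', '', $before_id[0]);
    $parts = explode('/', $no_scheme, 2);
    return strtolower($parts[0]);
}

$output = array();
foreach ($master_array as $master_item) {
    // Find the first match-file line on the same host, if any.
    $second_item = null;
    foreach ($match_array as $match_item) {
        if (host_of($master_item) === host_of($match_item)) {
            $second_item = $match_item;
            break;
        }
    }
    $matched = ($second_item !== null);

    // Column E: master URL up to "=", plus the ID from the match file;
    // falls back to the master URL when there is nothing to correct.
    $corrected = $master_item;
    if ($matched && strpos($master_item, '=') !== false && strpos($second_item, '=') !== false) {
        $master_parse = explode('=', $master_item, 2);
        $match_parse = explode('=', $second_item, 2);
        $corrected = $master_parse[0] . '=' . $match_parse[1];
    }

    $output[] = implode(',', array(
        $master_item,                           // A: master URL
        $matched ? $second_item : $master_item, // B: match-file URL, else master
        $matched ? '' : $master_item,           // C: non-match entry
        $matched ? 'yes' : 'no',                // D: match yes/no
        $corrected,                             // E: corrected URL
    ));
}

echo implode("\n", $output), "\n";
```

Because the URLs in the sample data contain no commas or quotes, a plain implode() is enough here; fputcsv() would be the safer choice for arbitrary input.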