Solved

Search and replace urls with email addresses

Posted on 2004-09-04
Last Modified: 2006-11-17
Hi there,

I have a list of URLs in one text file and some email addresses in another.

What I'm after is a way to match the domain name of each email address against the list of URLs in the text file.

For example:

URL list:

http://www.thisisadomain.com/?=23453
http://www.acooldomain.com/?=34234               
http://noemailmatches.com/?=23454         
hhp://ifonlyihadasportscar.com?=34234

Then the email list would have something like this, with addresses that match those domains:

admin@thisisadomain.com
support@acooldomain.com
contact@ifonlyihadasportscar.com


Now, if an email is missing, or no address can be found for a matched domain, just copy the original URL in its place.

So the final master file would be a combination of the emails matched to domains and the URLs that have no match:

admin@thisisadomain.com
support@acooldomain.com
http://noemailmatches.com/?=23454 <-------------------- no matches from email list
contact@ifonlyihadasportscar.com

Can you make sure the output is written one entry per line?

Best regards

Question by:playstat
LVL 25

Expert Comment

by:Marcus Bointon
ID: 11981393
$urls = file('urllist.txt');
$emails = file('emails.txt');
$output = fopen('out.txt', 'w');
foreach($urls as $url) {
      if (preg_match('/http[s]?:\/\/(www\.)?(.*)\//', $url, $matches)) {
            $domain = $matches[2];
            $out = '';
            foreach($emails as $email) {
                  if (strpos($domain, $email) !== false) {
                        $out = $email;
                        break; //stop looking
                  }
            }
            if ($out = '') {
                  fwrite($output, "$url\n");
            } else {
                  fwrite($output, "$out\n");
            }
      }
}
fclose($output);

Not tested, other than the regex, but that should be roughly what you need. Not the most efficient, but shouldn't be too slow.
 
LVL 25

Expert Comment

by:Marcus Bointon
ID: 11981399
oops:

               if (strpos($domain, $email) !== false) {

should be:

               if (strpos($email, $domain) !== false) {
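
A small sketch of why the argument order matters: strpos() takes the string to search in (the haystack) first and the string to look for (the needle) second. The sample values below are taken from the question; the variable names are only for illustration.

```php
<?php
$email  = 'admin@thisisadomain.com';
$domain = 'thisisadomain.com';

// Haystack first, needle second: the domain is found inside the email.
var_dump(strpos($email, $domain)); // int(6)

// Swapped: the email can never be found inside the shorter domain string.
var_dump(strpos($domain, $email)); // bool(false)
```

The comparison against false has to be strict (!== false), because a match at position 0 would otherwise be treated as "not found".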
 

Author Comment

by:playstat
ID: 11984297
It doesn't work.
 
LVL 25

Accepted Solution

by:
Marcus Bointon earned 2000 total points
ID: 11988357
This definitely works (original missed one = char):

<?php
$urls = file('urllist.txt');
$emails = file('emails.txt');
$output = fopen('out.txt', 'wb');
foreach($urls as $url) {
    if (preg_match('/http[s]?:\/\/(www\.)?(.*)\//', $url, $matches)) {
        $domain = $matches[2];
        $out = '';
        foreach($emails as $email) {
            if (strpos($email, $domain) !== false) {
                $out = $email;
                break; //stop looking
            }
        }
        if ($out == '') {
            fwrite($output, "$url");
        } else {
            fwrite($output, "$out");
        }
    }
}
fclose($output);
?>
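
For anyone adapting the script above, here is a stricter sketch of the same idea, not a replacement for the accepted code. It assumes PHP 7.1 or later, and the function names hostFromUrl and matchEmailsToUrls are made up for this example. It differs in three ways that seem to matter for the sample data in the question: a URL whose scheme is mistyped or that has no slash after the host (the "hhp://" line) is still processed rather than skipped, the email domain is compared exactly instead of with a substring search, and every entry is trimmed so trailing spaces do not end up in the output.

```php
<?php
// Hypothetical helper (not from the thread): extract the host from a URL,
// dropping a leading "www.". Any scheme is accepted, so a typo such as
// "hhp://" still yields a host.
function hostFromUrl(string $url): ?string
{
    $pattern = '~^[a-z][a-z0-9+.-]*://(?:www\.)?([^/?#:\s]+)~i';
    return preg_match($pattern, trim($url), $m) ? strtolower($m[1]) : null;
}

// Hypothetical helper: for each URL return the first email address on the
// same domain, or the URL itself when no address matches.
function matchEmailsToUrls(array $urls, array $emails): array
{
    // Index the emails by the exact domain after the last "@".
    $byDomain = [];
    foreach ($emails as $email) {
        $email = trim($email);
        $at = strrpos($email, '@');
        if ($at === false) {
            continue; // not an email address
        }
        $domain = strtolower(substr($email, $at + 1));
        if ($domain !== '' && !isset($byDomain[$domain])) {
            $byDomain[$domain] = $email;
        }
    }

    $result = [];
    foreach ($urls as $url) {
        $url = trim($url);
        if ($url === '') {
            continue; // skip blank lines
        }
        $host = hostFromUrl($url);
        $result[] = ($host !== null && isset($byDomain[$host])) ? $byDomain[$host] : $url;
    }
    return $result;
}

// Sample data from the question; real input would come from file().
$urls = [
    'http://www.thisisadomain.com/?=23453',
    'http://www.acooldomain.com/?=34234               ',
    'http://noemailmatches.com/?=23454         ',
    'hhp://ifonlyihadasportscar.com?=34234',
];
$emails = [
    'admin@thisisadomain.com',
    'support@acooldomain.com',
    'contact@ifonlyihadasportscar.com',
];

echo implode("\n", matchEmailsToUrls($urls, $emails)), "\n";
```

With the real files, the two arrays could be read with file('urllist.txt', FILE_IGNORE_NEW_LINES) and file('emails.txt', FILE_IGNORE_NEW_LINES), and the result written with file_put_contents('out.txt', implode("\n", $result) . "\n"). Building the lookup array once also avoids rescanning the whole email list for every URL.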
 
LVL 2

Expert Comment

by:Rajkumar_G
ID: 11994120

  $data = "http://www.thisisadomain.com/?=23453";

  // get host name from URL
  preg_match("/^(http:\/\/)?([^\/]+)/i", $data, $matches);
  $host = $matches[2];

  // get last two segments of host name
  preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
  $domain_url = $matches[0];
  echo "<br>Domain name from URL : " . $domain_url . "\n";

  // get the part of the email address after the @
  $email = "admin@thisisadomain.com";
  $str_pos = strpos($email, '@');
  $domain_email = substr($email, $str_pos + 1);
  echo "<br><br><br>Domain Email  : " . $domain_email;

  if (strcmp($domain_email, $domain_url) == 0)
   {
     echo "<br>Matched";
     // replace the URL with the matching email address
     $data = preg_replace('|\\b(http://[^\s)<]+)|', $email, $data);
     echo "<br><br><br>Result :" . $data;
   }
  else
   {
     echo "<br> Not Matched";
   }


I have done it with an example: not the full code, but the main part of it. I think you know how to read from and write to a text file. Customize this code according to your needs.
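
The same comparison can be wrapped in a reusable function. This is only a sketch, assuming PHP 7 or later; the name urlMatchesEmail is invented for the example. Like the snippet above, it compares the email's domain with the last two segments of the URL's host, so it would misjudge hosts under suffixes such as .co.uk.

```php
<?php
// Hypothetical helper: true when the email's domain equals the last two
// dot-separated segments of the URL's host.
function urlMatchesEmail(string $url, string $email): bool
{
    // Host: everything after an optional scheme, up to the first "/", "?", "#" or ":".
    if (!preg_match('~^(?:[a-z][a-z0-9+.-]*://)?([^/?#:\s]+)~i', trim($url), $m)) {
        return false;
    }
    $labels = explode('.', strtolower($m[1]));
    $urlDomain = implode('.', array_slice($labels, -2));

    $email = trim($email);
    $at = strrpos($email, '@');
    if ($at === false) {
        return false; // not an email address
    }
    return strtolower(substr($email, $at + 1)) === $urlDomain;
}

var_dump(urlMatchesEmail('http://www.thisisadomain.com/?=23453', 'admin@thisisadomain.com')); // bool(true)
var_dump(urlMatchesEmail('http://noemailmatches.com/?=23454', 'admin@thisisadomain.com'));    // bool(false)
```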