Solved

Search and replace URLs with email addresses

Posted on 2004-09-04
Last Modified: 2006-11-17
Hi there,

I have a list of URLs in one text file and some email addresses in another.

What I'm after is a way to match the domain name of each email address against the list of URLs.

For example:

URL list:

http://www.thisisadomain.com/?=23453
http://www.acooldomain.com/?=34234
http://noemailmatches.com/?=23454
http://ifonlyihadasportscar.com?=34234

The email list would then contain addresses that match those domains:

admin@thisisadomain.com
support@acooldomain.com
contact@ifonlyihadasportscar.com


If an email is missing, or no address can be found for a domain, just copy the original URL in its place.

So the final master file would be a combination of the emails matched to domains and the URLs that have no match:

admin@thisisadomain.com
support@acooldomain.com
http://noemailmatches.com/?=23454 <-------------------- no matches from email list
contact@ifonlyihadasportscar.com

Can you make sure the output is line by line?

Best regards

Question by:playstat

Expert Comment

by:Marcus Bointon
ID: 11981393
$urls = file('urllist.txt');
$emails = file('emails.txt');
$output = fopen('out.txt', 'w');
foreach($urls as $url) {
      if (preg_match('/http[s]?:\/\/(www\.)?(.*)\//', $url, $matches)) {
            $domain = $matches[2];
            $out = '';
            foreach($emails as $email) {
                  if (strpos($domain, $email) !== false) {
                        $out = $email;
                        break; //stop looking
                  }
            }
            if ($out = '') {
                  fwrite($output, "$url\n");
            } else {
                  fwrite($output, "$out\n");
            }
      }
}
fclose($output);

Not tested, other than the regex, but that should be roughly what you need. Not the most efficient, but shouldn't be too slow.
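If the lists get long, one way around the nested loop is to index the emails by domain once and then look each URL's domain up directly. This is only a rough sketch, assuming PHP 7 or later; index_emails_by_domain is a made-up helper name, and it compares the exact text after the "@" rather than doing a substring search.

```php
<?php
// Hypothetical helper: build a lookup table of domain => first email seen for it.
function index_emails_by_domain(array $emails): array
{
    $index = [];
    foreach ($emails as $email) {
        $email = trim($email);              // drop the newline left by file()
        $at = strrpos($email, '@');
        if ($at === false) {
            continue;                       // no "@", so not an email address
        }
        $domain = strtolower(substr($email, $at + 1));
        if (!isset($index[$domain])) {
            $index[$domain] = $email;       // keep the first address per domain
        }
    }
    return $index;
}

$index = index_emails_by_domain([
    "admin@thisisadomain.com\n",
    "support@acooldomain.com\n",
]);

// Each lookup is now a single array access instead of a scan of the email list.
echo $index['thisisadomain.com'] ?? 'no match', "\n";
```

With the table built, the inner foreach over $emails can be swapped for one isset() check per URL.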

Expert Comment

by:Marcus Bointon
ID: 11981399
oops:

               if (strpos($domain, $email) !== false) {

should be:

               if (strpos($email, $domain) !== false) {

Author Comment

by:playstat
ID: 11984297
It doesn't work.

Accepted Solution

by:
Marcus Bointon earned 500 total points
ID: 11988357
This definitely works (original missed one = char):

<?php
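// file() keeps the trailing newline on each line, so lines can be written back as-is
$urls = file('urllist.txt');
$emails = file('emails.txt');
$output = fopen('out.txt', 'wb');
foreach($urls as $url) {
    // capture the host name (without any leading "www.") into $matches[2];
    // the URL must contain a "/" after the host for this pattern to match
    if (preg_match('/http[s]?:\/\/(www\.)?(.*)\//', $url, $matches)) {
        $domain = $matches[2];
        $out = '';
        foreach($emails as $email) {
            // an email matches if it contains the domain
            if (strpos($email, $domain) !== false) {
                $out = $email;
                break; //stop looking
            }
        }
        if ($out == '') {
            fwrite($output, $url); // no email found: keep the original URL
        } else {
            fwrite($output, $out); // write the matching email instead
        }
    }
}
fclose($output);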
?>
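
A possible variation, in case some URLs have no "/" after the host (like the last one in the question) or an input file ends without a newline. Treat it as a sketch under those assumptions, not a drop-in replacement: url_domain and match_urls are hypothetical names, it needs PHP 7.1 or later, and it compares the part after the "@" exactly instead of using strpos.

```php
<?php
// Hypothetical helper: host name of a URL without a leading "www.", or null if none found.
function url_domain(string $url): ?string
{
    // the host ends at the first "/", "?", "#" or whitespace character
    if (preg_match('~^https?://(?:www\.)?([^/?#\s]+)~i', trim($url), $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Hypothetical helper: returns one output line per non-empty URL line.
function match_urls(array $urls, array $emails): array
{
    $result = [];
    foreach ($urls as $url) {
        $url = trim($url);
        if ($url === '') {
            continue;                       // skip blank lines
        }
        $line = $url;                       // default: keep the original URL
        $domain = url_domain($url);
        if ($domain !== null) {
            foreach ($emails as $email) {
                $email = trim($email);
                $at = strrpos($email, '@');
                if ($at !== false && strtolower(substr($email, $at + 1)) === $domain) {
                    $line = $email;         // first matching email wins
                    break;
                }
            }
        }
        $result[] = $line;
    }
    return $result;
}

$lines = match_urls(
    [
        "http://www.thisisadomain.com/?=23453\n",
        "http://www.acooldomain.com/?=34234   \n",
        "http://noemailmatches.com/?=23454\n",
        "http://ifonlyihadasportscar.com?=34234",
    ],
    [
        "admin@thisisadomain.com\n",
        "support@acooldomain.com\n",
        "contact@ifonlyihadasportscar.com\n",
    ]
);
echo implode("\n", $lines), "\n";
```

To work on the real files, pass file('urllist.txt') and file('emails.txt') as the two arguments and write the imploded result to out.txt with file_put_contents().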

Expert Comment

by:Rajkumar_G
ID: 11994120

  $data = "http://www.thisisadomain.com/?=23453";

  // get host name from URL
  preg_match("/^(http:\/\/)?([^\/]+)/i", $data, $matches);
  $host = $matches[2];

  // get last two segments of host name
  preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
  $domain_url = $matches[0];
  echo "<br>domain name from URL : ".$domain_url."\n";

  // get the part of the email address after the @
  $email = "admin@thisisadomain.com";
  $str_pos = strpos($email, '@');
  $domain_email = substr($email, $str_pos + 1);
  echo "<br><br><br>Domain Email  : ".$domain_email;

  if (strcmp($domain_email, $domain_url) == 0)
   {
     echo "<br>Matched";
     // replace the URL with the matching email address
     $data = preg_replace('|\\b(http://[^\s)<]+)|', $email, $data);
     echo "<br><br><br>Result :".$data;
   }
  else
   {
     echo "<br> Not Matched";
   }


I have done it with an example: not the full code, but the main part of it. I assume you know how to read from and write to a text file. Customize this code according to your needs.
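
For reuse inside a loop over the two files, the same comparison could be wrapped in functions. A minimal sketch, assuming PHP 7 or later; last_two_labels and same_domain are made-up names. Note that keeping only the last two segments of the host treats every address under a suffix like co.uk as the same domain, so it only suits simple .com-style names such as the ones in the question.

```php
<?php
// Hypothetical helper: last two dot-separated segments of a host name.
function last_two_labels(string $host): string
{
    $labels = explode('.', strtolower(trim($host)));
    return implode('.', array_slice($labels, -2));
}

// Hypothetical helper: true when the URL's host and the email's domain
// share the same last two segments.
function same_domain(string $url, string $email): bool
{
    // host = everything after an optional http(s):// up to "/", "?", "#" or whitespace
    if (!preg_match('~^(?:https?://)?([^/?#\s]+)~i', trim($url), $m)) {
        return false;
    }
    $at = strrpos($email, '@');
    if ($at === false) {
        return false;                       // not an email address
    }
    return last_two_labels($m[1]) === last_two_labels(substr($email, $at + 1));
}

var_dump(same_domain('http://www.thisisadomain.com/?=23453', 'admin@thisisadomain.com'));
var_dump(same_domain('http://noemailmatches.com/?=23454', 'admin@thisisadomain.com'));
```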