• Status: Solved

Is there a script I can run that will pull phone numbers and emails from a website?

I am creating a list of phone numbers and emails from this site: http://www.golfnationwide.com/US-Golf-Course-List-And-Directory.aspx. Is there a script I can run that would crawl this site and return all of the email addresses and phone numbers?
Asked by: tprofits
1 Solution
 
leakim971 (Pluritechnician) commented:
Not really...
If you have some money to spend, try one of the following sites:
https://www.odesk.com/
http://www.rent-acoder.com/
http://www.freelancer.com/
 
Big Monty (Senior Web Developer / CEO of ExchangeTree.org) commented:
If you're interested in hiring someone to do the work, I'd be willing to offer my services. I'll charge less as I'm trying to build up my portfolio.
 
tprofits (Author) commented:
The_Big_Daddy - Contact me at tjback@gmail.com.
 
Ray Paseur commented:
You will need a list of all the golf clubs.  They are listed by state.  So you will first want to scrape this page:
http://www.golfnationwide.com/Default.aspx

Then, for each state, you will find a list that looks like this. You will need to collect the URLs for each of the club links.
http://www.golfnationwide.com/Golf-Courses-By-State/Virginia-Golf-Courses__VA.aspx
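
The link-collection step could be sketched roughly like this. It is only an illustration, not tested against the live site: the function name collect_club_urls() is hypothetical, and the URL pattern (a state folder followed by a name ending in three underscores, digits, and .aspx) is an assumption inferred from the single sample club URL in this thread. The function takes the HTML of a state page as a string, so fetching the page is left to the caller.

```php
<?php
// SKETCH: COLLECT CLUB-PAGE URLS FROM THE HTML OF ONE STATE PAGE
// ASSUMPTION: CLUB LINKS LOOK LIKE Golf-Courses-By-State/<State>/<Name>___<digits>.aspx
function collect_club_urls(string $html, string $base = 'http://www.golfnationwide.com'): array
{
    if (trim($html) === '') return array();

    // REAL-WORLD MARKUP IS OFTEN INVALID, SO COLLECT PARSER WARNINGS QUIETLY
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $urls = array();
    foreach ($doc->getElementsByTagName('a') as $a)
    {
        $href = $a->getAttribute('href');
        if (!preg_match('#Golf-Courses-By-State/[^/]+/[^/]+___\d+\.aspx$#i', $href)) continue;

        // NAIVE HANDLING OF SITE-RELATIVE LINKS (DOES NOT RESOLVE "../")
        if (stripos($href, 'http') !== 0)
        {
            $href = rtrim($base, '/') . '/' . ltrim($href, '/');
        }

        // ARRAY KEYS REMOVE DUPLICATES
        $urls[$href] = true;
    }
    return array_keys($urls);
}
```

You would call it with something like collect_club_urls(file_get_contents($state_url)) and then feed each returned URL to the per-club script.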

Once you have those URLs, they will point to pages like this:
http://www.golfnationwide.com/Golf-Courses-By-State/Virginia/Belle-Haven-Country-Club___32739.aspx

The code snippet below shows how to isolate the information you want for each of the clubs.

Best of luck with your project, ~Ray
<?php // RAY_temp_tprofits.php
error_reporting(E_ALL);


// ONE OF THE PAGES TO SEARCH
$url = 'http://www.golfnationwide.com/Golf-Courses-By-State/Virginia/Belle-Haven-Country-Club___32739.aspx';
$htm = file_get_contents($url);
if ($htm === FALSE) die("UNABLE TO READ $url");

// THE THINGS WE DO NOT NEED
$junk = array
( '<span id="Block">'
, '<span id="CourseImage">'
)
;

// THE THINGS WE WANT TO FIND
$things = array
( 'ctl00_MainContentPlaceholder_CourseNameLabel' => 'NAME:  '
, 'ctl00_MainContentPlaceholder_EmailLabel'      => 'EMAIL: '
, 'ctl00_MainContentPlaceholder_PhoneLabel'      => 'PHONE: '
)
;

// AVOID SEARCHING THROUGH ALL THE NOISE (FALL BACK TO THE FULL PAGE IF A MARKER IS MISSING)
$arr = explode($junk[0], $htm);
$arr = explode($junk[1], isset($arr[1]) ? $arr[1] : $htm);
$htm = $arr[0];

// SEARCH THE STRINGS
$new = array();
foreach ($things as $target => $thing)
{
    // CONSTRUCT A REGULAR EXPRESSION
    $regex
    = '#'                       // REGEX DELIMITER
    . '<span'                   // OPEN-SPAN TAG
    . '[^>]*?'                  // ANY ATTRIBUTES INSIDE THE TAG, OR NOTHING
    . preg_quote($target, '#')  // THE SEARCH STRING, ESCAPED FOR REGEX USE
    . '[^>]*?'                  // ANY ATTRIBUTES INSIDE THE TAG, OR NOTHING
    . '>'                       // END OF THE OPEN-SPAN TAG
    . '(.*?)'                   // GROUP OF CHARACTERS
    . '</span'                  // CLOSE-SPAN TAG
    . '#'                       // REGEX DELIMITER
    . 'is'                      // CASE-INSENSITIVE, DOT MATCHES NEWLINES
    ;

    // SEARCH THE HTML FRAGMENT; STORE NULL WHEN NOTHING IS FOUND
    $new[$thing] = preg_match($regex, $htm, $mat) ? trim($mat[1]) : NULL;
}

// SHOW THE WORK PRODUCT
echo "<pre>";
print_r($new);
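
If a page does not use the span IDs targeted above, a more generic fallback might be to pattern-match the visible text for anything that looks like an email address or a phone number. This is a rough sketch under stated assumptions, not a definitive method: the function name extract_contacts() is hypothetical, the phone pattern assumes US-style 10-digit numbers, and the email pattern is deliberately simple rather than a full RFC-compliant check, so expect some false positives and misses.

```php
<?php
// SKETCH: PULL EMAIL-LIKE AND PHONE-LIKE STRINGS OUT OF ANY HTML
// ASSUMPTION: US-STYLE PHONE NUMBERS SUCH AS (703) 555-0100 OR 703-555-0100
function extract_contacts(string $html): array
{
    // REPLACE TAGS WITH SPACES SO ADJACENT ELEMENTS DO NOT RUN TOGETHER
    $text = html_entity_decode(preg_replace('/<[^>]+>/', ' ', $html));

    // SIMPLE EMAIL PATTERN (NOT A FULL RFC 5322 VALIDATOR)
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $text, $emails);

    // OPTIONAL PARENTHESES AROUND THE AREA CODE; SPACE, DOT OR DASH SEPARATORS
    preg_match_all('/\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b/', $text, $phones);

    return array
    ( 'emails' => array_values(array_unique($emails[0]))
    , 'phones' => array_values(array_unique($phones[0]))
    )
    ;
}
```

Calling extract_contacts($htm) on a fetched page returns an array with two lists, 'emails' and 'phones', each with duplicates removed.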
