• Status: Solved

Images Indexed In Search Engines

Hello all!

Is it possible to create a script that checks and displays how many images are indexed for a list of URLs (domains)?

So if I had a list like:

http://url1
http://url2
http://url3

The script would check both "Yahoo image search" and "Google image search" and return how many images each domain has indexed - like:

url1= 12 Yahoo 10 Google
url2= 2 Yahoo 5 Google
url3= 15 Yahoo 12 Google


Is this at all possible??

Thank you~!
Asked by: laura_sky
 
KennyTM Commented:
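// Returns the number of Google image results for query $q, or 0 if no count is found.
function get_google_search_count ($q) {
  $s = file_get_contents('http://images.google.com/images?q=' . urlencode($q));
  // The total is scraped from the "Results 1 - N of about X" line of the results page.
  $b = preg_match ('|Results <b>1</b> - <b>\d+</b> of(?: about)? <b>([\d,]+)</b>|', $s, $a);
  // Strip the thousands separators before converting to an integer.
  return $b ? intval(str_replace(',', '', $a[1])) : 0;
}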

echo get_google_search_count('Radiation'); // outputs 266000
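
One limitation of the function above is that it returns 0 both when a site has no indexed images and when the page could not be fetched or its markup no longer matches the pattern. The following is a minimal sketch of one way to tell those cases apart, by splitting parsing from fetching. The names parse_result_count and fetch_result_count are hypothetical and are not part of the original answer.

```php
<?php
// Hypothetical helper: extract the total from an already-downloaded results page.
// Returns null (not 0) when the pattern is missing, so a failed download or a
// changed page layout is not mistaken for "no images indexed".
function parse_result_count($html, $regex) {
    if (!is_string($html) || !preg_match($regex, $html, $m)) {
        return null;
    }
    // Remove thousands separators such as "266,000" before converting.
    return intval(str_replace(',', '', $m[1]));
}

// Hypothetical wrapper: download the page, then parse it.
function fetch_result_count($url, $regex) {
    $html = @file_get_contents($url); // false on network or HTTP failure
    return parse_result_count($html, $regex);
}
```

Because parse_result_count works on a plain string, the pattern can be checked against a saved copy of a results page without making a request.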
 
KennyTM Commented:
To get the count for a specific site, use

echo get_google_search_count('site:www.experts-exchange.com'); // outputs 103
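
Note that the site: operator takes a host name, while the list in the question contains full URLs such as http://url1. A small sketch of how a URL could be turned into a site: query follows. The helper name site_query is an assumption for illustration, and it relies only on PHP's built-in parse_url().

```php
<?php
// Hypothetical helper: build a "site:" query from a URL or bare host name.
// Returns null when no host can be extracted.
function site_query($url) {
    $url = trim($url);
    if ($url === '') {
        return null;
    }
    // parse_url() only reports a host when a scheme is present, so add one if missing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    return $host ? 'site:' . strtolower($host) : null;
}
```

The result can then be passed as the query, for example get_google_search_count(site_query('http://www.experts-exchange.com/')).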
 
KennyTM Commented:
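// Returns the image result count for query $q on 'google', 'yahoo' or 'msn' (case-insensitive).
// Returns 0 if the engine is unknown or no count is found in the page.
function get_image_search_count ($q, $engine) {
  // Search URL prefix for each engine; the URL-encoded query is appended to it.
  $engine_url = Array(
    'google' => 'http://images.google.com/images?q=',
    'yahoo' => 'http://images.search.yahoo.com/search/images?p=',
    'msn' => 'http://search.msn.com/images/results.aspx?q='
  );
  // Pattern that captures the total from each engine's results page.
  $engine_regex = Array (
    'google' => '#Results <b>1</b> - <b>\d+</b> of(?: about)? <b>([\d,]+)</b>#',
    'yahoo' => '#Results <strong>1 - \d+</strong> of about <strong>([\d,]+)</strong>#',
    'msn' => '#<h5>Page 1 of ([\d,]+) results containing#'
  );
  $e = strtolower($engine);
  if (!isset($engine_url[$e])) {
    return 0; // unknown engine name
  }
  $s = file_get_contents( $engine_url[$e] . urlencode($q) );
  $b = preg_match ($engine_regex[$e], $s, $a);
  return $b ? intval(str_replace(',', '', $a[1])) : 0;
}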

echo get_image_search_count('Hello world!', 'Yahoo') . '<br />';  // 73331
echo get_image_search_count('Hello world!', 'Google') . '<br />'; // 73800
echo get_image_search_count('Hello world!', 'MSN') . '<br />';    // 2176
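
To print the per-engine counts in the one-line format the question asks for (for example "url1= 12 Yahoo 10 Google"), something like the sketch below could be used. The function name format_counts_line and the shape of its $counts argument (engine name mapped to count) are assumptions made for this example.

```php
<?php
// Hypothetical helper: format one domain's counts as "domain= N Engine N Engine".
// $counts maps an engine label to the count returned for that engine.
function format_counts_line($domain, array $counts) {
    $parts = array();
    foreach ($counts as $engine => $count) {
        $parts[] = $count . ' ' . $engine;
    }
    return $domain . '= ' . implode(' ', $parts);
}
```

The $counts array could be filled by calling get_image_search_count() once per engine for the same query.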
 
laura_sky (Author) Commented:
Thanks KennyTM . . .

If I wanted to use a textarea box to enter a list of URLs, how would I do this?

Like:

<form action="index.php">
<textarea>
http://url1
http://url2
http://url3
</textarea>
<input type="submit" value="Check">
</form>

Is there a way to do this, so that it checks each URL from the list?

Thanks~!
 
KennyTM Commented:
The problem is that Yahoo! will not run a search if you enter only the domain. (For example, at http://images.search.yahoo.com/search/images/advanced?ei=UTF-8 , enter a URL in "only search in this domain/site:", leave everything else blank, and press "Yahoo! Search": you end up back on the image search homepage.)
 
laura_sky (Author) Commented:
I see what you're saying...

In that case, how about using GET, and appending the domain to the end of:

http://images.search.yahoo.com/search/images?ei=UTF-8&x=wrt&p=experts-exchange.com

Which returns 70 results...

If this still won't work, is there a way to just check Google image search?

Also, is it possible to check a list of URLs entered into a textarea field?

Thanks~!
 
KennyTM Commented:
If you just want to check the Google image search, just write

get_image_search_count('site:www.experts-exchange.com', 'Google');

----

For your suggestion, it can actually be implemented with

get_image_search_count('experts-exchange.com', 'Yahoo');

However, as you can see from the results, not every image is actually related to experts-exchange.com. For example, the first image, "headleft.jpg", has absolutely nothing to do with experts-exchange.com (although the URL www.force137.com/forum/showthread.php?t=7593 has the string "experts-exchange.com"). If this is what you want, just use the code above. Otherwise, I can't help with Yahoo!.
 
laura_sky (Author) Commented:
Ok, I see...

But how would I check a list of URLs at the same time without entering them each individually?

With:

get_image_search_count('site:www.experts-exchange.com', 'Google');

I would have to type in the URL for each site, which would be very time consuming, and would defeat the purpose of creating a script in the first place. :-)

So can a textarea field be used?

Thanks~!
 
KennyTM Commented:
Sure. Suppose $a holds the content of the textarea. Then you split the string into lines with explode():

$urls = explode("\r\n", $a);

Then call get_image_search_count() for every URL:

$counts = Array();
foreach ($urls as $k => $url) {
  $counts[$k] = get_image_search_count("site:$url", 'Google');
}
print_r ($counts);
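
Splitting with explode("\r\n", ...) keeps blank lines and surrounding whitespace as entries, and it only recognizes one line-ending style. A slightly more forgiving sketch is shown below. The helper name split_url_list and the form field name urls are assumptions; the form in the question does not give its textarea a name or a method, so this presumes <textarea name="urls"> inside a form with method="post".

```php
<?php
// Hypothetical helper: turn raw textarea content into a clean list of entries.
function split_url_list($text) {
    // \R matches \r\n, \n and \r, so the line-ending style does not matter.
    $lines = preg_split('/\R/', (string) $text);
    $urls = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {          // skip blank lines
            $urls[] = $line;
        }
    }
    // Drop repeated entries and reindex from 0.
    return array_values(array_unique($urls));
}

// Assumption: the textarea was submitted as the POST field "urls".
$urls = split_url_list(isset($_POST['urls']) ? $_POST['urls'] : '');
```

The resulting $urls array can be fed to the foreach loop above in place of the explode() result.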
 
laura_sky (Author) Commented:
I'm sorry, I've been so busy recently that I totally forgot about this question!

I have not abandoned this question. In fact, I still require help with this question, that is if KennyTM still wants to assist me?

If not, I will gladly close this question and award KennyTM 400 points.

Please let me know!

Thanks again!