Solved

PHP: How can I detect if a user came from a search engine?

Posted on 2010-09-03
Last Modified: 2012-05-10
I have a site where I award users Viz (the site currency) for sharing their referral URL: when a referred person clicks it, the person who referred them gets the Viz.  However, Google has indexed some of those URLs with the referral code in them, so I want to detect whether the referred user is coming from a search engine and, if so, not award the Viz.
Question by:davideo7
4 Comments
 
LVL 2

Assisted Solution

by:Gatherer_Hade
Gatherer_Hade earned 100 total points
ID: 33598752
Check the contents of the variable
$_SERVER['HTTP_REFERER']

You can use strpos() to look for google, yahoo, bing, etc within this variable.
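
A minimal sketch of that check, assuming a hypothetical helper name (came_from_search_engine) and an illustrative, incomplete list of host fragments. It compares only the host part of the referrer, so a search engine name that merely appears in a query string is not counted. A plain substring match can still give false positives on look-alike hosts, and the referrer header may be missing or forged, so treat the result as a hint.

```php
<?php
// Hypothetical helper: returns true when the referrer's host looks like a
// well-known search engine. The host fragments below are assumptions.
function came_from_search_engine($referer)
{
    if (!is_string($referer) || $referer === '') {
        return false; // no referrer was sent
    }

    // parse_url() returns null or false when there is no usable host.
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return false;
    }

    $engines = array('google.', 'bing.', 'yahoo.', 'baidu.');
    foreach ($engines as $engine) {
        // stripos() is case-insensitive; compare against false explicitly
        // because a match at position 0 is also "falsy".
        if (stripos($host, $engine) !== false) {
            return true;
        }
    }
    return false;
}

$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$from_search = came_from_search_engine($referer);
```

If $from_search is true, the page can still be shown normally while the Viz award is skipped.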
 
LVL 83

Assisted Solution

by:Dave Baldwin
Dave Baldwin earned 100 total points
ID: 33598787
Look at the HTTP_REFERER string, $_SERVER['HTTP_REFERER'], and the User Agent string, $_SERVER['HTTP_USER_AGENT'].  There are a lot of search engines out there.  You probably want to focus on the 'big' ones like Google, Bing, Yahoo, and Baidu.
 
LVL 12

Assisted Solution

by:GMGenius
GMGenius earned 100 total points
ID: 33598799
The problem with relying on HTTP_REFERER is that it is supplied by the client, and some antivirus software will block the browser from sending the value.
Your best option is to pass a variable such as ?ref=12345 in the URL and read it in your PHP code,
e.g.
$refer = @$_GET["ref"];
but this requires the URL to contain the correct referral code.
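
A short sketch of reading that parameter without the @ suppression operator. It assumes referral codes are digits only, as in the ?ref=12345 example; the helper name get_referral_code is hypothetical.

```php
<?php
// Hypothetical helper: extracts a numeric referral code from a query array,
// or returns null if the code is missing or not made up of digits only.
function get_referral_code(array $query)
{
    if (!isset($query['ref']) || !is_string($query['ref'])) {
        return null;
    }
    // ctype_digit() is false for an empty string and for any non-digit.
    return ctype_digit($query['ref']) ? $query['ref'] : null;
}

$refer = get_referral_code($_GET);
if ($refer !== null) {
    // Look up the referring user by $refer and decide whether to award the Viz.
}
```

Rejecting anything that is not a well-formed code early keeps malformed values out of the later database lookup.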
 
LVL 110

Accepted Solution

by:Ray Paseur
Ray Paseur earned 200 total points
ID: 33602873
This is an old piece of code (I would not use EREG today) but it demonstrates how I tested the user agent.  I used this to kill script outputs when I did not want the pages indexed by some of the 'bots.  Maybe a similar adaptation will work for you.

You might want to read about this:
http://www.robotstxt.org/

And this...
http://en.wikipedia.org/wiki/List_of_search_engines#Based_on

An interesting exercise might be to capture the REQUEST_URI, HTTP_REFERER and HTTP_USER_AGENT fields and write these to a database (or mail them to yourself).  You will see the patterns developing very quickly, I'm sure.
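
One way to sketch that capture is shown here. To stay self-contained it appends to a plain log file instead of a database or mail; the helper name build_request_log_line and the log path are assumptions, not part of the original suggestion.

```php
<?php
// Hypothetical helper: builds one tab-separated log line from the three
// request fields. Missing fields are recorded as "-".
function build_request_log_line(array $server)
{
    $fields = array('REQUEST_URI', 'HTTP_REFERER', 'HTTP_USER_AGENT');
    $parts = array(date('c')); // ISO 8601 timestamp
    foreach ($fields as $field) {
        $value = isset($server[$field]) ? (string) $server[$field] : '-';
        // Replace control characters so each request stays on one line
        // and embedded tabs cannot shift the columns.
        $parts[] = preg_replace('/[\x00-\x1F\x7F]/', ' ', $value);
    }
    return implode("\t", $parts) . "\n";
}

// The log path is an assumption; choose a location outside the web root.
$log_file = sys_get_temp_dir() . '/referral_requests.log';
file_put_contents($log_file, build_request_log_line($_SERVER), FILE_APPEND | LOCK_EX);
```

After a few days of traffic, sorting the log by the user agent column should make the crawler entries easy to spot.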

Regards, ~Ray

<?php
function bad_robots() 
{
    //
    // DENY SOME SCRIPTS TO ROBOTS
    //
    $bad_robots[]='crawler';
    $bad_robots[]='spider';
    $bad_robots[]='robot';
    $bad_robots[]='slurp';
    $bad_robots[]='Atomz';
    $bad_robots[]='googlebot';
    $bad_robots[]='VoilaBot';
    $bad_robots[]='msnbot';
    $bad_robots[]='Gaisbot';
    $bad_robots[]='Gigabot';
    $bad_robots[]='SBIder';
    $bad_robots[]='Zyborg';
    $bad_robots[]='FunWebProducts';
    $bad_robots[]='findlinks';
    $bad_robots[]='ia_archiver';
    $bad_robots[]='MJ12bot';
    $bad_robots[]='Ask Jeeves';
    $bad_robots[]='NG/2.0';
    $bad_robots[]='voyager';
    $bad_robots[]='Exabot';
    $bad_robots[]='Nutch';
    $bad_robots[]='Hercules';
    $bad_robots[]='psbot';
    $bad_robots[]='LocalcomBot';

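    // GET THE AGENT (MAY BE MISSING ON SOME REQUESTS)
    $agt = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : '';

    // TEST IF AGENT IN LIST
    // CASE-INSENSITIVE SUBSTRING MATCH (EREGI WAS REMOVED IN PHP 7)
    foreach ($bad_robots as $spider)
    {
        // RETURN ON THE FIRST MATCH; INCREMENTING A BOOLEAN HAS NO EFFECT IN PHP
        if (stripos($agt, $spider) !== FALSE) return true;
    }

    // AGENT NOT FOUND
    return false;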
}
