  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1310

Parsing the HTTP REFERER variable to detect the type of referring site

Hello!

I collect the visits to my web site, storing all HTTP REFERER variables into a database.

Now, from this variable, I would like to detect whether the referrer is:

- a newsgroup

- Google

- a search engine other than Google

- a direct access (url typed in, or a browser bookmark)

- a web site other than my web site and other than a search engine

- my web site

Any clue?

Regards
Stephane
Asked:
stephaneeybert
1 Solution
 
COBOLdinosaur Commented:
I'm not sure what you are asking for.

The referrer is just a string containing the URL of the page where the link to your page was clicked. If the address was entered directly in the address bar or was clicked in the favourites, then the referrer is empty; otherwise it is the URL where the link was clicked. I don't believe there is any detail beyond that, so you would just have to parse the URL to get the site name.

Cd&
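
As a rough sketch of the parsing step described above, the site name can be pulled out of a referrer string with the standard URL class. The function name referrerHost is made up for illustration, and treating an unparseable value the same as an empty referrer is an assumption, not something stated in the thread.

```javascript
// Return the lowercase host name of a referrer URL, or '' when the
// referrer is empty or cannot be parsed (hypothetical helper name).
function referrerHost(referrer) {
    if (!referrer) {
        return ''; // typed-in URL or bookmark: no referrer at all
    }
    try {
        return new URL(referrer).hostname.toLowerCase();
    } catch (e) {
        return ''; // not an absolute URL
    }
}
```

The host name alone (for example www.google.com) is usually enough to decide which site a visit came from; the path and query string can be ignored for that purpose.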
 
stephaneeybert Author Commented:
Hello,

Thanks for the comment. In fact, I know that. My question is about how to parse it and what to look for to get the information I want...

Cheers
Stephane
 
stephaneeybert Author Commented:
What logic to put in the parsing to get the details I need...
 
COBOLdinosaur Commented:
There are so many possible variations on URLs that parsing out specifics will almost require a regular expression for just about each instance that you are looking for. What is it you are trying to parse out? Why not just use the whole URL?

Cd&
 
stephaneeybert Author Commented:
I'm building a web site and I made a page to show visitor and visit statistics.
The page shows which browsers are being used, which operating systems, how many visitors and visits per month... Now I would like to complete the work by adding to the page where the visitors came from when visiting the web site. I would like to display how many came from:
- a newsgroup

- Google

- a search engine other than Google

- a direct access (url typed in, or a browser bookmark)

- a web site other than my web site and other than a search engine

- my web site

And I'll display it with a graph (that part I know how to do).

The only part that is hard for me is how to parse the URLs with regular expressions to match them against the 6 options listed above.

Say, how to parse the url to check if it comes from a newsgroup, and if not, if it comes from Google...

Regards
Stephane
 
stephaneeybert Author Commented:
Doing a regular expression for each option that I am looking for is fine with me. Only, I'm no good with regular expressions...
 
COBOLdinosaur Commented:
You don't understand: there are hundreds of thousands of newsgroups, maybe over a million.
Nothing in the URL indicates that it is a newsgroup.

There are thousands of search engines, and there are search engines that search other search engines. Nothing in the URL indicates that it is a search engine.

You would need to have a database with the names of all the newsgroups and all the search engines and test against that. Even if you keep it to a short list, there is still a problem. Consider Yahoo. What does Yahoo.com tell you? They have a search engine. They also have newsgroups. They have email, which might contain links. They host virtual domains. Google is the same way: Google.com == search || == gmail || == newsgroup.
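
A short list of that kind could be kept as a plain lookup table keyed by host name. The sketch below is only illustrative: the hosts in the table (including groups.google.com standing in for newsgroups), the category labels, and the function name classifyReferrer are assumptions, and the real entries would have to come from the referrers actually collected. Anything not in the table falls through to "other site".

```javascript
// Hypothetical table of known hosts; extend it with the sites
// that actually show up in the stored referrers.
var HOST_CATEGORIES = {
    'groups.google.com': 'newsgroup',
    'google.com': 'google',
    'search.yahoo.com': 'other search engine'
};

// Classify a referrer string; ownHost is the site's own host name
// without a leading "www." (for example 'yoursite.com').
function classifyReferrer(referrer, ownHost) {
    if (!referrer) {
        return 'direct access'; // typed-in URL or bookmark
    }
    var host;
    try {
        // Drop a leading "www." so www.google.com and google.com match
        host = new URL(referrer).hostname.toLowerCase().replace(/^www\./, '');
    } catch (e) {
        return 'other site'; // unparseable referrer, treated as unknown
    }
    if (host === ownHost) {
        return 'own site';
    }
    if (Object.prototype.hasOwnProperty.call(HOST_CATEGORIES, host)) {
        return HOST_CATEGORIES[host];
    }
    return 'other site';
}
```

Because the lookup is an exact match on the host, a Google service hosted on another subdomain is reported as "other site" unless it gets its own row, which is one way to handle the Google.com ambiguity described above.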

What I would suggest is that you look at the urls of specific sites you want to track and then you just have to do a simple substring search for them, and you won't need regexp:

In JavaScript I would do something like this:

// document.referrer is empty when the URL was typed in or bookmarked
var referrer = document.referrer.toLowerCase();
if (referrer == '')
    alert('this is an unknown site');
else if (referrer.indexOf('google.com') != -1)
    alert('this is google');
else if (referrer.indexOf('yahoo.com') != -1)
    alert('this is yahoo');
else if (referrer.indexOf('yoursite.com') != -1)
    alert('this is yoursite');
else
    alert('this is another site');

Cd&
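
One caveat with a substring search over the whole URL: it also matches when the site name only appears in the path or query string of some other site. A possible refinement, sketched here with a hypothetical helper named isFromSite, is to compare against the host name only.

```javascript
// Hypothetical helper: true when the referrer's host is `domain`
// or a subdomain of it. Empty or unparseable referrers give false.
function isFromSite(referrer, domain) {
    try {
        var host = new URL(referrer).hostname.toLowerCase();
        return host === domain || host.endsWith('.' + domain);
    } catch (e) {
        return false;
    }
}

var tricky = 'http://example.org/redirect?to=google.com';
console.log(tricky.indexOf('google.com') != -1); // true: the substring search is fooled
console.log(isFromSite(tricky, 'google.com'));   // false: the host is example.org
console.log(isFromSite('http://www.google.com/search?q=x', 'google.com')); // true
```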
 
stephaneeybert Author Commented:
Yeah, I started doing a strstr() search for the Google case.

I'll do the same with the web site hostname for the internal hits.

Thanks anyway

Cheers

Steph
 
COBOLdinosaur Commented:
Glad I could help.  Thanks for the A. :^)

Cd&
Question has a verified solution.
