Solved

Parsing the HTTP REFERER variable to detect the type of referring site

Posted on 2004-07-31
Last Modified: 2008-03-06
Hello!

I log the visits to my web site, storing every HTTP REFERER value in a database.

Now, from this variable, I would like to detect whether the referer is:

- a newsgroup

- Google

- a search engine other than Google

- a direct access (URL typed in, or a browser bookmark)

- a web site other than my own and other than a search engine

- my own web site

Any clue?

Regards
Stephane
Question by:stephaneeybert
9 Comments
 
LVL 53

Expert Comment

by:COBOLdinosaur
ID: 11685161
I'm not sure what you are asking for.

The referrer is just a string containing the URL of the page where the link to your page was clicked. If the address was entered directly in the address bar or was clicked in the favourites, then the referrer is empty; otherwise it is the URL where the link was clicked. I don't believe there is any detail beyond that, so you would just have to parse the URL to get the site name.

Cd&
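For readers following along, the "parse the URL to get the site name" step could look roughly like the sketch below. It assumes a JavaScript runtime that provides the standard URL class (current browsers, Node.js), which postdates this thread, and the function name refererHost is hypothetical, not something used by the posters.

```javascript
// Hypothetical helper: returns the host name of a referrer string,
// or '' when the referrer is empty or is not a parseable absolute URL.
function refererHost(referrer) {
  if (!referrer) return '';          // direct access: nothing to parse
  try {
    return new URL(referrer).hostname.toLowerCase();
  } catch (e) {
    return '';                       // malformed value stored in the log
  }
}

console.log(refererHost('http://www.google.com/search?q=test')); // www.google.com
```

Comparing host names rather than whole URLs keeps a site name that merely appears in a path or query string from being mistaken for the referring site.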
 

Author Comment

by:stephaneeybert
ID: 11685879
Hello,

Thanks for the comment. In fact, I know that. My question is about how to parse it, and what to look for, to get the information I want...

Cheers
Stephane
 

Author Comment

by:stephaneeybert
ID: 11685928
What logic to put in the parsing to get the details I need...
 
LVL 53

Expert Comment

by:COBOLdinosaur
ID: 11685959
There are so many possible variations on URLs that parsing out specifics will require a regular expression for just about every case that you are looking for. What is it you are trying to parse out? Why not just use the whole URL?

Cd&
 

Author Comment

by:stephaneeybert
ID: 11687003
I'm building a web site and I made a page to show the visitor and visit statistics.
The page shows which browsers and operating systems are being used, and how many visitors and visits there are per month. Now I would like to complete the work by also showing where the visitors came from when they reached the web site. I would like to display how many came from:
- a newsgroup

- Google

- a search engine other than Google

- a direct access (URL typed in, or a browser bookmark)

- a web site other than my web site and other than a search engine

- my web site

And I'll display it with a graph (that part I know how to do).

The only part that is hard for me is how to parse the URLs with regular expressions so that each one matches one of the six options listed above.

Say, how to parse the URL to check whether it comes from a newsgroup and, if not, whether it comes from Google...

Regards
Stephane
 

Author Comment

by:stephaneeybert
ID: 11687006
Doing a regular expression for each option that I am looking for is fine with me. Only, I'm no good with regular expressions...
 
LVL 53

Accepted Solution

by:
COBOLdinosaur earned 125 total points
ID: 11688322
You don't understand: there are hundreds of thousands of newsgroups, maybe over a million.
There is nothing in the URL that indicates they are a newsgroup.

There are thousands of search engines, and there are search engines that search other search engines. There is nothing in the URL that indicates they are a search engine.

You would need to have a database with the names of all the newsgroups and all the search engines and test against that. Even if you keep it to a short list there is still a problem. Consider Yahoo. What does yahoo.com tell you? They have a search engine. They also have newsgroups. They have email, which might contain links. They host virtual domains. Google is the same way: google.com might be search, Gmail, or a newsgroup.

What I would suggest is that you look at the URLs of the specific sites you want to track; then you just have to do a simple substring search for them, and you won't need a regexp:

In JavaScript I would do something like this:

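// document.referrer is empty when the page was opened directly
var referrer = document.referrer.toLowerCase();

// one if/else chain, so an empty referrer does not also fall through to the last branch
if (referrer == '')
  alert('this is a direct access or an unknown site');
else if (referrer.indexOf('google.com') != -1)
  alert('this is google');
else if (referrer.indexOf('yahoo.com') != -1)
  alert('this is yahoo');
else if (referrer.indexOf('yoursite.com') != -1)
  alert('this is yoursite');
else
  alert('this is another site');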

Cd&
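As a possible extension of the short-list idea above, the sketch below maps a referrer to the six categories from the question by comparing host names against small lists. Everything in it is an assumption for illustration: the function names, the category labels, and the list contents (yoursite.com is the placeholder from the answer, groups.google.com stands in for a web newsgroup front end). It also assumes a runtime with the standard URL class, and it does not solve the ambiguity described above, since a host such as yahoo.com still covers search, mail and groups.

```javascript
// Illustrative lists only; fill them with the hosts that actually appear in your logs.
const OWN_HOSTS = ['yoursite.com'];             // placeholder from the thread
const NEWSGROUP_HOSTS = ['groups.google.com'];  // assumption: one newsgroup web front end
const GOOGLE_HOSTS = ['google.com'];
const OTHER_SEARCH_HOSTS = ['yahoo.com'];

// True when host equals domain or is a subdomain of it.
function hostMatches(host, domain) {
  return host === domain || host.endsWith('.' + domain);
}

// Hypothetical classifier returning one of:
// 'direct', 'internal', 'newsgroup', 'google', 'search', 'other'.
function classifyReferrer(referrer) {
  if (!referrer) return 'direct';               // typed URL or bookmark
  let host;
  try {
    host = new URL(referrer).hostname;
  } catch (e) {
    return 'other';                             // unparseable value
  }
  const inList = (list) => list.some((domain) => hostMatches(host, domain));
  // Order matters: groups.google.com must be tested before google.com.
  if (inList(OWN_HOSTS)) return 'internal';
  if (inList(NEWSGROUP_HOSTS)) return 'newsgroup';
  if (inList(GOOGLE_HOSTS)) return 'google';
  if (inList(OTHER_SEARCH_HOSTS)) return 'search';
  return 'other';
}

console.log(classifyReferrer('http://www.google.com/search?q=test')); // google
```

Matching on the host name instead of the whole string avoids counting a URL such as http://example.org/?u=google.com as a Google hit, which a plain substring search would do.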
 

Author Comment

by:stephaneeybert
ID: 11688355
Yeah, I started doing a strstr() search for the Google case.

I'll do the same with the web site hostname for the internal hits.

Thanks anyway

Cheers

Steph
 
LVL 53

Expert Comment

by:COBOLdinosaur
ID: 11688560
Glad I could help.  Thanks for the A. :^)

Cd&