Solved

How does EE let Google index its solutions, but not allow users to see them without logging in?

Posted on 2006-05-13
Medium Priority
175 Views
Last Modified: 2010-04-27
Hi,

I am in the process of building a website that I want Google to index, but I want the user to be asked to log in if they click through on the link.

How does Experts Exchange do this? The only way I can think of so far is that they are looking at the user agent, and if it is Googlebot, they are then looking at the IP range to ensure that it really is Google. If these two conditions are met, they let the visitor view the full page; otherwise the user is asked to log in.

The problem with this method is that I will need to know all of the IP ranges Google comes from. So I guess my questions are:

Is this the method EE uses?

If not, what is a better method for doing this?

If it is, does anyone know what IP ranges Google's indexers use?

Thanks
Daniel
Question by:danielparkerNZ
11 Comments
 
LVL 10

Expert Comment

by:gangwisch
ID: 16676255
They most likely use the HTTP referer ($_SERVER['HTTP_REFERER'] in PHP), meaning that if the referer is google.com, then they display this HTML.
 
LVL 21

Assisted Solution

by:Julian Matz
Julian Matz earned 1500 total points
ID: 16676353
You do not need to know the IP range... You can do a hostname check. Convert the IP to a hostname with a reverse DNS lookup (gethostbyaddr() in PHP, for example) and then check it against

*.googlebot.com
 
LVL 21

Assisted Solution

by:Julian Matz
Julian Matz earned 1500 total points
ID: 16676361
This range belongs to Google:
66.249.64.0 - 66.249.95.255

I don't know how many other ranges, if any, they have... But your safest bet is to use the hostname. It should always be *.googlebot.com....
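
A minimal sketch of the hostname check described above, assuming the *.googlebot.com suffix from this thread is the only one you want to accept. The function names is_googlebot_hostname() and is_verified_googlebot() are made up for illustration. Because whoever controls reverse DNS for an IP can make it return any name they like, the sketch also resolves the hostname forward again and checks that it maps back to the same IP. The two lookup functions are passed in as parameters so the logic can be tried without network access.

```php
<?php
// Hypothetical helper: true if $hostname ends in ".googlebot.com".
function is_googlebot_hostname($hostname) {
    return is_string($hostname)
        && preg_match('/\.googlebot\.com$/i', $hostname) === 1;
}

// Hypothetical helper: reverse lookup, suffix check, then forward confirmation.
// $reverse maps an IP to a hostname, $forward maps a hostname to a list of IPs.
function is_verified_googlebot($ip, $reverse = 'gethostbyaddr', $forward = 'gethostbynamel') {
    // Reject anything that is not a syntactically valid IP address.
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return false;
    }
    // gethostbyaddr() returns the unmodified IP when the lookup fails,
    // which will not match the suffix check below.
    $hostname = call_user_func($reverse, $ip);
    if (!is_googlebot_hostname($hostname)) {
        return false;
    }
    // The claimed hostname must resolve back to the visitor's IP.
    $addresses = call_user_func($forward, $hostname);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In a page you would call is_verified_googlebot($_SERVER['REMOTE_ADDR']). Note that gethostbynamel() only returns IPv4 addresses, so this sketch does not cover IPv6 visitors, and DNS lookups are slow enough that caching the result per IP is worth considering.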
 
LVL 4

Expert Comment

by:John-Bayles
ID: 16681687
It's simple, and does not involve the IP address!
If it were to involve the IP address, what would happen if the site was indexed by MSN or Yahoo?

It would be done using PHP and cookies!

             You click the link in Google
                          |
                    Go to website
                          |
        Check for cookie saying user is active
              |                       |
         User active           User not active
              |                       |
              |              Show question only
              |                       |
              |       User enters username and password
               \                     /
                \                   /
              Show question and answers

Well, anyway, this is the kind of structure I'd use. When the user clicks the link in Google, the site checks whether the user is active. If they are, it shows the question and answers. If not, it shows only the question, and when the user logs in, it shows the question and answers and sets the cookie to say the user is active!
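
The flow above could be sketched roughly like this, assuming the "active" flag lives in a PHP session under a 'username' key (the key name, the function name sections_to_show() and the section labels are all made up for illustration, not taken from any real site):

```php
<?php
// Hypothetical helper: decide which parts of the page to render,
// given the session data of the current visitor.
function sections_to_show(array $session) {
    // A visitor counts as active once a non-empty username is stored.
    $active = isset($session['username']) && $session['username'] !== '';

    return $active
        ? array('question', 'answers')      // logged in: show everything
        : array('question', 'login_form');  // not logged in: question only
}
```

On a real page you would call session_start() first and then pass $_SESSION to sections_to_show(), rendering only the sections it returns.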
 
LVL 4

Expert Comment

by:John-Bayles
ID: 16681693
Also: because Google's spiders are not active users when they scan the page, they cannot see the page's answers!
 
LVL 21

Expert Comment

by:Julian Matz
ID: 16681804
Hi John-Bayles,

<< because when googles spiders scan the page they are not active users they cannot see the page answers!
That was the author's question... How to let Google see and index the site properly.

It can be done by checking the hostname and automatically setting the user (Googlebot) active. Obviously, Googlebot cannot literally log in, so you can write some PHP: check the hostname and, if it is Google, automatically create an active session and keep it live for a certain period of time:

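$IP = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';
// Reverse DNS lookup; gethostbyaddr() returns the unmodified IP if the lookup fails
$hostname = ($IP !== '') ? gethostbyaddr($IP) : '';

// Start the session up front so the check further down also sees logged-in users
session_start();

// I'm not an expert with regular expressions, am just trying to show an example...
// Match hostnames ending in .googlebot.com (preg_match() replaces eregi(), which was removed in PHP 7)
if (is_string($hostname) && preg_match('/\.googlebot\.com$/i', $hostname)) {
 $_SESSION['username'] = 'Googlebot';
}

if (!isset($_SESSION['username'])) {
 // this redirects to login.php, but you can place here whatever code should be executed if there is no active session
 header("Location: login.php");
 exit;
}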
 
LVL 21

Accepted Solution

by:Julian Matz
Julian Matz earned 1500 total points
ID: 16681831
Another method would be to use the PHP get_browser() function
http://ie.php.net/manual/en/function.get-browser.php

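$useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Pass true as the second argument to get an array instead of an object
$browser_info = get_browser($useragent, true);

// 'crawler' is a true/false flag; the name of the browser or bot is in 'browser'
$Crawler = !empty($browser_info['crawler']);
$Name = isset($browser_info['browser']) ? $browser_info['browser'] : '';

^^
$Name would be something like 'Googlebot' for Google, with $Crawler set to true for known crawlers (the exact names depend on your browscap.ini).

This function depends on the freely available browscap.ini, which has to be configured through the browscap setting in php.ini...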
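
If browscap.ini is not available, a plain substring test on the user agent gives a rough equivalent. This is only a sketch: looks_like_googlebot() is a made-up name, and since any client can send whatever User-Agent header it likes, the result is a hint rather than proof that the visitor is Google.

```php
<?php
// Hypothetical helper: true if the user agent string mentions "Googlebot".
// The header is trivially spoofable, so treat this as a cheap pre-filter only.
function looks_like_googlebot($useragent) {
    return is_string($useragent) && stripos($useragent, 'Googlebot') !== false;
}
```

You would call it as looks_like_googlebot(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : ''), and only run a slower check such as a hostname lookup when it returns true.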
 

Author Comment

by:danielparkerNZ
ID: 16684643
It seems to me that it is likely EE uses both a reverse IP lookup and a check of the user agent.

After doing a lot of research on this, it seems to be cloaking, which is against Google's Terms of Service.

If it is cloaking, how is EE not breaking Google's TOS, and not getting banned from the index?
 
LVL 21

Expert Comment

by:Julian Matz
ID: 16686819
How have you come to this conclusion?

Have you ever looked at Google's cache of the site? The cache looks pretty much like the site would if I wasn't logged in. The answers are there, but you have to do a lot of scrolling to see them... And of course, there's the Intellitext ads...
 
LVL 21

Expert Comment

by:Julian Matz
ID: 16686828
Ok, not all solutions have the answers without being logged in, but a lot of them do...
 

Author Comment

by:danielparkerNZ
ID: 16687682
Hmm.. I wasn't aware of that. Well I guess they are not Cloaking then. I hadn't checked the cache.

