• Status: Solved

Problems with my ASP.NET website and something Google is doing

A little background:

We used to run our site using a cookie mechanism. When a user logged in, we stored a reference for that session in a cookie, and our profile/security system ran off this reference. Our client complained that they were getting emails from users who said they were unable to log in. In fact they could log in, but were asked to log in again over and over because the cookie could not be read. We tried to explain this to the client and proposed a resolution, but it was not accepted. We were told to find a workaround.

The workaround was to pass the session reference in the URL for each page and validate it as and when required. Obviously, this is a security risk because of session hijacking, so we put steps in place to hopefully counter this:

Store the IP in the session details in our DB.
Continue to store the cookie, but not make our sessions reliant on it.

When a request needed to be validated, we would check the following:

Do the IPs match?
Do the cookies match?

In some scenarios, one of these tests could fail:

If the user does not allow cookies, the IPs would in theory still match.
If the user uses a proxy system that uses multiple IPs, the cookie would still match.

However, if both tests fail, we class the session as hijacked and remove it from our system, hopefully removing any risk.
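The two checks described above can be sketched as follows. This is only an illustration in Python, not the site's actual ASP.NET code: the `StoredSession` class, the `check_session` function, the in-memory `sessions` dictionary standing in for the DB, and the example IP addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredSession:
    ip: str          # client IP recorded when the user logged in
    cookie_ref: str  # session reference that was also written to the cookie


def check_session(sessions: Dict[str, StoredSession], session_id: str,
                  request_ip: str, request_cookie: Optional[str]) -> bool:
    """Return True if the request may use the session, False otherwise.

    A request passes if either the IP or the cookie matches. If both
    checks fail, the session is treated as hijacked and removed.
    """
    stored = sessions.get(session_id)
    if stored is None:
        return False  # unknown or already removed session

    ip_matches = request_ip == stored.ip
    cookie_matches = request_cookie is not None and request_cookie == stored.cookie_ref

    if ip_matches or cookie_matches:
        return True

    # Both checks failed: class the session as hijacked and remove it.
    del sessions[session_id]
    return False
```

Note that with this design a single request that fails both checks ends the session for the legitimate user as well, since the stored session is deleted rather than the one request being refused.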

Now, this does not cover all scenarios, but it seemed a good start. If a user refused cookies and used a multiple-IP proxy system (I think AOL does), they would not be able to log in. I would try to find a resolution to this as and when it arises.

However, we received an email today from a user stating that the system kept bouncing her to the login page, and from our logs I could see this was because both checks were failing. When I went in to investigate, I noticed that each login request was followed within a few seconds by a request from 'googlebot', and this was the cause of the session being killed. An example from the log file is below (not the full log information, but enough):

User logs in:

2011-02-12 14:14:20 10.0.95.1 GET /login.aspx - 80 - 87.*.*.*
2011-02-12 14:14:42 10.0.95.1 POST /login.aspx - 80 - 87.*.*.*

Gets redirected to the main page with the session reference attached and the cookie stored:

2011-02-12 14:14:42 10.0.95.1 GET /default.aspx id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 87.*.*.*

3 seconds later, a request comes in from another IP (resolved to crawl-66-249-71-237.googlebot.com):

2011-02-12 14:14:45 10.0.95.1 GET /default.aspx id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 66.249.71.237 Mediapartners-Google - - 302 0 0 374 187

The session is killed here because neither the IP nor the cookie matches.
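To find other sessions affected in the same way, the log could be scanned for session references that appear from more than one client IP. The sketch below is a rough illustration only. It assumes the abbreviated field order seen in the excerpts above (date, time, server IP, method, URI stem, query string, port, username, client IP, then optional extra fields), and the function names are hypothetical. The user-agent field of the second request reads Mediapartners-Google, which Google documents as its AdSense crawler rather than its search crawler, so that field may be worth logging too.

```python
from collections import defaultdict

# The two /default.aspx requests quoted in the log excerpt above.
LOG_LINES = [
    "2011-02-12 14:14:42 10.0.95.1 GET /default.aspx "
    "id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 87.*.*.*",
    "2011-02-12 14:14:45 10.0.95.1 GET /default.aspx "
    "id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 66.249.71.237 "
    "Mediapartners-Google - - 302 0 0 374 187",
]


def session_id_from_query(query):
    """Extract the value of the 'id' parameter, or None if it is absent."""
    for part in query.split("&"):
        key, _, value = part.partition("=")
        if key == "id":
            return value
    return None


def shared_sessions(lines):
    """Map each session id seen from more than one client IP to those IPs."""
    seen = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # not enough fields for the assumed layout
        query, client_ip = fields[5], fields[8]
        session_id = session_id_from_query(query)
        if session_id:
            seen[session_id].add(client_ip)
    return {sid: ips for sid, ips in seen.items() if len(ips) > 1}
```

Calling `shared_sessions(LOG_LINES)` on the two lines above reports the one session reference together with both client IPs.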

This was repeated 5 times before we got the email from the user. I am not really sure what I can do about this, and I am more curious about why Google is doing this and whether they are allowed to.

Asked by: officedog

BurnieP commented:
Another idea could be to store the SessionID from the Session object in the database instead of the IP address. The Session.SessionID is the same for the duration of the client's session on the website.

About Google: I believe they are crawling your website for their search engine. They do this to every website, looking for keywords, so that your website can appear better in the search results.

 
madgino commented:
The client probably has the Google Toolbar installed with some specific settings, and this generates the requests.
As long as the client accepted the terms and conditions when installing the toolbar, I can't see any problem with Google doing this.

As far as I can see, you are really in an impossible situation. All you can do is test and advise the client on how to configure the browser/toolbar.
 
officedog (author) commented:
Your comment regarding the SessionID from the ASP.NET session is a good one. This could be a further check.

I understand how standard search engine spidering works, but it seems Google is using the hijacked session URL to revisit the site. Of course they are probably not doing anything malicious, but it does raise the question of why, and what it is they are actually doing.
 
BurnieP commented:
Hi,

I found this Wikipedia article about Googlebot: http://en.wikipedia.org/wiki/Googlebot

You can find more information by reading it or by googling "googlebot". I would not be too worried about it, since they are not malicious and are just trying to get your website up in the search rankings.