A little background:
We used to run our site using a cookie mechanism. When a user logged in, it would store a reference for that session in a cookie, and our profile/security system would run off this reference. Our client complained that they were getting emails from users stating they were unable to log in. Well, they could log in, but were asked to log in again over and over, basically because the cookie could not be read. We tried to explain this to the client and proposed a resolution, but it was not accepted. We were told to find a workaround.
The workaround was to pass the session reference in the URL for each page. We would then validate this as and when required. Obviously, this is a security risk because of session hijacking, so we put various steps in place to hopefully counter this. These were:
Store the IP in the session details in our DB.
Continue to store the cookie, but not make our sessions reliant on it.
When a request needed to be validated, we would check the following:
Do the IPs match?
Do the cookies match?
In some scenarios, one of these tests could fail:
If the user does not allow cookies, the IPs would in theory still match.
If the user uses a proxy system that uses multiple IPs, the cookie would still match.
However, if both tests fail we would then class the session as hijacked and remove it from our system, therefore hopefully removing any risks.
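To make the validation rule concrete, here is a rough sketch of it in Python. This is not our actual code: the names `StoredSession`, `sessions` and `validate_request` are made up for illustration, and an in-memory dict stands in for our DB.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredSession:
    ip: str            # IP recorded when the user logged in
    cookie_token: str  # value that was also written to the user's cookie


# Hypothetical stand-in for the session table in the DB,
# keyed by the session reference that is passed in the URL.
sessions: Dict[str, StoredSession] = {}


def validate_request(session_id: str, request_ip: str,
                     request_cookie: Optional[str]) -> bool:
    """Return True if the request may use the session.

    One failed check is tolerated (cookies disabled, or a proxy that
    rotates IPs). If both checks fail, the session is treated as
    hijacked and removed.
    """
    session = sessions.get(session_id)
    if session is None:
        return False  # unknown or already removed session

    ip_matches = request_ip == session.ip
    cookie_matches = request_cookie == session.cookie_token

    if ip_matches or cookie_matches:
        return True

    # Neither the IP nor the cookie matches: kill the session.
    del sessions[session_id]
    return False
```

Note the side effect of the last branch: once a request with a different IP and no cookie arrives for a session reference, the session is gone, so the genuine user's next request fails as well.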
Now, this does not cover all scenarios, but it seemed a good start. If a user refused cookies and used a multiple-IP proxy system (I think AOL does), they would not be able to log in. I would try to find a resolution to this as and when it arises.
However, we received an email today from a user stating that the system kept bouncing her to the login page, and from our logs I could see this was because both checks were failing. When I went in to investigate, I noticed that each login request was followed within a few seconds by a request from 'googlebot', and this was what caused the session to be killed. An example from the log file is below (not the full log information, but enough):
User logs in:
2011-02-12 14:14:20 10.0.95.1 GET /login.aspx - 80 - 87.*.*.*
2011-02-12 14:14:42 10.0.95.1 POST /login.aspx - 80 - 87.*.*.*
She gets redirected to the main page with the session reference attached and the cookie stored:
2011-02-12 14:14:42 10.0.95.1 GET /default.aspx id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 87.*.*.*
A few seconds later, a request for the same URL comes in from another IP (resolved to crawl-66-249-71-237.googlebot.com):
2011-02-12 14:14:45 10.0.95.1 GET /default.aspx id=1c55c55ee1b34ad6ae43c59ad7c2802e 80 - 18.104.22.168 Mediapartners-Google - - 302 0 0 374 187
The session is killed here because neither the IP nor the cookie matches.
This was repeated 5 times before we got the email from the user. I'm not really sure what I can do about this, and I'm more curious about why Google is doing this and whether they are allowed to.