rocketTendon
File not found (weird string in URL request) from Googlebot
Just went live with our new website...
We're getting a bunch of "File not found" entries in our server logs from IP: 66.249.71.133 (crawl-66-249-71-133.googlebot.com).
EXAMPLE:
http://www.oursite.com/(f(jjjltgyzhg09ovqh8cyo2l_ztvta_he0oemrrzjk3ny21s4gz1czijojfpcsp6jamgupdk2vcikodwsza8fwputzia4prcpez7hrx5sya8xiwth_tfolwsx3435x45pnbtiov1xmerepxrukoket9ndiwrt09hzoogpyrzpt9phyomwkngj4fj8dd5wfg9tiir9l6notm-gczyfi2m0by2ndhnq1))/nikon_m59.aspx
Initially, I thought maybe IIS 7 was sticking a Session ID into the URL... but the web.config file is configured to NOT put the Session ID in the URL.
<sessionState cookieless="false" mode="InProc" timeout="30" />
Coupled with this... I am unable to replicate this behavior in ANY browser (regardless of whether cookies are turned on or off).
We have Google Analytics installed.
Does Googlebot (in conjunction with Google Analytics or otherwise) inject something into the URL for tracking?
Why is Googlebot making requests to our server with a massive string injected into the URL?
Thanks in advance!
Mike
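A possibly relevant detail: in ASP.NET the `cookieless` attribute is set per feature, so the `sessionState` line above covers session IDs only. As far as I know, forms authentication (`<forms>`) and `<anonymousIdentification>` each carry their own `cookieless` attribute, and the forms default (`UseDeviceProfile`) can fall back to URL tokens for user agents the framework believes lack cookie support. The sketch below is a minimal Python check under those assumptions, not anything taken from the asker's site: the `WEB_CONFIG` fragment is hypothetical apart from the `sessionState` line, and `cookieless_settings` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical web.config fragment. Only the sessionState line comes from
# the post; the authentication section is an assumption for illustration.
WEB_CONFIG = """
<configuration>
  <system.web>
    <sessionState cookieless="false" mode="InProc" timeout="30" />
    <authentication mode="Forms">
      <forms loginUrl="login.aspx" />
    </authentication>
  </system.web>
</configuration>
"""

# Elements under <system.web> that have their own cookieless attribute.
ELEMENTS = ("sessionState", "authentication/forms", "anonymousIdentification")


def cookieless_settings(config_xml):
    """Report the cookieless attribute of each relevant element."""
    root = ET.fromstring(config_xml)
    system_web = next(root.iter("system.web"), None)
    report = {}
    for path in ELEMENTS:
        node = system_web.find(path) if system_web is not None else None
        if node is None:
            report[path] = "element absent"
        else:
            report[path] = node.get("cookieless", "not set (framework default applies)")
    return report


for path, value in cookieless_settings(WEB_CONFIG).items():
    print(f"{path}: {value}")
```

If `authentication/forms` shows up as "not set", that would be the place to look next; explicitly setting `cookieless="UseCookies"` there is the usual way to rule it out.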
ASKER
GaryC123 ...
Thanks for the link... but after using Redleg's tool... I still don't see anything even remotely close to what I'm seeing in the log files. There is no session ID being appended or injected into the link HREFs (in the Redleg report)... the link HREFs look as they should.
bigdogdman...
In my opinion, the strings being injected are way too uniform to be SQL injection... coupled with the fact that they're being appended immediately after the domain name and not after a URL variable.
Note:
The only consistent pattern in the "long string" is that ALL the log entries in question start with: (f( .... followed by what looks like a session-ID-style string.
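The `(f(` prefix described above matches the shape ASP.NET uses for cookieless tokens embedded in the URL path, `(X(value))`, where, to my knowledge, `S` marks a session ID, `A` an anonymous ID and `F` a forms authentication ticket. Below is a minimal Python sketch for pulling such tokens out of logged paths. The sample path uses a shortened, made-up token, and `cookieless_tokens` and `strip_tokens` are hypothetical helper names.

```python
import re

# Prefix letters as I understand the ASP.NET cookieless URL format (assumption).
TOKEN_KINDS = {
    "s": "session ID",
    "a": "anonymous identification",
    "f": "forms authentication ticket",
}

# A path segment such as /(f(abc123))/ or /(s(abc)f(def))/
SEGMENT = re.compile(r"/\(((?:[a-z]\([^()]*\))+)\)(?=/)", re.IGNORECASE)
PAIR = re.compile(r"([a-z])\(([^()]*)\)", re.IGNORECASE)


def cookieless_tokens(path):
    """Return (kind, value) pairs found in a cookieless-style path segment."""
    match = SEGMENT.search(path)
    if not match:
        return []
    return [
        (TOKEN_KINDS.get(letter.lower(), "unknown"), value)
        for letter, value in PAIR.findall(match.group(1))
    ]


def strip_tokens(path):
    """Return the path with the cookieless segment removed."""
    return SEGMENT.sub("", path, count=1)


sample = "/(f(abc123))/nikon_m59.aspx"  # shortened, made-up token
print(cookieless_tokens(sample))
print(strip_tokens(sample))
```

Running this over the log would show whether every odd request carries the same prefix letter, which narrows down which ASP.NET feature is producing the token.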
ASKER
I would post our URL ... but I would like to be able to go back and remove the URL after the issue has been resolved... but it does not appear that I would be able to do that.
Indeed, graham. :-)
I never claimed it *was* SQL injection, as the string contains no SQL to speak of. I only said it *appeared similar*. I'm not up to speed on all the lingo of webserver attacks, but it sounds like Graham is. :-)
...again, not that it *is* an attack; either way, nothing to sweat about. :-)
ASKER
That makes sense, grahamnonweiler... although the log entries in question started appearing within 10 minutes of going live with the site... so a link found elsewhere on someone else's website seems unlikely (since the brand new script names are referenced in the request). That being said, it wouldn't surprise me if there was some bot out there that had a very "quick to market" impact.
Will Googlebot remove it from its spider cache if the response code being returned is 500?
The 500 error (Internal Server Error) will not remove it from the spider cache. It needs to be either a 404 (best) or a 410 (permanently gone).
Do you have any automated feeds for your site, such as Twitter/FB/Amazon? These could also cause a similar situation. However, the signature (what you are referring to as looking like a session_id) would be very different.
ASKER
Using a couple of entries from the log-file (along with Redleg's File Viewer)... I've validated that our server is returning a 404 response code.
Concern aborted.
Thanks guys!
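For anyone wanting to repeat the response-code check without a third-party viewer, here is a minimal Python sketch using only the standard library. `status_of` is a hypothetical helper name, and sending a Googlebot-style User-Agent is my own assumption about what makes the check representative, since the server may treat crawlers differently from browsers. Note that `urlopen` follows redirects, so a redirecting URL reports the status of the final page.

```python
import urllib.error
import urllib.request

# User-Agent string Googlebot commonly identifies itself with.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def status_of(url, user_agent=GOOGLEBOT_UA):
    """Return the HTTP status code the server sends for url."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return err.code


# Usage (paste a URL copied from the log file):
# print(status_of("http://www.oursite.com/(f(...))/nikon_m59.aspx"))
```

A 404 here means the crawler will eventually drop the URL; a 500 would be worth fixing first.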
My dad used to own an ISP, and every day his server would be attacked by IPs all over the globe; he witnessed a variety of different methods, but the most common was SQL/query string injection, and it often looked very similar to that.