Solved

Need urgent help... GoogleBot and Database sessions, my god they hitting me hard!

Posted on 2008-06-15
Last Modified: 2013-12-09
Hey there guys,

Man, I have some serious problems here... My site writes a session record to my database when a user enters the site. This is used for any number of things: cart items, user settings, etc.

As of about 6/15/2008 10:24:37 AM, Google (GoogleBot, I am assuming) began hitting my website from IP 66.249.65.170... They seem to be hitting each linked page of my site; currently it is at 292 sessions and counting... Every page it hits creates a new session in my database, and it is filling up fast!

Does anyone know how I can prevent this? I want them to crawl my website, but this is out of control... Is there a way to ensure that GoogleBot holds a single session?

Any quick help would be great. I am close to just shutting the site down until it stops hitting me...

Nugs
Question by:Nugs
12 Comments
 
LVL 4

Expert Comment

by:redcelltech
 
LVL 2

Author Comment

by:Nugs
Well, I want them to crawl my site; I would just like them to do it with a single session ID...

Nugs
 
LVL 4

Accepted Solution

by:
redcelltech earned 400 total points
You could check the browser type before building your session or, more importantly, find code that checks for crawlers or robots before building the session, and then assign them a static one.
 
LVL 2

Author Comment

by:Nugs
That is a good idea... Let me try that and I will get back to you!

Nugs
 
LVL 29

Assisted Solution

by:fibo
fibo earned 100 total points
Build the session ID from the IP address?
 
LVL 2

Author Comment

by:Nugs
Well, I took your advice... Please do let me know if this can be improved in any way... It seems to catch GoogleBot successfully...

I now use Session["SessID"] rather than Session.SessionID...

In addition to this, I check that JavaScript and cookies are enabled, and redirect if not... Missing cookie support was also creating some duplicate session records.

I STILL get some users that create duplicate session records, and for the life of me I cannot figure out why or how... I am even catching mobile devices and redirecting those, as some of them can't hold cookies either.

So the attached code should create a session for Google or other bots and give them a single session for an hour...

Here is an example of a recorded IP that is creating duplicate session records...

96.248.253.128         UNITED STATES
VERIZON INTERNET SERVICES INC       VERIZON.NET

I suspect these people are on mobile devices... I am going to do a serious overhaul of my Session_Start methods to check a number of different factors (IP, timestamp, etc.) and see if I can do the same for all users as I am doing with the bots, creating static sessions for these users... I dunno... kinda at a loss here...

Nugs
        // CHECK IF SEARCH BOT/SPIDER
        bool IsBot = false;
        string BotName = "";

        if (HttpContext.Current.Request.Browser.Crawler)
        {
            IsBot = true;
            BotName = "BotSpider";
        }
        else
        {
            // UserAgent is null when the request carries no User-Agent header,
            // so guard against that before lower-casing it.
            string userAgent = (HttpContext.Current.Request.UserAgent ?? "").ToLowerInvariant();
            string[] botKeywords = new string[10] { "bot", "spider", "google", "yahoo", "search", "crawl", "slurp", "msn", "teoma", "ask.com" };

            // No break here: if several keywords match, the last one names the bot.
            foreach (string bot in botKeywords)
            {
                if (userAgent.Contains(bot))
                {
                    IsBot = true;
                    BotName = bot;
                }
            }
        }

        if (IsBot)
        {
            // Bots share one ID per bot name per hour.
            Session["SessID"] = BotName + DateTime.Now.Day + DateTime.Now.Month + DateTime.Now.Year + "-" + DateTime.Now.Hour;
        }
        else
        {
            // Regular visitors keep the browser session ID.
            if (Session["SessID"] == null)
            {
                Session["SessID"] = Session.SessionID;
            }
        }
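
To see why the bot branch holds a single session for an hour, here is a minimal standalone sketch of the same key-building idea, pulled out of ASP.NET so it can run on its own. `HourlyBucketDemo` and `BotSessionKey` are hypothetical names, not from the thread, and the zero-padded date format is an assumption chosen for readability; the snippet in the thread concatenates Day, Month and Year directly.

```csharp
using System;
using System.Globalization;

public static class HourlyBucketDemo
{
    // Hypothetical helper: builds one key per bot name per clock hour.
    // Every request in the same hour yields the same key, so only one
    // session record would be written for that bot during that hour.
    public static string BotSessionKey(string botName, DateTime now)
    {
        return botName + now.ToString("yyyyMMdd-HH", CultureInfo.InvariantCulture);
    }
}
```

Two hits at 10:24 and 10:59 on the same day produce the same key, while a hit at 11:00 starts a new one, so the number of bot records grows with hours elapsed rather than with pages crawled.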


 
LVL 2

Author Closing Comment

by:Nugs
I have changed the entire way I build the session ID for these records... Instead of using the browser session ID, I opted to build static session IDs based on the IP, the date and a few other variables... This should allow a single user to hold one session for the day in my DB... Thanks
 
LVL 29

Expert Comment

by:fibo
I would worry that mobile devices share a limited number of IP addresses, so giving them a session ID based on IP might create a problem. If possible, I would limit IP-based session IDs to robots.
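
As a rough illustration of that suggestion, the sketch below uses an IP-based static ID only when the visitor has been classified as a robot and otherwise falls back to the ID the framework issued. `SessionKeyPolicy`, `ChooseKey` and the `"bot-"` prefix are hypothetical and not from the thread; how `isRobot` gets decided is left to whatever detection is already in place.

```csharp
using System;
using System.Globalization;

public static class SessionKeyPolicy
{
    // Hypothetical helper: robots get a static key derived from IP and date;
    // everyone else keeps the session ID issued by the framework.
    public static string ChooseKey(bool isRobot, string ip, string frameworkSessionId, DateTime today)
    {
        if (isRobot)
        {
            return "bot-" + ip + "-" + today.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
        }
        return frameworkSessionId;
    }
}
```

With this split, people who share an IP still get separate records, and only crawler traffic is collapsed onto one key per IP per day.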
 
LVL 2

Author Comment

by:Nugs
I am trying to redirect mobile devices altogether, as they would most likely not be able to view my website properly anyway. But as I mentioned, there are some users (possibly uncaught mobile devices) that create multiple sessions on the same IP...

Also, the static session IDs I create from the IP address include the month, day, year and other variables, so that a later visit from the same IP creates a new session.

And if a user, mobile device or not, does hit the site via another IP, a new session should and will be created anyway.

Is there something I am missing that might keep this from working correctly? I would love to use the browser session ID, but it seems that some users just don't hold this value, even though I insist that cookies and sessions be enabled...

Nugs
 
LVL 2

Author Comment

by:Nugs
You know, now that I think about it, this might not work at all... I have not taken networks into consideration. Multiple users on the same network (maybe at work) would all show up with the same IP address...

:(
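
A small standalone sketch of that collision, assuming a static ID built only from IP and calendar date. `SharedIpDemo`, `IpDateKey` and `CountRecords` are hypothetical names, and 203.0.113.7 is a documentation-range address used purely as a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SharedIpDemo
{
    // Hypothetical static key: IP plus calendar date, nothing user-specific.
    public static string IpDateKey(string ip, DateTime day)
    {
        return ip + "-" + day.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    // Counts how many distinct session records a set of keys would create.
    public static int CountRecords(IEnumerable<string> keys)
    {
        return new HashSet<string>(keys).Count;
    }
}
```

Two coworkers behind the same office address on the same day get identical keys, so `CountRecords` reports one record for both of them, and they would end up sharing a cart. Keyed by their separate browser session IDs, the same two visitors produce two records.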
 
LVL 2

Author Comment

by:Nugs
fibo: Thanks for the heads up... I have been looking so hard for a solution to all the duplicate session records that I had not considered all the scenarios... You are right, it would be unsafe to give every user a static session ID... I cannot simply look at the IP and conclude that it is a single user; it may very well be a network... In fact (although unlikely, because the sessions are seconds apart), the duplicate session records I am seeing may very well be two separate computers on the same network. Either way, there is really no way for me to tell...

Attached is my revised "session creating" snippet... Hopefully this will catch every, or at least most, duplicating sessions... and the ones it does not catch, well, I don't see how I can catch those...

It should catch mobile devices, bots and spiders and set a static session for them, and then leave everything else to Session.SessionID

Nugs

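        // CHECK IF MOBILE DEVICE OR SEARCH BOT/SPIDER
        //--------------------------
        bool IsUncomonUser = false;
        string UncomonName = "";

        if (HttpContext.Current.Request.Browser.IsMobileDevice)
        {
            IsUncomonUser = true;
            UncomonName = "mobile";
        }

        // Note: the else below pairs with this Crawler check only, so a crawler
        // or keyword match overrides the "mobile" name set above.
        if (HttpContext.Current.Request.Browser.Crawler)
        {
            IsUncomonUser = true;
            UncomonName = "botspider";
        }
        else
        {
            // UserAgent is null when the request carries no User-Agent header,
            // so guard against that before lower-casing it.
            string userAgent = (HttpContext.Current.Request.UserAgent ?? "").ToLowerInvariant();
            string[] botKeywords = new string[10] { "bot", "spider", "google", "yahoo", "search", "crawl", "slurp", "msn", "teoma", "ask.com" };

            // No break here: if several keywords match, the last one wins.
            foreach (string bot in botKeywords)
            {
                if (userAgent.Contains(bot))
                {
                    IsUncomonUser = true;
                    UncomonName = bot;
                }
            }
        }

        if (IsUncomonUser)
        {
            // SessionIP, SessionDate and SessionBrowser are not defined in this
            // snippet; they are set elsewhere in code not shown in the post.
            Session["SessID"] = UncomonName + SessionIP + SessionDate + SessionBrowser;
        }
        else
        {
            Session["SessID"] = HttpContext.Current.Session.SessionID;
        }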

 
LVL 29

Expert Comment

by:fibo
Seems fine now!