Solved

Need urgent help... GoogleBot and database sessions, my god they are hitting me hard!

Posted on 2008-06-15
Last Modified: 2013-12-09
Hey there guys,

Man, I have some serious problems here... My site sets a session record in my database when a user enters my site. This is used for any number of things: cart items, user settings, etc.

At about 6/15/2008 10:24:37 AM, Google (GoogleBot, I am assuming) began hitting my website from the IP 66.249.65.170. It seems to be hitting each linked page of my site; currently it is at 292 sessions and counting. For each page it hits, it creates a new session in my database, and it is filling up fast!

Does anyone know how I can prevent this? I want them to crawl my website, but this is out of control. Is there a way to ensure that GoogleBot holds a single session?

Any quick help would be great. I am close to just shutting the site down until it stops hitting me...

Nugs
Question by:Nugs
12 Comments
 
LVL 2

Author Comment

by:Nugs
ID: 21790128
Well, I want them to crawl my site; I would just like them to do it with a single session ID...

Nugs
 
LVL 4

Accepted Solution

by:
redcelltech earned 1600 total points
ID: 21790155
You could check the browser type before building your session or, more importantly, find code that checks for crawlers or robots before building the session, and then assign them a static one.
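
A minimal sketch of the kind of check suggested here, assuming the User-Agent header is the only signal available. The class name, method name, and keyword list are hypothetical and chosen for illustration; they are not from the thread. Narrow tokens are used on purpose, since broad keywords such as "google" or "msn" may also match ordinary browsers whose User-Agent carries toolbar or portal tokens.

```csharp
using System;

public static class CrawlerCheck
{
    // Hypothetical keyword list for illustration; tune it against your own traffic logs.
    private static readonly string[] BotKeywords =
        { "googlebot", "msnbot", "slurp", "teoma", "spider", "crawler" };

    // Returns true when the User-Agent contains one of the known crawler tokens.
    // A missing or empty User-Agent is reported as "not a crawler"; the caller
    // may decide to treat such requests differently.
    public static bool LooksLikeCrawler(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return false;
        }

        foreach (string keyword in BotKeywords)
        {
            // Case-insensitive substring match, without allocating a lowered copy.
            if (userAgent.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

In an ASP.NET page this could be called with Request.UserAgent before any session record is written, alongside the built-in Request.Browser.Crawler flag.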
 
LVL 2

Author Comment

by:Nugs
ID: 21790177
That is a good idea... Let me try that and I will get back to you!

Nugs
 
LVL 29

Assisted Solution

by:fibo
fibo earned 400 total points
ID: 21791461
Build the session ID from the IP address?
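
One way this idea might look, as a sketch only: a deterministic key built from a label, the client IP, and the calendar date, so that repeated requests from the same address on the same day map to one record. The class name, method name, and key layout are assumptions made for illustration. Zero-padded date parts keep the key unambiguous (unpadded day and month values concatenated can collide, for example 1 and 11 versus 11 and 1).

```csharp
using System;
using System.Globalization;

public static class StaticSessionKey
{
    // Builds a key such as "bot-66.249.65.170-20080615".
    // The "yyyyMMdd" format is zero-padded, so two different dates
    // can never produce the same digit string.
    public static string Build(string label, string ipAddress, DateTime when)
    {
        string datePart = when.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
        return label + "-" + ipAddress + "-" + datePart;
    }
}
```

Calling Build("bot", "66.249.65.170", DateTime.Now) at any time on the same day yields the same key, so every page the crawler hits that day reuses one session record.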
 
LVL 2

Author Comment

by:Nugs
ID: 21806033
Well, I took your advice... Please do let me know if this can be improved in any way. It seems to catch GoogleBot successfully.

I now use Session["SessID"] rather than Session.SessionID...

In addition to this, I check that JavaScript and cookies are enabled, and redirect if not. Missing cookie support was also creating some duplicate session records.

I STILL get some users that create duplicate session records, and for the life of me I cannot figure out why or how. I am even catching mobile devices and redirecting those, as some of them can't hold cookies either.

So the attached code should create a session for Google or other bots and give them the same session for an hour...

Here is an example of a recorded IP that is creating duplicate session records:

96.248.253.128         UNITED STATES
VERIZON INTERNET SERVICES INC       VERIZON.NET

I suspect these people are on mobile devices... I am going to do a serious overhaul of my session_start methods to check a number of different factors (IP, timestamp, etc.) and see if I can do the same for all users as I am doing with the bots, creating static sessions for these users... I dunno, kind of at a loss here...

Nugs
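        // Check whether the request comes from a search bot/spider.
        bool IsBot = false;
        string BotName = "";

        if (HttpContext.Current.Request.Browser.Crawler)
        {
            // ASP.NET's browser capabilities already flag this client as a crawler.
            IsBot = true;
            BotName = "BotSpider";
        }
        else
        {
            // UserAgent is null when the header is missing, so guard before lowercasing.
            string userAgent = (HttpContext.Current.Request.UserAgent ?? "").ToLower();
            string[] botKeywords = { "bot", "spider", "google", "yahoo", "search", "crawl", "slurp", "msn", "teoma", "ask.com" };
            foreach (string bot in botKeywords)
            {
                // The last matching keyword wins, so a more specific name
                // later in the list overrides the generic "bot".
                if (userAgent.Contains(bot))
                {
                    IsBot = true;
                    BotName = bot;
                }
            }
        }

        if (IsBot)
        {
            // One shared ID per bot name per hour. Zero-padded date parts avoid
            // collisions such as day 1 + month 11 versus day 11 + month 1.
            Session["SessID"] = BotName + DateTime.Now.ToString("yyyyMMdd-HH");
        }
        else
        {
            // Regular visitors keep the browser session ID, stored once.
            if (Session["SessID"] == null)
            {
                Session["SessID"] = Session.SessionID;
            }
        }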

 
LVL 2

Author Closing Comment

by:Nugs
ID: 31467426
I have changed the entire way I build the session ID for these records. Instead of using the browser session ID, I opted to build static session IDs based on the IP, the date, and a few other variables. This should allow a single user to hold a session for the day in my DB. Thanks.
 
LVL 29

Expert Comment

by:fibo
ID: 21807920
I would fear that mobile devices use a limited number of IP addresses, so many different users could share one address. Applying an IP-based session ID to them might therefore create a problem. I would probably, if possible, limit IP-based session IDs to robots.
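
The concern can be made concrete with a small sketch, assuming the crawler decision has already been made elsewhere. The class and method names are hypothetical. Crawlers get a static key derived from IP and date, while every other visitor keeps the per-browser session ID, so two people behind the same address never share a record.

```csharp
using System;
using System.Globalization;

public static class SessionKeyPolicy
{
    // Crawlers: one static key per IP per day.
    // Everyone else: the browser's own session ID, untouched.
    public static string Choose(bool isCrawler, string ipAddress,
                                string browserSessionId, DateTime when)
    {
        if (isCrawler)
        {
            string datePart = when.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
            return "bot-" + ipAddress + "-" + datePart;
        }
        return browserSessionId;
    }
}
```

With this split, two office colleagues behind one shared address still end up with different keys, because their browser session IDs differ, while a crawler that never returns a cookie keeps landing on the same key all day.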
 
LVL 2

Author Comment

by:Nugs
ID: 21807978
I am trying to redirect mobile devices altogether, as they would most likely not be able to view my website properly anyway. But as I mentioned, there are some users (possibly uncaught mobile devices) that create multiple sessions on the same IP...

Also, the static session IDs I create based on the IP address include the month, day, and year, plus other variables, so that a later visit from the same IP creates a new session.

And if a user, mobile device or not, does hit the site via another IP, a new session should and will be created anyway.

Is there something I am missing that might keep this from working correctly? I would love to use the browser session ID, but it seems that some users just don't hold this value even though I insist that cookies and sessions be enabled...

Nugs
 
LVL 2

Author Comment

by:Nugs
ID: 21808097
You know, now that I think about it, this might not work at all... I have not taken networks into consideration. Multiple users on the same network (maybe at work) would show up with the same IP address...

:(
 
LVL 2

Author Comment

by:Nugs
ID: 21808193
fibo: Thanks for the heads up... I have been looking so hard for a solution to all the duplicate session records being generated that I have not considered all the scenarios. You are right, it would be unsafe to give every user a static session ID. I cannot simply look at the IP and conclude that it is a single user; it may very well be a network. In fact (although unlikely, because the sessions are seconds apart), the duplicate session records I am seeing may very well be two separate computers on the same network. Either way, there is really no way for me to tell...

Attached is my revised "session creating" snippet. Hopefully this will catch all or most of the duplicating sessions, and the ones it does not catch, well, I don't see how I can catch those...

It should catch mobile devices, bots, and spiders and set a static session for them, then leave everything else to Session.SessionID.

Nugs

        // Check whether the request comes from a mobile device or a search bot/spider.
        //--------------------------
        bool IsUncomonUser = false;
        string UncomonName = "";

        if (HttpContext.Current.Request.Browser.IsMobileDevice)
        {
            IsUncomonUser = true;
            UncomonName = "mobile";
        }
        if (HttpContext.Current.Request.Browser.Crawler)
        {
            IsUncomonUser = true;
            UncomonName = "botspider";
        }
        else
        {
            // UserAgent is null when the header is missing, so guard before lowercasing.
            string userAgent = (HttpContext.Current.Request.UserAgent ?? "").ToLower();
            string[] botKeywords = { "bot", "spider", "google", "yahoo", "search", "crawl", "slurp", "msn", "teoma", "ask.com" };
            foreach (string bot in botKeywords)
            {
                // The last matching keyword wins; a match also overrides "mobile".
                if (userAgent.Contains(bot))
                {
                    IsUncomonUser = true;
                    UncomonName = bot;
                }
            }
        }

        if (IsUncomonUser)
        {
            // SessionIP, SessionDate and SessionBrowser are set elsewhere (not shown in this snippet).
            Session["SessID"] = UncomonName + SessionIP + SessionDate + SessionBrowser;
        }
        else
        {
            // Everyone else keeps the browser session ID.
            Session["SessID"] = HttpContext.Current.Session.SessionID;
        }

 
LVL 29

Expert Comment

by:fibo
ID: 21808371
Seems fine now!