Solved

Google Bots - Killing the Server

Posted on 2014-04-14
Last Modified: 2014-04-15
Hi

We do web hosting, and Googlebot crawls some of the websites on our servers so often and for so long that the servers become slow. Sometimes Googlebot makes over a million hits to a single website in a day.

We are looking for a solution that keeps Google from sending that many hits to our servers even when a customer has not configured their website correctly for Googlebot or Webmaster Tools.
Currently, in such cases, we block the Googlebot IPs in iptables and the servers recover, but then customers with well-behaved websites suffer as well.

Can someone please suggest a solution to this?

We are running CentOS 6.5 64-bit and using Nginx and Apache on our servers.
Question by:sysautomation
7 Comments
 
LVL 25

Expert Comment

by:Zephyr ICT
ID: 39998621
Have you already looked into Google Webmaster Tools? You'll have to create an account if you don't have one, though.

You can limit the Googlebot crawl rate there: https://support.google.com/webmasters/answer/48620?hl=en

Might be worth checking out?
 
LVL 52

Expert Comment

by:Scott Fell, EE MVE
ID: 39998753
That sounds odd for Google. Is there a special app you have created? Or is there one domain that has the issue? I would look for the page that is causing the problem and send a note to the domain owner telling them to fix their page or limit Google, or be turned off.

It sounds like they have a dynamic page with a lot of links, and the queries behind it take up a lot of resources.

In any case, you will probably have to get the domain owner to take care of it, or limit or shut off their service.
 

Author Comment

by:sysautomation
ID: 39998830
Yes, it is dynamic. We host Oracle APEX applications and have little control over customers, other than forcing them to act when the server is in trouble. What I am really looking for is a preventive measure.

 
LVL 52

Accepted Solution

by:
Scott Fell,  EE MVE earned 500 total points
ID: 39998849
I think that as the server owner, all you can do is determine which customer it is and send them a notice that they are using more than their allotted resources. Googlebot takes its instructions per domain, not per server. If you have control of the domain, you can use Webmaster Tools to limit the crawl rate, or you can use robots.txt to prevent Googlebot from crawling a folder. You can also put a noindex tag on a page to keep it out of Google's index, although Googlebot still has to fetch the page to see the tag, so this does not reduce crawling by itself: https://support.google.com/webmasters/answer/93708
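
As a rough illustration of the robots.txt option, here is a minimal Python sketch that uses the standard library's urllib.robotparser to check what a given rule set would allow. The robots.txt content, the /reports/ folder, and the example.com URLs are all made up for the example, and this only shows how the rules are interpreted by a standards-following parser, not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a customer might publish at the root of their domain.
# The /reports/ folder name is an assumption for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /reports/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/reports/page1", "http://example.com/index.html"):
        print(url, can_crawl(ROBOTS_TXT, "Googlebot", url))
```

The file has to be served from the customer's own domain, which is why this is something the domain owner, not the server owner, normally sets up.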

You can also set up your server-side code to prevent any one client from requesting too many pages within a certain period of time.
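
A minimal sketch of that idea in Python, assuming a simple in-memory sliding window keyed by client (for example the IP address). The class name, the limits, and the injectable clock are all hypothetical choices for the example; a real deployment would need shared storage across worker processes and would probably answer rejected requests with something like HTTP 429 or 503.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        now = self.clock()
        recent = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # over the limit: caller should reject or delay the request
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
    for i in range(5):
        print(i, limiter.allow("203.0.113.7"))
```

Each client gets its own window, so one heavy crawler being throttled does not affect requests from other visitors.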

In any case, telling Googlebot what to do is something you control per domain, not per server.
 
LVL 15

Expert Comment

by:Giovanni Heward
ID: 40000140
Bear in mind that the user-agent can easily be spoofed, so the bot may not actually belong to Google. (Look up the IP with ARIN to confirm.)
 
LVL 52

Expert Comment

by:Scott Fell, EE MVE
ID: 40000210
That is a great point!

As I said in my earlier comment (ID: 39998753), this did not sound right for Google.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 40000293
Here's what Google says about verifying their Googlebots: https://support.google.com/webmasters/answer/80553?hl=en
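
For anyone who wants to script that check, here is a rough Python sketch of the reverse-then-forward DNS approach: look up the hostname for the IP, check that it falls under a Google crawler domain, then resolve that hostname again and confirm it maps back to the same IP. The function names are made up for the example, and the accepted domain suffixes (googlebot.com and google.com) are my assumption of what to allow, so check the linked page for the current list. Note that socket.gethostbyname_ex only handles IPv4.

```python
import socket

# Assumed list of crawler hostname suffixes; confirm against Google's own page.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the hostname ends in one of the assumed Google crawler domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-resolve and compare."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no PTR record or lookup failure
    if not is_google_hostname(hostname):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The forward lookup must point back at the IP we started with.
    return ip in addresses


if __name__ == "__main__":
    # Example address only; substitute an IP taken from your own access log.
    print(is_verified_googlebot("66.249.66.1"))
```

Only IPs that fail this check are safe candidates for blocking in the firewall as impostors, since a matching user-agent string on its own proves nothing.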