Solved

Restrict bots from crawling directories

Posted on 2010-08-16
Last Modified: 2012-05-10
Google has indexed one of my directories and is showing it in search results. To keep the directory and its subdirectories out of the index, I was thinking of restricting bots from accessing the directory. How can I do that?
Question by:sahanz
6 Comments
 
LVL 5

Accepted Solution

by:
stermeau earned 1000 total points
ID: 33444283
You can use a robots.txt file.
There are some simple examples here: http://www.robotstxt.org/orig.html

But you should also modify your web server configuration to disable directory listing.
 
LVL 7

Expert Comment

by:marektech
ID: 33444325
You could also use the following:

<meta name="robots" content="noindex,nofollow">

http://www.heritage-tech.net/188/alternative-to-using-robotstxt/
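
To confirm that a page really carries this tag, a rough check can be sketched with Python's standard html.parser module. The class name RobotsMetaParser, the helper robots_directives, and the sample page are made up for illustration and are not from this thread.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html_text):
    """Return the set of robots meta directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Hypothetical page that uses the tag quoted above.
page = (
    "<html><head>"
    '<meta name="robots" content="noindex,nofollow">'
    "</head><body></body></html>"
)
directives = robots_directives(page)
print("noindex" in directives, "nofollow" in directives)  # prints: True True
```

Keep in mind that a crawler has to fetch the page to see the tag, so this method only works on pages that are not also blocked in robots.txt.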
 
LVL 7

Expert Comment

by:marektech
ID: 33444340
More information about the robots metatag:

http://www.robotstxt.org/meta.html

 
LVL 1

Author Comment

by:sahanz
ID: 33445375
If I add those lines to the index file of the directory, will it stop bots from crawling the subdirectories?
 
LVL 7

Assisted Solution

by:marektech
marektech earned 1000 total points
ID: 33445538
You can place a robots.txt file in the root of your website and list the directories that should be off limits to bots.

For example:

User-Agent: Googlebot
Disallow: /private/private.htm
Disallow: /secret/

Alternatively, with the meta tag method, the tag must be present on every page that should not be indexed.

<meta name="robots" content="noindex,nofollow">
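
The robots.txt rules above can be tried out with Python's standard urllib.robotparser module before they go live. This is only a sketch: the host www.example.com, the test paths, and the helper build_parser are placeholders and not part of the original answer. A Disallow rule matches by path prefix, so /secret/ should also cover its subdirectories.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, as they would appear in /robots.txt.
ROBOTS_TXT = """\
User-Agent: Googlebot
Disallow: /private/private.htm
Disallow: /secret/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "http://www.example.com"  # placeholder host

# Hypothetical paths: a blocked directory, a page below it,
# the single blocked file, and two pages no rule mentions.
paths = [
    "/secret/",
    "/secret/sub/page.htm",
    "/private/private.htm",
    "/private/other.htm",
    "/index.htm",
]
for path in paths:
    allowed = parser.can_fetch("Googlebot", base + path)
    print(path, "allowed" if allowed else "blocked")
```

With these rules only Googlebot is restricted. To apply them to every well-behaved bot, the first line would be User-agent: * instead.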
 
LVL 1

Author Closing Comment

by:sahanz
ID: 33455450
Thanks