Solved

How to write robots.txt to prevent google from crawling a folder in my website

Posted on 2012-04-09
2
260 Views
Last Modified: 2012-04-14
I have a domain called www.domain.com. I dont want google to crawl domain.com/test. How do i do this?
0
Comment
Question by:melwong
2 Comments
 
LVL 17

Accepted Solution

by:
Anuroopsundd earned 50 total points
ID: 37822925
The "/robots.txt" file is a text file, with one or more records. Usually contains a single record looking like this:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
In this example, three directories are excluded.

Note that you need a separate "Disallow" line for every URL prefix you want to exclude -- you cannot say "Disallow: /cgi-bin/ /tmp/" on a single line. Also, you may not have blank lines in a record, as they are used to delimit multiple records.

Note also that globbing and regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "User-agent: *bot*", "Disallow: /tmp/*" or "Disallow: *.gif".

What you want to exclude depends on your server. Everything not explicitly disallowed is considered fair game to retrieve. Here follow some examples:


When a robot looks for the "/robots.txt" file for URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place.

For example, for "http://www.example.com/shop/index.html, it will remove the "/shop/index.html", and replace it with "/robots.txt", and will end up with "http://www.example.com/robots.txt".

So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.

Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT.

http://www.robotstxt.org/robotstxt.html
0
 

Author Closing Comment

by:melwong
ID: 37847781
thx
0

Featured Post

Networking for the Cloud Era

Join Microsoft and Riverbed for a discussion and demonstration of enhancements to SteelConnect:
-One-click orchestration and cloud connectivity in Azure environments
-Tight integration of SD-WAN and WAN optimization capabilities
-Scalability and resiliency equal to a data center

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
The Best Website Creation Tools 14 73
move expression web site to a new server 13 53
PHP Installer 5 37
Show tab when enter div and change underline color 3 63
If you don’t want your company's site to fail on the web, you’d do well to observe these best web design practices and make sure you implement them when applicable.
FAQ pages provide a simple way for you to supply and for customers to find answers to the most common questions about your company. Here are six reasons why your company website should have a FAQ page
Use Wufoo, an online form creation tool, to make powerful forms. Learn how to choose which pages of your form are visible to your users based on their inputs. The page rules feature provides you with an opportunity to create if:then statements for y…
Learn how to create flexible layouts using relative units in CSS.  New relative units added in CSS3 include vw(viewports width), vh(viewports height), vmin(minimum of viewports height and width), and vmax (maximum of viewports height and width).

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question