Solved

Robots.txt -- I only want to allow access to the public root directory

Posted on 2006-10-31
464 Views
Last Modified: 2010-05-19
What is the best way to write a robots.txt file so that robots will only have access to the website's main public root directory? In this case the root is named public_html.

Thanks

Rowby
Question by:Rowby Goren
5 Comments
 
LVL 8

Expert Comment

by:radnor
ID: 17843859
 
LVL 3

Assisted Solution

by:jsev1995
jsev1995 earned 100 total points
ID: 17846049
You have to have a separate Disallow line for each folder you want to protect; there is no way to do it with a whitelist approach.

User-agent: *
Disallow: /folder/
Disallow: /folder2/

Please note that the / is relative to the root of the domain, NOT the server. Nobody but other users on your server has access to paths outside of your public_html folder, so do not ever include a server path in a link.
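
Since every folder has to be listed by hand, a small script can build that blacklist for you. Below is a minimal Python sketch (not from this thread): it walks a hypothetical web root, assumed here to be /home/youruser/public_html, and writes a Disallow line for every subdirectory except the ones you want crawled. The paths and the "allowed" set are placeholders.

import os

# Hypothetical paths -- adjust to your own server layout.
WEB_ROOT = "/home/youruser/public_html"
ALLOWED = {"alloweddirectory"}        # folders that should stay crawlable

lines = ["User-agent: *"]
for name in sorted(os.listdir(WEB_ROOT)):
    full_path = os.path.join(WEB_ROOT, name)
    # Every directory that is not explicitly allowed gets its own Disallow line.
    if os.path.isdir(full_path) and name not in ALLOWED:
        lines.append("Disallow: /%s/" % name)

# robots.txt must live at the web root so crawlers find it at /robots.txt.
with open(os.path.join(WEB_ROOT, "robots.txt"), "w") as f:
    f.write("\n".join(lines) + "\n")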
 
LVL 1

Accepted Solution

by:austerhaus
austerhaus earned 400 total points
ID: 17847884
This site has (I think) a better description of the format and parameters for the robots.txt file. Remember, as jsev1995 indicated, robots.txt works like a "blacklist" rather than a "whitelist": to keep crawlers out of specific directories you must explicitly list each one in robots.txt. There is no way to say "give access only to this directory"; instead, your rules must say "block access to every directory on this list".

http://www.robotstxt.org/wc/exclusion-admin.html

If possible, you may want to put all of your "unlisted" folders inside a parent directory and then simply block that parent directory in robots.txt. For example, if you structure the root directory like:
www.yoursite.com/
    -robots.txt
    -/alloweddirectory
         -/subdirectory1
         -/subdirectory2
         -/...
    -/blockeddirectory
         -/subdirectory1
         -/subdirectory2
         -/...

...and the contents of robots.txt looks like:
User-agent: *
Disallow: /blockeddirectory/

...this should allow search engines to access everything in "alloweddirectory" and deny access to everything in "blockeddirectory".
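
As a quick sanity check (not part of the original answer), you can feed those rules to Python 3's standard urllib.robotparser module and confirm which URLs a well-behaved crawler would be willing to fetch; the domain below is just a placeholder.

from urllib.robotparser import RobotFileParser

rules = """User-agent: *
Disallow: /blockeddirectory/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Anything under the blocked parent directory is off limits to every robot...
print(parser.can_fetch("*", "http://www.yoursite.com/blockeddirectory/subdirectory1/page.html"))  # False
# ...while the allowed directory (and the root itself) remains crawlable.
print(parser.can_fetch("*", "http://www.yoursite.com/alloweddirectory/page.html"))                # True
print(parser.can_fetch("*", "http://www.yoursite.com/"))                                          # True

can_fetch returns True or False for a given user agent and URL, which makes it easy to test a rules file before uploading it.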

 
LVL 3

Expert Comment

by:jsev1995
ID: 17848995
Depending on whether or not your site is already coded, moving the files could break all of your existing links, so that approach might not work.
 
LVL 9

Author Comment

by:Rowby Goren
ID: 17849199
Thanks

Now I know *all about robots*

Rowby
