Solved

Robots.txt -- I only want to allow access to the public root directory

Posted on 2006-10-31
5
474 Views
Last Modified: 2010-05-19
What is the best way to write a robots.txt file so that robots will only have access to the website's main public root directory? In this case the root is named public_html.

Thanks

Rowby
0
Comment
Question by:Rowby Goren
5 Comments
 
LVL 8

Expert Comment

by:radnor
ID: 17843859
0
 
LVL 3

Assisted Solution

by:jsev1995
jsev1995 earned 100 total points
ID: 17846049
You have to have a separate Disallow line for each folder you want to protect; there is no way to do it with a whitelist approach.

User-agent: *
Disallow: /folder/
Disallow: /folder2/

Please note that the / is relative to the root of the domain, NOT the server. Nobody but other users on your server can reach any path outside of your public_html folder, so DO NOT EVER INCLUDE ONE IN A LINK!
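If you want to sanity-check rules like these before deploying, you can feed them to Python's standard urllib.robotparser module. A quick sketch (the domain and paths below are placeholders, not from the question):

```python
from urllib.robotparser import RobotFileParser

# The same rules suggested above: block two specific folders.
rules = """\
User-agent: *
Disallow: /folder/
Disallow: /folder2/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Paths under a disallowed folder are blocked...
print(rp.can_fetch("*", "http://www.example.com/folder/page.html"))   # False
# ...while everything else stays crawlable.
print(rp.can_fetch("*", "http://www.example.com/index.html"))         # True
```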
0
 
LVL 1

Accepted Solution

by:austerhaus
austerhaus earned 400 total points
ID: 17847884
This site has (I think) a better description of the format and parameters for the robots.txt file. Remember, like jsev1995 indicated, robots.txt works like a "blacklist" rather than a "whitelist". Meaning that in order to disallow access to specific directories you must explicitly define each directory in robots.txt. It is not possible to say: "give access only to this directory". Instead, your rules must say: "block access to every directory in this list".

http://www.robotstxt.org/wc/exclusion-admin.html

If possible, you may want to put all of your "unlisted" folders inside a parent directory then simply block that parent directory in robots.txt. For example, if you structure the root directory like:
www.yoursite.com/
    -robots.txt
    -/alloweddirectory
         -/subdirectory1
         -/subdirectory2
         -/...
    -/blockeddirectory
         -/subdirectory1
         -/subdirectory2
         -/...

...and the contents of robots.txt looks like:
User-agent: *
Disallow: /blockeddirectory/

...this should allow search engines to access everything in "alloweddirectory" and deny access to everything in "blockeddirectory".
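You can verify that behaviour with Python's standard urllib.robotparser before restructuring anything (the domain below is a placeholder standing in for yoursite.com):

```python
from urllib.robotparser import RobotFileParser

# The single-parent-directory approach described above.
rules = """\
User-agent: *
Disallow: /blockeddirectory/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Everything under the allowed parent remains crawlable.
print(rp.can_fetch("*", "http://www.example.com/alloweddirectory/subdirectory1/page.html"))  # True
# Everything under the blocked parent is denied, subdirectories included.
print(rp.can_fetch("*", "http://www.example.com/blockeddirectory/subdirectory1/page.html"))  # False
```

The advantage of this layout is that new subdirectories added under the blocked parent are covered automatically, with no edits to robots.txt.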

0
 
LVL 3

Expert Comment

by:jsev1995
ID: 17848995
Depending on whether or not your site is already coded, moving the files will break all your existing links, so that might not work for you.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 17849199
Thanks

Now I know *all about robots*

Rowby
0
