Solved

Robots.txt -- I only want to allow access to the public root directory

Posted on 2006-10-31
5
430 Views
Last Modified: 2010-05-19
What is the best way to write a robots.txt file so that robots will only have access to the website's main public root directory? In this case the root is named public_html.

Thanks

Rowby
0
Comment
Question by:Rowby Goren
5 Comments
 
LVL 8

Expert Comment

by:radnor
ID: 17843859
0
 
LVL 3

Assisted Solution

by:jsev1995
jsev1995 earned 100 total points
ID: 17846049
You have to have a separate Disallow line for each folder you want to protect; there is no way to do it with a whitelist approach.

User-agent: *
Disallow: /folder/
Disallow: /folder2/

Please note that the / is relative to the root of the domain, NOT the server's filesystem. Nobody but other users on your server can reach any path outside of your public_html folder anyway, so DO NOT EVER INCLUDE such a path IN A LINK!
0
 
LVL 1

Accepted Solution

by:austerhaus
austerhaus earned 400 total points
ID: 17847884
This site has (I think) a better description of the format and parameters for the robots.txt file. Remember, like jsev1995 indicated, robots.txt works like a "blacklist" rather than a "whitelist". Meaning that in order to disallow access to specific directories you must explicitly define each directory in robots.txt. It is not possible to say: "give access only to this directory". Instead, your rules must say: "block access to every directory in this list".

http://www.robotstxt.org/wc/exclusion-admin.html

If possible, you may want to put all of your "unlisted" folders inside a parent directory then simply block that parent directory in robots.txt. For example, if you structure the root directory like:
www.yoursite.com/
    -robots.txt
    -/alloweddirectory
         -/subdirectory1
         -/subdirectory2
         -/...
    -/blockeddirectory
         -/subdirectory1
         -/subdirectory2
         -/...

...and the contents of robots.txt looks like:
User-agent: *
Disallow: /blockeddirectory/

...this should allow search engines to access everything in "alloweddirectory" and deny access to everything in "blockeddirectory".
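If you want to sanity-check rules like these before publishing, Python's standard library ships a robots.txt parser. A minimal sketch, using the hypothetical directory names from the example above (www.yoursite.com, alloweddirectory, blockeddirectory are placeholders, not real sites):

```python
# Sketch: verify that a robots.txt blacklist behaves as intended,
# using Python's standard-library parser.
from urllib.robotparser import RobotFileParser

# The proposed robots.txt contents from the example above.
rules = """\
User-agent: *
Disallow: /blockeddirectory/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# "*" matches any crawler that honors the wildcard User-agent group.
print(rp.can_fetch("*", "http://www.yoursite.com/alloweddirectory/page.html"))  # True
print(rp.can_fetch("*", "http://www.yoursite.com/blockeddirectory/page.html"))  # False
```

This only simulates well-behaved crawlers, of course; robots.txt is advisory, and a parser check cannot tell you what a misbehaving bot will do.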

0
 
LVL 3

Expert Comment

by:jsev1995
ID: 17848995
Depending on whether or not your site is already coded, moving the files will break all of your links. So that might not work.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 17849199
Thanks

Now I know *all about robots*

Rowby
0
