Solved

robots.txt explained

Posted on 2012-03-12
Last Modified: 2012-03-13
1) Does every website have a robots.txt file?
2) Is every website's robots.txt file publicly accessible/downloadable? If so, from where?
3) What is the point of them? For example, if you have an entry in robots.txt for /admin and the file is publicly accessible, what has it actually solved? I.e., how are you any better off hiding /admin from Google if someone can download your robots.txt and see that you have an /admin directory on the server?

I can't see the logic, or what the point of adding entries to it is.
Question by:pma111
2 Comments
 
LVL 53

Accepted Solution

by:
COBOLdinosaur earned 250 total points
ID: 37710248
No, not every site has one.

It needs to be publicly accessible so the bots can find it. The purpose of robots.txt is to tell spiders which parts of the site not to crawl and index; without direction they will index everything they find. Compliance is voluntary, so it only affects well-behaved bots. If you don't want the directory name exposed, put the directory inside a higher-level folder and disallow the higher-level folder instead.
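As a rough illustration of the higher-level folder idea, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The folder names `/private/` and `site-admin`, and the bot name, are made up for the example. The point is only that the robots.txt text names the parent folder, not the admin directory itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only the parent folder is listed, so the
# real admin directory name ("site-admin" here) never appears in it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text into a rule checker without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# A compliant crawler would skip everything under /private/ ...
print(parser.can_fetch("ExampleBot", "/private/site-admin/login.php"))  # False
# ... but is still free to crawl the rest of the site.
print(parser.can_fetch("ExampleBot", "/index.html"))  # True
```

This only models what a polite crawler would do. It does nothing to stop a person or a misbehaving bot from requesting the URL directly.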

You prevent the public from accessing the admin area, or anything else, with access controls such as .htaccess rules.

If you have something that is sensitive it should not be on the web server, because a hacker will always find a way to see it.


Cd&
 
LVL 15

Assisted Solution

by:Ess Kay
Ess Kay earned 250 total points
ID: 37710267
1> No.
2> Yes, typically at website.com/robots.txt
3> The entries are for web-crawling robots. They allow or disallow crawling of certain sections of your website, which controls what gets indexed by search engines such as Google, Yahoo, and Bing.
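For point 2, the file conventionally sits at the root of the host, whatever page you start from. A small Python sketch of how that location can be derived from any page URL (the helper name `robots_url` and the sample URL are just for illustration):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the conventional robots.txt location for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://website.com/blog/post?id=7"))
# http://website.com/robots.txt
```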


If you have certain pages that you don't want added to the search engines, such as your admin login page, you might want to list them here so that ordinary users will not see them when they search for your site.



As for true hackers, neither a robots.txt entry nor the lack of a robots.txt file will stop them.



You don't have to add all of the admin pages, only the first portal.

Once a robot stops there, it will not go any further through that page's links.
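One further reason a single entry is usually enough: Disallow rules are matched as path prefixes, so an entry for the portal path also covers everything beneath it. A minimal sketch with Python's standard `urllib.robotparser`, where the paths and the bot name are assumptions for the example:

```python
from urllib.robotparser import RobotFileParser

# One hypothetical entry for the admin portal only.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /admin"])

# The rule is a prefix match, so pages below /admin are covered too.
for path in ["/admin", "/admin/users", "/admin/settings/email", "/about"]:
    allowed = rules.can_fetch("ExampleBot", path)
    print(path, "allowed" if allowed else "disallowed")
```

Note that prefix matching also means a path such as /administrator would be caught by the same entry; writing the rule as /admin/ narrows it to the directory.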