Solved

Robot.txt

Posted on 2013-06-25
10
609 Views
Last Modified: 2013-06-25
Hi,

Can somebody please explain what this is telling google to do in the Robot.txt

User-agent: *
Disallow: /Blog/
Allow: /Blog/post

Is it valid?

Would (notice the line switch)

User-agent: *
Allow: /Blog/post
Disallow: /Blog/

be better?
0
Comment
Question by:nutnut
  • 4
  • 3
  • 2
  • +1
10 Comments
 
LVL 42

Expert Comment

by:sedgwick
ID: 39274116
user-agent * -> An entry that applies to all bots
Disallow: /Blog/ -> URLs matching /Blog/ would be disallowed for Googlebot (and everything in this directory).
Allow: /Blog/post -> URLs matching /Blog/post would be allowed for Googlebot (and everything in this directory).

the answer depends of your requirement, if u want to allow blog/post but disallow blog then use the 1st case, other wise use the 2nd one.
0
 
LVL 9

Expert Comment

by:TvMpt
ID: 39274117
Both valid but in order to be compatible to all robots it is necessary to place the Allow directive(s) first, followed by the Disallow, for example:

User-agent: *
Allow: /Blog/post
Disallow: /Blog/
0
 
LVL 42

Expert Comment

by:sedgwick
ID: 39274118
0
The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

 
LVL 9

Expert Comment

by:TvMpt
ID: 39274119
The order is only important to robots that follow the standard; in the case of the Google or Bing bots, the order is not important.
0
 
LVL 22

Expert Comment

by:Om Prakash
ID: 39274120
Here is a reference on robots.txt
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

in this case, only the URLs matching /Blog/ would be disallowed for Googlebot.

User-agent: *
Disallow: /Blog/
Allow: /Blog/post
0
 

Author Comment

by:nutnut
ID: 39274131
Hi thanks for the responses.

So I need to

Allow:
www.mysite.com
www.mysite.com/Blog/post ...and then everything below it.

Disallow:
www.mysite.com/Blog/ ..everything underneath EXCEPT www.mysite.com/Blog/post

How would I do that in a Robot.txt

Reason is that google is seeing massive duplication of my site due to tags

So,

/Blog?page=1 and /Blog?page=2 & /Blog?page=3 are seen as the same
/Blog?Tag=BlahBlah is seen /Blog?Tag=DohDoh are seen as the same.  

I just want google to read my main site www.mysite.com and then under read nothing under  www.mysite.com/Blog EXCEPT www.mysite.com/Blog/post ..and below.

Is there a better way to do this outside of Robot.txt.  All post under www.mysite.com/Blog/post so for example www.mysite.com/Blog/post/My-Blog-Post.aspx is Canonicalized fine.

Thanks
0
 
LVL 9

Expert Comment

by:TvMpt
ID: 39274224
There is no problem to use

User-agent: *
Disallow: /blog?Tag
Disallow: /blog?page
0
 

Author Comment

by:nutnut
ID: 39274237
How about

User-agent: *
Allow: /Blog/post
Allow: /Blog/post/
Disallow: /Blog
Disallow: /Blog/

Is this ok syntax wise
0
 
LVL 9

Accepted Solution

by:
TvMpt earned 500 total points
ID: 39274255
Example:
If you end with a "/" then it will specify that as the match.
That means this;
   Disallow: /wp-includes/
will block these;
   Disallow: /wp-includes/this.html
   Disallow: /wp-includes/that.php
   Disallow: /wp-includes/thisstoo.jpg
   Disallow: /wp-includes/here/here2/anythinginhere.aswell
etc.

If you use this;
   Disallow: /wp-includes
(without the / at the end)
then it would not only block the above, but also
   Disallow: /wp-includes-this
   Disallow: /wp-includesplusthis
   Disallow: /wp-includes-thistoo-andthebelow
   Disallow: /wp-includess

Open in new window


Did you see the difference? :)

Try using the robots.txt testing tool in Google WebMaster Tools.
0
 

Author Closing Comment

by:nutnut
ID: 39274262
Thanks very much. I have gone for

User-agent: *
Allow: /Blog/post/
Disallow: /Blog
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of the typical problems I have experienced is when you have to move a web server from one hosting site to another. You normally prepare all on the new host, transfer the site, change DNS and cross your fingers hoping all will be ok on new server…
As tax season makes its return, so does the increase in cyber crime and tax refund phishing that comes with it
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
Attackers love to prey on accounts that have privileges. Reducing privileged accounts and protecting privileged accounts therefore is paramount. Users, groups, and service accounts need to be protected to help protect the entire Active Directory …

713 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question