
Solved

Using Robots.txt with Add-on Domains and Subdomains

Posted on 2011-02-20
Medium Priority
552 Views
Last Modified: 2013-12-09
Hi - I have seen a lot of similar answers but none that completely convinces me that what I'm doing is right.

I have a master account for mydomain.com.

Within mydomain.com I have created a number of subdomains for testing purposes; they are stored in directories under the website's root directory. For example, mydomain.com/sub1 and mydomain.com/sub2 are also mapped to the subdomains sub1.mydomain.com and sub2.mydomain.com.

Some of these are also mapped to add-on domains like sub3.com and sub4.com. These are very small-scale, low budget websites that are mainly blogs/personal/very small business sites I host for friends, not commercial accounts, so they really can't justify the expense or effort of creating and maintaining their own separate hosting accounts.

I recently discovered that sub3.mydomain.com and sub4.mydomain.com are being indexed by Google (even though they are NOT linked from anywhere, as they are test sites under development). We are using Google Wave for discussion of sub4, so it is possible Google picked up that semi-private info (disturbing, but another story). I'm not sure how they would know about sub3.

In the root directory for the website, I added a robots.txt (oops, I was sloppy before and didn't create one for subs1-20). It includes the lines:

    User-agent: *
    Disallow: /sub1/
    Disallow: /sub2/
    Disallow: /sub3/
    Disallow: /sub4/

Will this effectively prevent mydomain.com/subX and subX.mydomain.com from being indexed while still allowing sub3.com and sub4.com to be indexed (and controlled using additional robots.txt in their root directories)?

Is there anything else I should do? Thanks!
Question by:nhtechgal
5 Comments
 

Author Comment

by:nhtechgal
ID: 34940370
P.S. I used Webmaster Tools / Crawler Access to test my robots.txt, and the results I got for these subdomains were no different from what I got for my many other subdomains that I have yet to include in robots.txt (and which haven't been indexed by Google, for whatever reason). Thanks.
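(Editor's note, not part of the thread: the posted rules can also be sanity-checked locally with Python's standard `urllib.robotparser`, with no network access. The test paths below are hypothetical examples.)

```python
# Check which paths the posted robots.txt rules block, using the
# standard-library robots.txt parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /sub1/
Disallow: /sub2/
Disallow: /sub3/
Disallow: /sub4/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Hypothetical test paths:
print(rp.can_fetch("*", "/sub1/page.html"))  # False: blocked
print(rp.can_fetch("*", "/index.html"))      # True: allowed
```

Note that this only tells you what a *compliant* crawler is asked to skip; as the accepted answer below explains, it does not remove already-indexed URLs.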
 
LVL 29

Accepted Solution

by:
fibo earned 1500 total points
ID: 34945341
    User-agent: *
    Disallow: /sub1/
    Disallow: /sub2/
    Disallow: /sub3/
    Disallow: /sub4/

Will this effectively prevent mydomain.com/subX and subX.mydomain.com from being indexed while still allowing sub3.com and sub4.com to be indexed (and controlled using additional robots.txt in their root directories)?

No, it won't (but leave it in place anyway).
Disallow does not say "it is forbidden to index this"; it only says "please be kind enough not to crawl this".
So any link to your unwanted addresses can still get them indexed, even if the robot is polite enough not to crawl those subdirectories on its own.
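(Editor's note, not from the thread: a stronger signal than a robots.txt Disallow is an `X-Robots-Tag: noindex` response header, which asks search engines to drop the pages from the index rather than merely not to crawl them. A minimal sketch, assuming Apache with mod_headers enabled, placed in the subdirectory's .htaccess:)

```apache
<IfModule mod_headers.c>
    # Ask search engines to keep everything served from this directory
    # out of their index (stronger than a robots.txt Disallow, which
    # only discourages crawling).
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>
```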

My suggested solution:
In each secondary subdirectory/site, place a mod_rewrite rule in an .htaccess file, which will solve the issue.
E.g., for sub1, place in /sub1/ an .htaccess like:
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\.mydomain\.com
    RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]


This should solve your problem. If some "older" pages remain indexed, you can then remove them from the index with Webmaster Tools.
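(Editor's note, not from the thread: for the add-on domains, the rule above would also redirect visitors who arrive via sub3.com itself, since that host is not www.mydomain.com. A narrower sketch for /sub3/, assuming sub3.com is the add-on domain name, redirects only requests that did not arrive via the add-on domain:)

```apache
RewriteEngine On
# Send any request that did not arrive via sub3.com (e.g. via
# sub3.mydomain.com or mydomain.com/sub3/) to the canonical add-on
# domain with a permanent redirect.
RewriteCond %{HTTP_HOST} !^(www\.)?sub3\.com$ [NC]
RewriteRule ^(.*)$ http://sub3.com/$1 [L,R=301]
```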
 
LVL 29

Expert Comment

by:fibo
ID: 35110813
Hi, have you been able to test my suggestions?
 

Author Closing Comment

by:nhtechgal
ID: 35951254
Had to search for a bit as I'm not a server administrator, but found most of the selections.
 
LVL 29

Expert Comment

by:fibo
ID: 35951394
B-) Glad it could help. Thx for the points and grade.
