htaccess: disallow anything other than images or CSS with a condition on HTTP:Via

Hi Experts,

I would like my htaccess to deny all requests from my CDN provider's servers for files other than jpg/png/gif/css/js/txt files.

Right now I use:
RewriteCond %{HTTP:VIA} ^.*\.worldcdn\..*$
RewriteRule ^robots\.txt$ robots_cdn77.txt [L]

This presents a different robots.txt file for any request from my CDN provider, to prevent Google from indexing my site via the CDN servers and creating duplicate content. The problem was that Google indexed my site two or three times via the CDN URLs (they are CNAME host entries).

I would like to either add an additional rule or change the one above so that users or site visitors also cannot call my site via the CDN links.

In plain words, something like:
If the request comes from (HTTP:Via) anything with *worldcdn* and is for a file other than jpg/gif/png, then respond not with the file but with cdn_error.php.

So any request from *worldcdn* to http://www.domain.com/site/index.php would be changed into http://www.domain.com/cdn_error.php, but a request to http://www.domain.com/images/logo.jpg would respond normally with the logo.jpg file.

BUT it is important that this only happens for requests from worldcdn and not for any other visitor.

Thank you in advance for your help.
DrDamnit Commented:
This doesn't make sense. Your CDN provider doesn't index your site, nor does it call your site; your CDN provides content to visitors to your site.

Are you saying that you want content served by your CDN provider only when people are on your site, and an error given if someone attempts to link directly to your content? Thus, you force people to your site to see your content?

And you don't want Google to index your content either?
Oliver2000 (Author) Commented:
Hi DrDamnit,

Let me try to explain; it does make perfect sense. If somebody (including Google) uses the URL cdn.domain.com, the CNAME host entry directs them to my CDN provider (cdn77 in this case), which takes the request, fetches the original file from www.domain.com, and forwards it to the user (or Google). The CDN in the middle acts more or less like a proxy. The result is that you can call not only any image via this CDN URL but also any HTML or PHP file, which I want to prevent.

The problem is that Google has now indexed the CDN URLs as duplicate content. I have already registered the CDN subdomains in the Google webmaster area and started a removal request for all CDN URLs, which worked very fast. However, I want to prevent re-indexing of any HTML or PHP file.

The main question is actually independent of my CDN, Google, etc.

How can I write a small rule that allows only certain file extensions if the request comes from a certain server, and allows all file extensions for everyone else?

Like: IF THE REQUEST COMES FROM ANY SERVER WITH CDN, allow only jpg and gif; but if the request comes from anybody else, allow all files.
Oliver2000 (Author) Commented:
Found the solution myself

RewriteCond %{REQUEST_FILENAME} !\.(gif|png|jpg|jpeg|jfif|bmp|css|js|txt)$
RewriteCond %{HTTP:VIA} ^.*\.worldcdn\..*$
RewriteRule ^ http://www.domain.com%{REQUEST_URI} [R=301,L]

This takes any request that is not for a static file (the list of extensions in the first condition) and, if the request comes from any worldcdn server, redirects it to the original www version.
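
For anyone who would rather serve the error page described in the original question instead of redirecting, a variant could look roughly like the following. This is only a sketch under assumptions: it assumes the CDN's requests carry a Via header containing ".worldcdn.", that cdn_error.php sits in the document root, and that the rules live in the .htaccess of the document root. It has not been tested against cdn77.

```apache
RewriteEngine On

# Assumption: requests coming through the CDN have ".worldcdn." in the Via header
RewriteCond %{HTTP:Via} \.worldcdn\. [NC]
# Let the CDN fetch static files as usual
RewriteCond %{REQUEST_URI} !\.(gif|png|jpe?g|jfif|bmp|css|js|txt)$ [NC]
# Do not rewrite the error page onto itself
RewriteCond %{REQUEST_URI} !^/cdn_error\.php$
# Everything else requested via the CDN gets the error page (internal rewrite, no redirect)
RewriteRule ^ /cdn_error.php [L]
```

Because txt is in the allowed list, the existing robots.txt rule should stay above this block so the CDN still receives robots_cdn77.txt. If a plain 403 is enough, the last line could be replaced with `RewriteRule ^ - [F]`, and the third condition is then no longer needed.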
Oliver2000 (Author) Commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for Oliver2000's comment #a39458362

for the following reason:

I found the solution myself.
DrDamnit Commented:
Based on what you wrote, all requests come from the CDN because the CDN is the proxy. If that is really how you have it set up, I am not sure you can do what you're asking.

I use two different CDNs (Rackspace and Amazon) and do not have it set up this way. The CDN delivers large files via URLs (images and video), but the main site delivers the actual HTML.

In this setup, I can tell the CDN to show images and video only to requests that come from my site. But in your setup, all requests are moving through the CDN as a proxy, which means htaccess would be useless: as far as Apache is concerned, all traffic comes from your proxy, so there is no way to filter.
Oliver2000 (Author) Commented:
I have the same situation. The CDN is supposed to deliver only images and static files. I changed the links to this static content in my sites to the CDN URLs. The problem was that you could call whatever you wanted via these CDN URLs, and therefore also PHP or HTML files, which would then be delivered via the CDN. Once Google got hold of the CDN URL, they started to index it. With the solution above this is fixed, because now the CDN gets only static files.

thank you for your help anyway
Question has a verified solution.