Mod_Rewrite Condition to Redirect External Referrers

We inflate EPUBs on our web site in advance and save them to, e.g.,

/public_html/media/books/#some-book-title#/web/ops/xhtml/some-page1.html

We then use LiveCode - RevIgniter to serve these inside an iframe, inside our site wrapper. This works great.

But the problem is that Google is finding these pages, indexing and caching them, and they are then linked to and delivered outside our site wrapper, e.g.

http://dev.himalayanacademy.com/media/books/merging-with-siva/web/ops/xhtml/part_2ch_34a.html

I need to create a mod_rewrite condition that will redirect these to our CMS.

In pseudo-code it would look like:

IF
( RewriteCond $1 ^(media/books/[any string here, some reg ex I presume]/web/ops)  ) AND (referrer IS NOT localhost)
THEN
RewriteRule ^(.*)$ index.lc?/$1 [L]  
END IF

Can you help me with the proper mod_rewrite rule for this?

Thanks!
Brahmanatha

Steve Bink Commented:
Your pseudo code is very similar to your actual need:
RewriteCond %{HTTP_REFERER} !^localhost$ [NC]
RewriteRule ^/?(media/books/(some regex)/web/ops) /index.lc/$1 [L,NC]

You'll need to be more explicit about (some regex), but that is the general form you want.
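
A note on the referer test: browsers send the Referer header as a full URL (scheme, host, path), so a pattern anchored as ^localhost$ is unlikely ever to match, and the negated condition would then be true for every request. Below is a minimal sketch of a host-anchored condition. It assumes the site is served from dev.himalayanacademy.com (the host in the example URL in the question) and that the rules live in a .htaccess file or virtual host where mod_rewrite is available. The [^/]+ title segment is only a placeholder for the (some regex) part.

```apache
# Sketch only: the host name is an assumption taken from the example URL;
# substitute the host that actually serves the site wrapper.
RewriteEngine On

# HTTP_REFERER holds a full URL such as http://host/path, so match the
# scheme-and-host prefix instead of a bare host name. An empty referer
# (a direct visit) does not match the pattern either, so it is rewritten too.
RewriteCond %{HTTP_REFERER} !^https?://dev\.himalayanacademy\.com(/|$) [NC]

# [^/]+ matches exactly one path segment (the book title placeholder).
RewriteRule ^/?(media/books/[^/]+/web/ops/.*)$ /index.lc/$1 [L,NC]
```

Requests that come from the site's own pages (including the iframe inside the wrapper) carry a matching referer and are served as before; everything else is handed to index.lc.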

Brahmanatha (Author) Commented:
Interesting... very simple really.  

"be more explicit about (some regex)" right...

The string ##-###-### that forms the URL segment between "books" and "web"

is a single string with no slashes (no additional segments)

/media/books/##-###-###/web/ops

where ##-###-### is the title of a book. The strict convention on our site is: all lowercase letters, words delimited by dashes, and an underscore at the end followed by the ISO two-letter language code where the language is not English, e.g.

the-guru-chronicles
dancing-with-siva_es # in Spanish

Some works in other languages have their titles in those languages, but the convention is that these strings ("file_id" in our MySQL database) are always in the 0-127 (ASCII) range: no Unicode and no ANSI Roman characters such as umlauts.

I have trouble figuring out how to keep my regex from being too "greedy"; it needs to stop at the next slash in the URL string. How's this for an attempt?

RewriteRule ^/?(media/books/(.*?)/web/ops) /index.lc/$1 [L,NC]

or possibly more explicit?

RewriteRule ^/?(media/books/(.*\/)web/ops) /index.lc/$1 [L,NC]

Where the regex ends at the next slash?

Steve Bink Commented:
You're on the right track, but let me see if I can clarify your requirements...

>>> The string ##-###-### that forms the URL segment between "books" and "web"
>>> is a single string with no slashes (no additional segments)

No problem there.  Detection pattern would be /books\/([^/]+)\/web/

>>> The strict convention on our site is: all lowercase letters, words delimited
>>> by dashes, underscore at the end followed by ISO two-letter language code where not English, e.g.

Even better..  /books\/([-a-z_]+)\/web/

The final question is, assuming there is a match, do you want the whole URL, or just the matched title?
# for the whole URL
RewriteRule ^/?(media\/books\/[-a-z_]+\/web/ops) /index.lc/$1 [L,NC]

# for just the title
RewriteRule ^/?media\/books\/([-a-z_]+)\/web/ops /index.lc/$1 [L,NC]
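
If the title convention is as strict as described (lowercase words joined by dashes, with an optional underscore plus two-letter language code), the character class can be tightened so that stray dashes or underscores do not match. This is only a sketch: it assumes titles never contain digits, which the thread does not confirm, and it drops the NC flag so that uppercase paths are not matched.

```apache
# Sketch only: stricter title segment, whole matched path passed on as $1.
#   [a-z]+(?:-[a-z]+)*   one or more lowercase words joined by single dashes
#   (?:_[a-z]{2})?       optional underscore plus two-letter language code
# The inner groups are non-capturing, so $1 is still the outer group.
RewriteRule ^/?(media/books/[a-z]+(?:-[a-z]+)*(?:_[a-z]{2})?/web/ops) /index.lc/$1 [L]
```

With this pattern, the-guru-chronicles and dancing-with-siva_es match as title segments, while a segment such as -bad--title_ does not.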

Brahmanatha (Author) Commented:
The whole URL. Our controller will need to see "/media" at the beginning, plus the matched title and the final segment, which is the page in the book to target. From there we will use another model/library script to display the page in our views.

Thanks for your help!

Brahmanatha (Author) Commented:
Hmmm I had to extend this a bit:

RewriteCond %{HTTP_REFERER} !^localhost$ [NC]
RewriteRule ^/?(media\/books\/[-a-z_]+\/web\/ops\/xhtml\/.*) /index.lc/$1 [L,NC]


And now it works! At least our CMS gets the URL...
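
One possible refinement, not confirmed by the thread: if images, stylesheets, or other assets also live under xhtml/, the trailing .* sends those requests to the CMS as well whenever the condition passes. The sketch below limits the rewrite to .html pages directly inside xhtml/ (an assumption based on the example URL in the question) and is meant to sit under whichever RewriteCond is in use.

```apache
# Sketch only: rewrite book pages, leave other files in the tree alone.
#   [^/]+\.html$  a single file name ending in .html, no further subfolders
RewriteRule ^/?(media/books/[-a-z_]+/web/ops/xhtml/[^/]+\.html)$ /index.lc/$1 [L,NC]
```

A path such as media/books/merging-with-siva/web/ops/xhtml/part_2ch_34a.html matches this pattern; a .css or .jpg file in the same folder does not and is served directly.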

Brahmanatha (Author) Commented:
Thanks Steve!