beer9 asked:

How does RewriteEngine work in the Apache web server?

I have the following in the httpd.conf file on a web server:

<VirtualHost *:80>
#stand-alone instances
include conf/my_application.conf
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L,NE]
</VirtualHost>


I would like to understand what it does.

Steve Bink

See my comments:
# Declare a virtual host answering to any domain name on port 80
<VirtualHost *:80>
#stand-alone instances
# include the file conf/my_application.conf.  Any directives found in that file will apply to this 
# scope as if they were written here individually
include conf/my_application.conf
# Turn on the rewrite engine
RewriteEngine On
# The next three directives make up a single rewrite
# the RewriteRule is matched first.  If it is matched, the two RewriteCond directives are
# tested.  If the conditions pass, the request is finally rewritten
#
# the request is on port 80 
RewriteCond %{SERVER_PORT} 80
# the requested host is not an empty string
RewriteCond %{HTTP_HOST} !^$
# for every request, capture the entire content of the request string
# and forward to the requested host with the same request string
# the forward is done as a 301 redirect, with no URL escaping of the
# string, and this will be the last rewrite processed
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L,NE]
# end the virtual host definition
</VirtualHost>


Note that this rewrite does nothing useful, and will likely result in an endless redirection loop.

You can read more about rewrites in the Apache docs.

beer9

ASKER

Thanks for the help Steve,

On line number 9 you mention that the RewriteRule is matched first, but we have declared two RewriteCond directives first and then one RewriteRule.

Also, we are starting the rewrite engine with 'RewriteEngine On'; how are we stopping it?

Could you please give an example URL and help me understand how it flows through such a rule? Thanks :-)

The effect of RewriteEngine ends where the enclosing section (such as VirtualHost) ends.

Since your rewrite rule sits in a virtual host that only serves plain HTTP on port 80, checking for port 80 is moot. You might as well replace all the rewrite directives with one line:
Redirect / https://site_name_.com/

>>> you mention that the RewriteRule is matched first, but we have declared two RewriteCond directives first and then one RewriteRule.

That is correct.  Take a look at mod_rewrite's flow diagram.  Note that the rule's pattern is matched, then conditions are processed, then the rewrite is applied.
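
To make the ordering concrete, here is a minimal sketch. It is not part of the posted config, and the /status path is made up purely for illustration. Because the rule's pattern is matched before the conditions run, a RewriteCond can test $1, the group captured by the RewriteRule that follows it:

```apache
RewriteEngine On

# Evaluated SECOND: conditions are only tested once the pattern below has
# matched, which is why $1 (captured by the RewriteRule) is usable here.
# /status is a hypothetical path that should be left alone.
RewriteCond $1 !^/status

# Evaluated FIRST: the pattern ^(.*)$ is tested against the URL-path.
# Applied LAST: the substitution happens only if every condition passed.
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L,NE]
```

So the written order (conditions above the rule) is just syntax: the conditions attach to the next RewriteRule, and the rule's pattern gates whether they are looked at.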

>>> we are starting the rewrite engine with 'RewriteEngine On'; how are we stopping it?

That question is a little ambiguous.  The RewriteEngine directive simply tells mod_rewrite whether to process the rules within this scope: "RewriteEngine On" enables them, and "RewriteEngine Off" makes mod_rewrite ignore them.  While mod_rewrite is processing rules, i.e., with "RewriteEngine On", every rule in the request's scope is applied to each request, in order.  The application of rules ends when one of the following conditions is met:
there are no more rules,
the request is redirected externally by a successfully applied rule (note that [R] on its own does not stop processing, which is why it is normally written as [R,L]),
a rule with the [L] or [END] flag is successfully applied.
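
As a rough sketch of scope and stopping, assuming a second virtual host on port 8080 and a /maintenance.html page (both invented for this example, not taken from the posted config):

```apache
<VirtualHost *:80>
    # Rules are processed only where the engine is switched on.
    RewriteEngine On

    # "-" means "no substitution". With [L], a request for /maintenance.html
    # stops here, so the redirect rule below is never reached for it.
    RewriteRule ^/maintenance\.html$ - [L]

    # Every other request reaches this rule and is redirected to HTTPS.
    RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L,NE]
</VirtualHost>

# Hypothetical second host (would also need "Listen 8080").
<VirtualHost *:8080>
    # Rewrite configuration is not inherited by virtual hosts, so nothing is
    # rewritten here. The explicit Off only documents the intent.
    RewriteEngine Off
</VirtualHost>
```

In other words, there is no "stop the engine" step at the end of a ruleset: the engine setting belongs to a scope, and the flags on individual rules decide when processing ends for a given request.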

Also, as gheist mentioned, the ruleset you posted is really just a return to the current request.  I was incorrect in my earlier assertion that it would result in a redirect loop...the rule is redirecting to the https version of the same request.  This rule enforces https on all requests:

1) Apache receives the request for http://mydomain.com/page.htm
2) mod_rewrite matches "/page.htm" to the rule's ^(.*)$ pattern
3) mod_rewrite passes the condition %{SERVER_PORT} = 80
4) mod_rewrite passes the condition %{HTTP_HOST} != <blank string>
5) mod_rewrite redirects the request to https://mydomain.com/page.htm with a 301.

2) would match any request
3) would match all requests on that virtual host
4) will always be true, again in the virtual host context
5) Effectively, the rewrite in this place is just
Redirect permanent / https://site_name_.com/

Sure, it is nice to dig up config snippets on the internet, but what is their value when they make you do three no-op checks?
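
For comparison, the two simpler forms being discussed could look roughly like this. This is a sketch only: www.example.com is a placeholder host name, and you would use one variant or the other, not both.

```apache
# Variant 1: fixed host name, no mod_rewrite needed (mod_alias).
# /page.htm is redirected to https://www.example.com/page.htm because
# Redirect appends the rest of the path to the target URL.
<VirtualHost *:80>
    ServerName www.example.com
    Redirect permanent / https://www.example.com/
</VirtualHost>

# Variant 2: keep whatever host name the client asked for.
# The port and host conditions are dropped; "^" matches every request.
<VirtualHost *:80>
    RewriteEngine On
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L,NE]
</VirtualHost>
```

Variant 1 is easier to read when the virtual host serves a single known name; variant 2 is closer to what the posted rules do, since a plain Redirect target is a fixed URL rather than something built from %{HTTP_HOST}.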

beer9

ASKER

So when I access http://client.myapplication.com, I see the address bar redirect to https://client.myapplication.com/myapplication and then redirect again to https://client.myapplication.com/myapplication/web/login

Is all of this happening due to the config below?

# Declare a virtual host answering to any domain name on port 80
<VirtualHost *:80>
#stand-alone instances
# include the file conf/my_application.conf.  Any directives found in that file will apply to this 
# scope as if they were written here individually
include conf/my_application.conf
# Turn on the rewrite engine
RewriteEngine On
# The next three directives make up a single rewrite
# the RewriteRule is matched first.  If it is matched, the two RewriteCond directives are
# tested.  If the conditions pass, the request is finally rewritten
#
# the request is on port 80 
RewriteCond %{SERVER_PORT} 80
# the requested host is not an empty string
RewriteCond %{HTTP_HOST} !^$
# for every request, capture the entire content of the request string
# and forward to the requested host with the same request string
# the forward is done as a 301 redirect, with no URL escaping of the
# string, and this will be the last rewrite processed
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L,NE]
# end the virtual host definition
</VirtualHost>

beer9

ASKER

Steve, how does Apache check whether 'URI changed', as shown in the diagram you linked: http://httpd.apache.org/docs/current/rewrite/tech.html#InternalRuleset

Thanks for helping

Can you post the output of httpd -V?
You are referring to the documentation for a development version of Apache. That document is not relevant to any usable version of Apache today.

@gheist
>>> You are referring to the documentation for a development version of Apache.

That document is for Apache 2.4, which is the most up-to-date production version of Apache.  The Apache 2.2 version of the same document shows the same overall flow, but with a little less detail.

All of my links are pointing to v2.4 documentation.  To see the v2.2 equivalent page, change the word "current" to "2.2" in the respective link.


@beer9
>>> how does Apache check whether 'URI changed', as shown in the diagram you linked

You would have to check the source code for Apache or mod_rewrite to be sure, but I imagine it is a simple string comparison.  The URI is generally changed if a rewrite is successfully applied.  The final branch of that decision making simply determines if:

a) no rewrites were done, in which case, serve the originally requested resource,
b) a rewrite was successful, and the new URI points to a resource on the same server.  In this case, re-submit the new URI into Apache's request chain,
c) a rewrite was successful, and the new URI points to an external location.  In this case, Apache returns a 3xx response to the client telling them to go elsewhere.
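
The three outcomes can be illustrated with a small sketch. The host names and paths here are invented for the example and are not from the asker's setup:

```apache
<VirtualHost *:80>
    ServerName www.example.com
    RewriteEngine On

    # (b) Internal rewrite: the URI changes but stays on this server.
    #     Apache carries on with /new-page.html; the client sees no redirect.
    RewriteRule ^/old-page\.html$ /new-page.html [L]

    # (c) External redirect: the target is an absolute URL and [R] is set,
    #     so the client receives a 302 response pointing at the new location.
    RewriteRule ^/docs/(.*)$ https://docs.example.com/$1 [R=302,L]

    # (a) A request such as /index.html matches neither pattern, so no
    #     rewrite is done and the originally requested resource is served.
</VirtualHost>
```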

beer9

ASKER

Thanks for the help :-)

Do rewrite and redirect add any HTTP headers to the request/response (like X-Forwarded-For)?

No, it just sends a 301/302 response with the target address (in the Location header); the resulting request to the new place carries the referer of the redirected page.
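
For contrast, here is a hedged sketch of where X-Forwarded-For typically does come from. The backend address and paths are made up, and the second rule assumes mod_proxy and mod_proxy_http are loaded:

```apache
<VirtualHost *:80>
    ServerName www.example.com
    RewriteEngine On

    # External redirect: the client just gets a 301 with a Location header
    # and makes a fresh request itself. No X-Forwarded-* headers are involved.
    RewriteRule ^/old/(.*)$ https://www.example.com/new/$1 [R=301,L]

    # Proxying with [P]: Apache fetches the page from the backend on the
    # client's behalf. Here it is mod_proxy, not mod_rewrite, that adds
    # X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Server to the
    # request sent to the backend.
    RewriteRule ^/app/(.*)$ http://backend.example.internal:8080/$1 [P,L]
</VirtualHost>
```

So the headers appear only when Apache acts as a reverse proxy in front of another server, which the ruleset in the question does not do.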