derek2277

.htaccess 301 redirect converting + (plus sign) to %2520

I am using the .htaccess snippet below, and for some reason the plus sign in the URI is being converted into %2520. I'm at a loss for the reason behind this... Please let me know if I need to include more info.
RewriteCond %{QUERY_STRING} ^query=([^&]+)&search=1$
RewriteRule ^search\.php$ /%1.html? [R=301,L]
RewriteCond %{QUERY_STRING} ^query=([^&]+)&search=1&start=([0-9]+)$
RewriteRule ^search\.php$ /%2/%1.html? [R=301,L]
 
RewriteRule ^([0-9]+)/([^/]+).html$ /search.php?query=$2&search=1&start=$1&a=1 [L]
RewriteRule ^(.*).html$ /search.php?query=$1&search=1&a=1 [L]


caterham_www

Interesting... because I can't reproduce this with Apache 2.2.9: neither with a query string containing a + that is rewritten to a URL path and externally redirected, nor with a URL path rewritten to a query string.

Are you on a Unix system? There is sometimes something special on Windows.
derek2277

ASKER

Hi caterham,

When you say a Unix system, I assume you mean my OS? If so, I am on Windows XP. If not, I am running Apache 2.0 on my server. The main problem is that Google is indexing my URLs with %2520 in the path. Everything still works correctly with my script even with %2520 in the URLs, but it is strange that this is happening. Is it possible to change the + sign to a - from within the .htaccess file without causing any major issues? If so, how would you do it? I'm not sure that would even help, but I am curious.

Thanks for responding! :-)
I tried to reproduce this on Windows XP and Apache 2.0.55, but it seems to work as expected.

> to change the + sign to a - from within the htaccess file

That is not a good idea because of certain bugs with the N flag if path_info is present. It would be better if you had access to httpd.conf and could set up a RewriteMap with a Perl script that changes specific characters and returns the resulting string to the RewriteRule.

If you have access to httpd.conf, you may set up a RewriteLog first with

RewriteLog logs/rewrite.log
RewriteLogLevel 5

at the bottom of httpd.conf (and restart the apache httpd service).
Rewrite log from my test:
strip per-dir prefix: C:/.../htdocs/foo -> foo
applying pattern '^foo' to uri 'foo'
RewriteCond: input='q=b+a+r' pattern='q=(.*)' => matched
rewrite foo -> /bar/b+a+r
explicitly forcing redirect with http://localhost:81/bar/b+a+r
escaping http://localhost:81/bar/b+a+r for redirect
redirect to http://localhost:81/bar/b+a+r?q=b+a+r [REDIRECT/301]


Ok done.  Within minutes after implementing this my log file is HUGE (since I use rewrite throughout most of my websites).  Is there a way to easily view this file?

Thanks.
Also, sorry, I have never used this type of log before. What exactly am I looking for?
We're looking for the processing of your first or second rewrite rule posted above.

Maybe request a unique string such as

/search.php?query=foo+bar+to+test+plus&search=1

and search for foo in the log file to check whether the + has become a space or %20 during processing.


> since I use rewrite throughout most of my websites

If you're using virtual hosts, put both directives that enable the log into that <VirtualHost> block; that at least limits the logging to that specific virtual host.
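As a rough sketch of that suggestion for Apache 2.0/2.2, the two logging directives could sit inside the relevant virtual host block. The ServerName, DocumentRoot and log file name below are placeholders, not values from this thread:

```apache
<VirtualHost *:80>
    # Placeholder host and path; substitute your own values.
    ServerName www.example.com
    DocumentRoot /path/to/public_html

    # Log rewrite processing for this virtual host only.
    RewriteLog logs/rewrite-example.log
    RewriteLogLevel 5
</VirtualHost>
```

Setting RewriteLogLevel back to 0 once the test request has been captured should stop the file from growing.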
Here is my test query. query entered is "test this string" (without quotes):

68.111.77.174 - - [18/Aug/2008:16:12:18 --0700] [www.mysite.com/sid#9c31ba0][rid#9e7edf8/initial] (3) [per-dir /home2/mysite/public_html/] applying pattern '^search\.php$' to uri 'search.php'
68.111.77.174 - - [18/Aug/2008:16:12:18 --0700] [www.mysite.com/sid#9c31ba0][rid#9e7edf8/initial] (4) RewriteCond: input='query=test+this+string&search=1' pattern='^query=([^&]+)&search=1$' => matched
68.111.77.174 - - [18/Aug/2008:16:12:18 --0700] [www.mysite.com/sid#9c31ba0][rid#9e7edf8/initial] (2) [per-dir /home2/mysite/public_html/] rewrite search.php -> /myspace-layouts/test+this+string.html?
68.111.77.174 - - [18/Aug/2008:16:12:18 --0700] [www.mysite.com/sid#9c31ba0][rid#9e7edf8/initial] (3) split uri=/myspace-layouts/test+this+string.html? -> uri=/myspace-layouts/test+this+string.html, args=<none>

Looking at more of the log I see this too:

67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (4) RewriteCond: input='query=blood%20bandana&search=1' pattern='^query=([^&]+)&search=1$' => matched
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (2) [per-dir /home2/mysite/public_html/] rewrite search.php -> /myspace-layouts/blood%20bandana.html?
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (3) split uri=/myspace-layouts/blood%20bandana.html? -> uri=/myspace-layouts/blood%20bandana.html, args=<none>
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (2) [per-dir /home2/mysite/public_html/] explicitly forcing redirect with http://www.mysite.com/myspace-layouts/blood%20bandana.html
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (1) [per-dir /home2/mysite/public_html/] escaping http://www.mysite.com/myspace-layouts/blood%20bandana.html for redirect
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9dec208/initial] (1) [per-dir /home2/mysite/public_html/] redirect to http://www.layoutlocator.com/myspace-layouts/blood%2520bandana.html [REDIRECT/301]
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9df0218/initial] (3) [per-dir /home2/mysite/public_html/] add path info postfix: /home2/mysite/public_html/myspace-layouts -> /home2/mysite/public_html/myspace-layouts/blood%20bandana.html
67.180.189.102 - - [18/Aug/2008:16:12:03 --0700] [www.mysite.com/sid#9c31ba0][rid#9df0218/initial] (3) [per-dir /home2/mysite/public_html/] strip per-dir prefix: /home2/mysite/public_html/myspace-layouts/blood%20bandana.html -> myspace-layouts/blood%20bandana.html

I'm still not exactly sure what to do with this as I am really new to the rewrite.log.  Please let me know if I need to include more information.
It looks to me like the links that have %2520 in them are google links that had the %20 in it before I did the mod_rewrite.  Could that have something to do with it?
Yes, that's caused by an initial request for search.php?query=foo%20bar (with %20 instead of a + as the space).
We may use the N flag to change it into a + before sending the redirect:


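# Replace one %20 in the query string with + and restart rule processing (N);
# this repeats until no %20 is left.
RewriteCond %{QUERY_STRING} ^(query=.*)%20(.*)
RewriteRule ^search\.php$ search.php?%1+%2 [N]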
 
RewriteCond %{QUERY_STRING} ^query=([^&]+)&search=1$
RewriteRule ^search\.php$ /%1.html? [R=301,L]
RewriteCond %{QUERY_STRING} ^query=([^&]+)&search=1&start=([0-9]+)$
RewriteRule ^search\.php$ /%2/%1.html? [R=301,L]
 
RewriteRule ^([0-9]+)/([^/]+)\.html$ /search.php?query=$2&search=1&start=$1&a=1 [L]
RewriteRule ^(.*)\.html$ /search.php?query=$1&search=1&a=1 [L]


So when visiting the link: http://www.layoutlocator.com/myspace-layouts/blood%2520bandana.html should that be redirected to the correct url: http://www.layoutlocator.com/myspace-layouts/blood+bandana.html ?

Currently it does not.

Thanks for your help :-)
No, the log shows a request for search.php?query=blood%20bandana&search=1, which was rewritten to /myspace-layouts/blood%2520bandana.html and which should now be rewritten to /myspace-layouts/blood+bandana.html instead.
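The extra escaping shows up in the log as the "escaping ... for redirect" step, where the % of %20 is itself encoded as %25. One possible alternative to converting %20 into +, not suggested in this thread and untested against this setup, is mod_rewrite's NE (noescape) flag, which skips that step so the captured %20 would be passed through unchanged:

```apache
# Sketch only: same condition and rule as in the original snippet, with NE
# added so the already-encoded %20 captured from the query string is not
# escaped a second time when the redirect is built.
RewriteCond %{QUERY_STRING} ^query=([^&]+)&search=1$
RewriteRule ^search\.php$ /%1.html? [NE,R=301,L]
```

With NE the redirect target would keep %20 rather than +, so it only fits if %20 in the final URL is acceptable.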
One more quick question, then I will be out of your hair! Is there a way that I can easily 301 redirect the old %2520 URLs to the correct URLs and not lose them in Google?

Thanks again!
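One way this might be approached, offered as an untested sketch and not as the accepted answer: a request for a path containing %2520 reaches the rules with the path decoded once, so the pattern sees a literal %20 (as in the "strip per-dir prefix" log line above). A rule placed before the .html rules could then redirect to the + form:

```apache
# Untested sketch: redirect a path that still contains a literal %20
# (i.e. the client requested %2520) to the same path with + instead.
# Handles one %20 per redirect; paths with several spaces take several hops.
RewriteRule ^(.*)%20(.*)\.html$ /$1+$2.html [R=301,L]
```

Requesting one of the old URLs and checking the Location header of the response would confirm whether the rule behaves this way on a given server.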
ASKER CERTIFIED SOLUTION
caterham_www
I cannot even begin to tell you how much help you have been to me! Thank you so much caterham_www!!! Much appreciated!
Just FYI: I came to this result via a Google search for trouble with %20 in an .htaccess 301 redirect coming out as %2520. My solution was to replace %20 with a literal space and put quotes around the URL.

For example:
  RedirectMatch 301 ^/old/url$ /new%20url
becomes
  RedirectMatch 301 ^/old/url$ "/new url"