fredo783
How to do bulk 301 redirect - mod rewrite - in Apache?
Hi, folks:
I have a Drupal website on an Apache server running cPanel. I migrated an old website to this new website, and I want to redirect a hundred or so pages that no longer exist. Here's an example:
Old website:
www.mysite.org/states/cities
this contains lots of files like:
www.mysite.org/states/cities/san_francisco.php
www.mysite.org/states/cities/los_angeles.html
etc.
Those pages no longer exist. I want to redirect them all to:
www.mysite.org/cities/city_list
I really don't want to have to code 301 redirects for each page. I *think* mod_rewrite can do this, but I am not really sure.
Since this is a Drupal website, I think the Drupal module path_redirect may do this. But I want to first determine if I can do this with mod_rewrite, because I also want to do this on a non-Drupal site.
Help would be appreciated!
Thank you,
Fred
You got it -- thanks for the tip. I'm filing that in the "future use" bucket since I know I'm going to need it. Man, I love E-E -- collaboration is a wonderful thing.
ASKER
Hey, thanks for the info!
Question: do the rules themselves contain regular expressions? For example, in
RewriteRule ^/states/cities/([a-z]+)$ /cities/city_list [L]
is the "^/states/cities/([a-z]+)$ " a regex?
I need to match city names in the form of:
san-francisco
San_Francisco
etc
Also, are the rules parsed and applied top-down - the first one that matches is used?
Thank you,
Fred
Hi Fred,
The pattern (the first argument of a RewriteRule) is a regular expression, yes. mod_rewrite uses Perl-compatible regular expressions, so most Perl syntax carries over; the substitution and the flags in square brackets are not regexes. Rules work nicely when done correctly, and disastrously when not. Test, test, and then test again -- I know this from some very bad experiences :)
First I'd add NC (no case) to your RewriteRule, like this: [NC,R=301,L] -- that means ignore case, redirect with a 301, and "this is the last rule, don't apply any more". Rules are, indeed, processed top-down, and the [L] stops processing once that rule matches.
You can probably replace the [a-z]+ with .* to catch not just a-z characters but all characters. That should match the - or the _ and any other punctuation...but as I said before, please test that since I didn't. :)
Actually, on second thought, if you are using .* you won't need the NC for the city name, since .* matches any character, not just lowercase a through z. (NC only matters then if the /states/cities/ part of an old link might be capitalized.)
RewriteRule ^/states/cities/(.*)$ /cities/city_list [L,R=301]
Perhaps that will work?
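If .* feels too broad, a narrower pattern is possible. The following is an untested sketch, not a drop-in rule: the character class and the list of extensions are assumptions based on the example filenames in the question (san_francisco.php, los_angeles.html), and the optional leading slash (/?) is an addition so that the same line can match both in the main server config and in .htaccess, where the path is matched without its leading slash.

```apache
# Sketch only: match city files made of letters, digits, "-" or "_",
# ending in .php or .html (assumed extensions), ignoring case.
RewriteRule ^/?states/cities/[a-z0-9_-]+\.(php|html)$ /cities/city_list [NC,R=301,L]
```

With this version a request for /states/cities/San_Francisco.php is redirected, while something unexpected such as /states/cities/archive/old.pdf is left alone.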
ASKER
Hi, Junipllc:
Good suggestions.
I have existing rules in .htaccess to redirect non-www URLs to www URLs:
RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
RewriteRule ^(.*)$ http://www.mysite.org/$1 [L,R=301]
I plan to add rules like you just specified after the above rule.
- Does the rule you specified have to contain the full URL, as in
RewriteRule ^mysite.org/states/cities/(.*)$ /cities/city_list [L,R=301]
Or can it be specified just as you noted?
- Also, does the L in the initial rule above terminate all further attempts to match rewrite rules - would my subsequent cities rules be missed?
Thanks,
Fred
Hi,
L is used as a conditional exit, as you state. Rules are read TOP to BOTTOM; once a rule carrying [L] matches, mod_rewrite stops processing the rules after it, and with R=301 the browser is sent to the new URL. A rule whose RewriteCond is not met does not match, so its [L] has no effect and your later cities rules are still checked. Download this to help in the future.
http://www.addedbytes.com/download/mod_rewrite-cheat-sheet-v2/pdf/
In the pattern ^mysite.org/states/cities/(.*)$, the (.*) captures everything after /cities/....... it can be /cities/abc/ed/test.php or /cities/index.html, it doesn't matter.
As for the original rule I posted with the regex ([a-z]+), that matches one or more lowercase letters only (1 to infinity), so it would miss names containing -, _, or a file extension such as .php.
Hope it helps,
Hades666
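To make the host-versus-path point concrete: a RewriteRule pattern is compared only with the URL path, so a hostname written into the pattern never matches. If a rule should apply to one hostname only, the hostname test goes in a RewriteCond in front of it. A minimal, untested sketch, assuming .htaccess in the site root and the www.mysite.org hostname used in this thread:

```apache
# The pattern sees "states/cities/los_angeles.html", not
# "mysite.org/states/cities/los_angeles.html".
# Optional: limit the rule to one host with a condition (an assumption;
# leave the RewriteCond out to apply the rule on every host).
RewriteCond %{HTTP_HOST} ^www\.mysite\.org$ [NC]
RewriteRule ^/?states/cities/ /cities/city_list [R=301,L]
```

A RewriteCond applies only to the single RewriteRule that follows it, so other rules in the file are not affected by this condition.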
ASKER
Ok, great! Then it appears that I could drop the [L] on the first set of rules and let the subsequent rules be matched. I want to match both www and non-www URLs, and then catch all of the /states/cities/ files.
For example:
RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
RewriteRule ^(.*)$ http://www.mysite.org/$1 [R=301]
RewriteRule ^mysite.org/states/cities/(.*)$ /cities/city_list [L,R=301]
...
Does that make sense?
Thanks for your excellent advice!
Fred
Hi,
The first rule has a condition. So what will happen is that if the condition is met, the RewriteRule right after it will be applied.
Next, the second RewriteRule is checked; that one has an L flag, which stops the mod_rewrite engine when it matches. The second rule doesn't have a condition, so it is tried for every request regardless of hostname.
So with that said, I would just remove the ^mysite.org and keep it as ^/?states/cities/(.*)$, since RewriteRule patterns only see the URL path, never the hostname, and the rule then covers both www and non-www URLs. (The /? makes the leading slash optional; in .htaccess the path is matched without it.)
Cheers,
Hades666
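Putting the thread together, one possible .htaccess layout is sketched below. It is untested and makes a few assumptions: the file lives in the document root, these lines sit above Drupal's own rewrite rules, and the cities rule is placed first with an absolute target so that non-www visitors get one redirect instead of two. It also keeps [L] on the www rule, since the Apache documentation recommends pairing R with L; requests that already use www never match that rule, so its L does not block anything.

```apache
RewriteEngine On

# 1. Old city pages: send everything under /states/cities/ to the new list.
#    The absolute URL (an assumption) also moves non-www requests to www.
RewriteRule ^/?states/cities/ http://www.mysite.org/cities/city_list [R=301,L]

# 2. Remaining requests on the bare domain: redirect to the www host.
RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
RewriteRule ^(.*)$ http://www.mysite.org/$1 [R=301,L]
```

Checking a few old URLs on both hostnames with a client that shows response headers (or the browser's network tab) should confirm that each returns a single 301 with the expected Location header.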
ASKER
Thanks very much! The information looks complete and I will implement the rules in the near future.
Cheers,
Fred
ASKER
Thanks, folks. I split points on this - your solutions really helped my understanding.
Fred
Thanks bud,
Hades666