fredo783

How to do bulk 301 redirect - mod rewrite - in Apache?

Hi, folks:

I have a Drupal website on an Apache server running cpanel.  I migrated an old website to this new website and I want to redirect a hundred or so pages that no longer exist.  Here's an example:

Old website:
www.mysite.org/states/cities
this contains lots of files like:
www.mysite.org/states/cities/san_francisco.php
www.mysite.org/states/cities/los_angeles.html
etc.

Those pages no longer exist.  I want to redirect them all to:
www.mysite.org/cities/city_list

I really don't want to have to code 301 redirects for each page.  I *think* mod rewrite can do this, but I am not really sure.

Since this is a Drupal website, I think the Drupal module path_redirect may do this.  But I want to first determine if I can do this with mod rewrite, because I also want to do this on a non-Drupal site.

Help would be appreciated!

Thank you,

Fred


ASKER CERTIFIED SOLUTION
Brad Howe
SOLUTION
Yes, Junipllc is correct: the flags are [R=301,L].

Thanks bud,
Hades666
You got it -- thanks for the tip. I'm filing that in the "future use" bucket since I know I'm going to need it. Man, I love E-E -- collaboration is a wonderful thing.
fredo783

ASKER

Hey, thanks for the info!

Question: do the rules themselves contain regular expressions?  For example, in

RewriteRule ^/states/cities/([a-z]+)$ /cities/city_list [L]  

is the "^/states/cities/([a-z]+)$" a regex?

I need to match city names in the form of:
san-francisco
San_Francisco
etc

Also, are the rules parsed and applied top-down - the first one that matches is used?

Thank you,

Fred
Hi Fred,

The rules themselves are regular expressions, yes. Apache 2.x uses Perl-compatible regular expressions (PCRE), and the pattern is matched against the URL path rather than the full URL. They work nicely when done correctly, and disastrously when not. Test, test, and then test again -- I know this from some very bad experiences :)

First I'd add an NC (no case) flag to your RewriteRule, like this: [NC,R=301,L]. That means ignore case, redirect with a 301, and "this is the last rule, don't apply any more". Rules are, indeed, parsed top-down, and the [L] stops that when you get a match.

You can probably replace the [a-z]+ with .* to catch not just a-z characters but all characters. That should match the - or the _ and any other punctuation...but as I said before, please test that since I didn't. :)
Actually, on second thought, if you are using .* you won't need the NC for the city name, since .* matches any character, not just lowercase a through z.

RewriteRule ^/states/cities/(.*)$ /cities/city_list [L,R=301]

Perhaps that will work?
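
If .* feels too broad, a character class can be spelled out instead. Here is a minimal, untested sketch. It assumes the rule sits in the .htaccess file at the site root (or in the virtual host config) and that the old file names contain only letters, digits, underscores, hyphens and dots:

```apache
RewriteEngine On

# ^/?      optional leading slash, so the pattern works both in .htaccess
#          (where the path has no leading slash) and in server config
# [\w.-]+  one or more letters, digits, underscores, dots or hyphens,
#          e.g. san-francisco, San_Francisco, los_angeles.html
RewriteRule ^/?states/cities/[\w.-]+$ /cities/city_list [R=301,L]
```

NC is left out here because \w already covers upper and lower case. Add it back if /States/Cities/ should match as well.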
Hi, Junipllc:

Good suggestions.  

I have existing rules in .htaccess to redirect non-www URLs to www URLs:

  RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
  RewriteRule ^(.*)$ http://www.mysite.org/$1 [L,R=301]

I plan to add rules like you just specified after the above rule.

-  Does the rule you specified have to contain the full URL, as in
RewriteRule ^mysite.org/states/cities/(.*)$ /cities/city_list [L,R=301]

Or can it be specified just as you noted.

-  Also, does the L in the initial rule above terminate all further attempts to match rewrite rules - would my subsequent cities rules be missed?

Thanks,

Fred
Hi,

L is used as a conditional exit, as you state. Rules are read top to bottom; once a rule flagged [L] matches, mod_rewrite stops processing and gives the URL to the user. Download this to help in the future:
http://www.addedbytes.com/download/mod_rewrite-cheat-sheet-v2/pdf/

In the rule ^mysite.org/states/cities/(.*)$, the (.*) captures everything after /cities/. It can be /cities/abc/ed/test.php or /cities/index.html, it doesn't matter.

As for the original rule I posted with the regex ([a-z]+), that matches a string of one or more lowercase letters (a to z), so it would not match hyphens, underscores, or a file extension.
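
One caveat on the host name: a RewriteRule pattern only ever sees the URL path, so a host name written inside the pattern will not match. If a rule should depend on the host, that test goes in a RewriteCond. A small untested sketch, assuming .htaccess at the site root:

```apache
RewriteEngine On

# The host is tested here, not in the RewriteRule pattern.
# Matches mysite.org and www.mysite.org, in any letter case.
RewriteCond %{HTTP_HOST} ^(www\.)?mysite\.org$ [NC]

# The pattern sees only the path, e.g. states/cities/los_angeles.html
RewriteRule ^/?states/cities/ /cities/city_list [R=301,L]
```

On a single-site setup the RewriteCond can simply be dropped, since every request reaching this .htaccess is for that site anyway.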

Hope it helps,
Hades666
Ok, great!  Then it appears that I could drop the [L] on the first set of rules and let the subsequent rules be matched.  I want to match both www and non-www URLs, and then catch all of the /states/cities/ files.

For example:

  RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
  RewriteRule ^(.*)$ http://www.mysite.org/$1 [R=301]
  RewriteRule ^mysite.org/states/cities/(.*)$ /cities/city_list [L,R=301]
  ...

Does that make sense?  

Thanks for your excellent advice!

Fred
Hi,

The first rule has a condition, so that RewriteRule is applied only if the condition is met.

The second RewriteRule has no condition, so it is tried regardless. It carries the L flag, which stops the mod_rewrite engine once it matches.

With that said, I would remove the mysite.org prefix and keep the pattern as ^states/cities/(.*)$, without the leading slash, because in .htaccess the path is matched with the leading slash stripped. The host name is never part of what the pattern sees, and the first condition already sends non-www requests to www.
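
Putting it together, the two rules might look like the sketch below. It is untested and assumes .htaccess at the site root. On a Drupal site it also assumes the block is placed above Drupal's own index.php rewrite, so these rules are seen first. The [L] is kept on the first rule, since the Apache documentation recommends pairing [R] with [L] rather than letting the redirected URL fall through to later rules.

```apache
RewriteEngine On

# 1. Send bare-domain requests to the www host. The RewriteCond applies
#    only to the RewriteRule directly below it.
RewriteCond %{HTTP_HOST} ^mysite\.org$ [NC]
RewriteRule ^(.*)$ http://www.mysite.org/$1 [R=301,L]

# 2. No condition, so this rule is tried for every request that
#    rule 1 did not already redirect.
RewriteRule ^/?states/cities/ /cities/city_list [R=301,L]
```

A request for mysite.org/states/cities/los_angeles.html is then redirected twice: first to the www host by rule 1, and on the follow-up request to /cities/city_list by rule 2.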

Cheers,
Hades666
Thanks very much!  The information looks complete and I will implement the rules in the near future.

Cheers,

Fred
Thanks, folks.  I split points on this - your solutions really helped my understanding.

Fred