Trying to redirect 404 errors using .htaccess, but running into problems

Zoo71
Hi Experts

We are inundated with 404 errors left over from a previous PrestaShop module and need to redirect these odd pages, 2000 of them, to the home page.  A typical URL looks like this:

/37-XXXXXXXX-uk/availability-in_stock/bottle_size-10ml/flavour-sweet/nic_strength-6mg_ml_06?&p=4

What I am trying to do is isolate the directory, so I created this:

RewriteCond %{REQUEST_URI} 37-XXXXXXXX-uk/availability-in_stock
RewriteRule . http://www.domain.com [L]

But it appears to create a new page http://www.domain.com/?&p=4
which I think may not be good.  Is there any way just to land it on the home page itself?

Here is the htaccess

# ~~start~~ Do not remove this comment, Prestashop will keep automatically the code outside this comment when .htaccess will be generated again
# .htaccess automaticaly generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule . - [E=REWRITEBASE:/]
RewriteRule ^api/?(.*)$ %{ENV:REWRITEBASE}webservice/dispatcher.php?url=$1 [QSA,L]

# Images
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$1$2$3$4.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$1$2$3$4$5.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$1$2$3$4$5$6.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$1$2$3$4$5$6$7.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$1$2$3$4$5$6$7$8.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$1$2$3$4$5$6$7$8$9.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$8/$1$2$3$4$5$6$7$8$9$10.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^c/([0-9]+)(\-[\.*_a-zA-Z0-9-]*)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^c/([a-zA-Z-]+)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2.jpg [L]

# Dispatcher
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^.*$ - [NC,L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^.*$ %{ENV:REWRITEBASE}index.php [NC,L]
</IfModule>

#If rewrite mod isn't enabled
ErrorDocument 404 /index.php?controller=404

# ~~end~~ Do not remove this comment, Prestashop will keep automatically the code outside this comment when .htaccess will be generated again


RewriteCond %{HTTP_HOST} ^domain.com
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]


Any help gratefully received

Cheers

Distinguished Expert 2017

Commented:
There is a custom error redirect directive:
ERRORDOCUMENT 404 http://www.yoursite.com

http://httpd.apache.org/docs/current/custom-error.html
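
As a rough sketch of how the two forms of this directive differ (the file name /errors/not-found.html is a hypothetical placeholder, not something from this thread): per the Apache documentation linked above, a local path is served in place with the 404 status preserved, whereas a full URL makes Apache send a redirect to the client, so the client no longer sees the original 404 status.

```apache
# Local path: Apache serves this document itself and the client
# still receives a 404 status for the missing URL.
# (/errors/not-found.html is a hypothetical file name.)
ErrorDocument 404 /errors/not-found.html

# Full URL: Apache answers with a redirect to this address instead,
# so the original 404 status is not what the client receives.
# Only one of the two directives should be active at a time.
# ErrorDocument 404 http://www.yoursite.com/
```

Which form is preferable depends on whether search engines should keep seeing a 404 for the dead URLs or be sent elsewhere.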
Most Valuable Expert 2011
Top Expert 2016

Commented:
This has worked for me.  I have a custom error handler script.
ErrorDocument 404 /404handler.php

You might try:
ErrorDocument 404 /index.php

Author

Commented:
Hi, sorry, I don't understand.  How does an ErrorDocument 404 /index.php redirect old URLs from specific directories to the home page?

You mean redirect all 404 errors to the home page?  Could that cause a duplicate content issue?

Most Valuable Expert 2011
Top Expert 2016
Commented:
OK, technically there is no such thing as an "old" URL.  There are URLs that point to resources (web pages) and URLs that point to non-existent resources.  When a client makes a request for a URL that does not exist, the server is able to call your 404 handler script or otherwise redirect the browser to some kind of error page.

A well-thought-out 404 handler might inspect the request (looking at cookies, URL parameters, the USER_AGENT, etc.) and decide to redirect with a 301 header to tell the browser or spider that the resource has moved permanently.  It might take a bit of programming to translate old product codes to new product codes, but this would be one way to address the issue.

An easy, but slightly ham-fisted approach would be to redirect to the home page.  At least the client would be able to see something nicer than an error page.  Your SEO won't be helped by this, especially if your site has 2000 bad links.  But if the links are from external web sites to your site, eventually the world will get the message that you've reorganized.

Does that make sense?

Author

Commented:
Hi

Thank you for the explanation, it does make sense.  But instead of creating one 404 handler, which is way beyond my abilities, I was going to manually redirect what I can and send the rest to the home page, using a copy of the code below for each variation.

RewriteCond %{REQUEST_URI} 37-XXXXXXXX-uk/availability-in_stock
RewriteRule . http://www.domain.com [L]

But it still carries the page name across, which I do not want, as that just creates new pages.  Is there a way to stop that?
Distinguished Expert 2017

Commented:
A redirect is not what you want.

Create an HTML page, Error404.html, with a meta redirect to the home page:
http://www.w3.org/TR/WCAG-TECHS/H76.html

A delay of 0 means the user will not see the message.

Give the page content such as:  The page you are trying to access is no longer available.  Please make sure your link is to http://www.yourdomain.com.
You will be automatically redirected in 5 seconds. Or you can <a href="http://www.yourdomain.com">click here</a>

You would then point the directive from the examples above at that page:
ERRORDOCUMENT 404 /Error404.html
Most Valuable Expert 2011
Top Expert 2016

Commented:
@arnold: I think that's a pretty good solution, but I still think it would be wise to send the "moved permanently" header.  You do not have to send the "location" header.  I believe that at some level the meta-refresh tag is deprecated, but all browsers currently support it and probably always will.  This is what one might look like:

<meta http-equiv="refresh" content="3;URL='/'" />    

That would be saying, "Wait three seconds (so the client can read the message on the screen) then redirect to the home page."
Distinguished Expert 2017

Commented:
Ray,
To whom do you think the header is being sent and how is that interpreted/treated?
A site that has 2000 bad links ...

I thought a conditional redirect might be a way to handle it, but that would have to be based on a common root for the references that no longer exist,

i.e. /some path/something that is no longer available.
The problem lies with pages that are all over the place, in which case custom error handling is the only way to go.  Note that users with IE and friendly error messages enabled might not see anything, given that any error handling still responds to the browser with the HTTP status of 404.

Author

Commented:
Hi Guys

I am now totally confused about the best method to use.  To tell the truth, I thought it would be a case of tweaking the code I used.
Distinguished Expert 2017

Commented:
You can continue using your redirect method by processing the log and extracting the common patterns for the 404 entries, but once the patterns are built, you have to make sure they match no valid URLs before implementing them.

The custom error doc is the straightforward approach to use, and that is what it is there for.
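
A minimal sketch of what pattern-based redirects might look like once common prefixes have been extracted from the log. The second prefix, old-module-path, is purely hypothetical and stands in for whatever the 404 entries actually share. The two conditions only skip real files and directories on disk; they do not protect valid PrestaShop friendly URLs, which are virtual, so the patterns still need checking by hand as described above.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
# Never redirect requests that map to a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# One rule covers several dead prefixes via alternation.
# "old-module-path" is a hypothetical example prefix.
# The trailing "?" drops any incoming query string.
RewriteRule ^(37-XXXXXXXX-uk|old-module-path)/ http://www.domain.com/? [R=301,L]
</IfModule>
```

Because the PrestaShop dispatcher rules end with [L], a block like this would most likely need to sit above the generated section of the .htaccess to be evaluated at all.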

Author

Commented:
Thank you.  Is there a way I can stop the old page name being carried across to the new page, e.g. the "?&p=4" bit at the end?
Distinguished Expert 2017
Commented:
Presumably you are talking about a redirect. In your pattern match, you need to exclude the ?.*$ part:
/(.*)\?.*$ http://www.yourdomain.com/$1
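
A caveat on the pattern above, offered as a sketch rather than a definitive fix: in mod_rewrite, the RewriteRule pattern is matched against the URL path only, so it never sees the ?&p=4 part. The query string is carried over to the target unless the substitution ends in a bare ? or, on Apache 2.4 and later, the QSD flag is used. Assuming the rule lives in the site's root .htaccess and is evaluated before the PrestaShop dispatcher rules, the author's rule could be adjusted along these lines:

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
# Match the old module's URLs by path prefix (no leading slash in
# .htaccess context). The trailing "?" on the target discards the
# incoming query string, so
#   /37-XXXXXXXX-uk/availability-in_stock/...?&p=4
# is redirected to http://www.domain.com/ with nothing appended.
RewriteRule ^37-XXXXXXXX-uk/availability-in_stock http://www.domain.com/? [R=301,L]

# Apache 2.4+ alternative: leave the target clean and use QSD instead.
# RewriteRule ^37-XXXXXXXX-uk/availability-in_stock http://www.domain.com/ [R=301,QSD,L]
</IfModule>
```

The explicit R=301 marks the move as permanent; without an R flag, an absolute URL target still triggers a redirect, but with the default 302 status.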

A redirect deals ....

Locating and creating a redirect for every page would take the same effort as creating the individual missing pages with a notice that the page is no longer valid with ....

Your process of setting up redirects is a never-ending task: as the site changes, some pages will inevitably go away, requiring additional redirects.
A custom 404 error handler or a straight up redirect to the home page covers current and future missing pages.

Author

Commented:
Thank you both very much for your advice.
