Zoo71

Trying to redirect 404 errors using htaccess but having problems

Hi Experts

We are inundated with 404 errors left over from a previous PrestaShop module and need to redirect these odd pages, 2000 of them, to the home page.  For example:

/37-XXXXXXXX-uk/availability-in_stock/bottle_size-10ml/flavour-sweet/nic_strength-6mg_ml_06?&p=4

To isolate the directory, I created this:

RewriteCond %{REQUEST_URI} 37-XXXXXXXX-uk/availability-in_stock
RewriteRule . http://www.domain.com [L]

But it appears to create a new page, http://www.domain.com/?&p=4, which I think may not be good.  Is there any way to just land on the home page itself?

Here is the htaccess

# ~~start~~ Do not remove this comment, Prestashop will keep automatically the code outside this comment when .htaccess will be generated again
# .htaccess automaticaly generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule . - [E=REWRITEBASE:/]
RewriteRule ^api/?(.*)$ %{ENV:REWRITEBASE}webservice/dispatcher.php?url=$1 [QSA,L]

# Images
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$1$2$3$4.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$1$2$3$4$5.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$1$2$3$4$5$6.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$1$2$3$4$5$6$7.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$1$2$3$4$5$6$7$8.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$1$2$3$4$5$6$7$8$9.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])([0-9])(\-[_a-zA-Z0-9-]*)?(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/p/$1/$2/$3/$4/$5/$6/$7/$8/$1$2$3$4$5$6$7$8$9$10.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^c/([0-9]+)(\-[\.*_a-zA-Z0-9-]*)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2$3.jpg [L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^c/([a-zA-Z-]+)(-[0-9]+)?/.+\.jpg$ %{ENV:REWRITEBASE}img/c/$1$2.jpg [L]

# Dispatcher
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^.*$ - [NC,L]
RewriteCond %{HTTP_HOST} ^www.domain.com$
RewriteRule ^.*$ %{ENV:REWRITEBASE}index.php [NC,L]
</IfModule>

#If rewrite mod isn't enabled
ErrorDocument 404 /index.php?controller=404

# ~~end~~ Do not remove this comment, Prestashop will keep automatically the code outside this comment when .htaccess will be generated again


RewriteCond %{HTTP_HOST} ^domain.com
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]


Any help gratefully received

Cheers
arnold

There is a custom error redirect
ERRORDOCUMENT 404 http://www.yoursite.com

http://httpd.apache.org/docs/current/custom-error.html
This has worked for me.  I have a custom error handler script.
ErrorDocument 404 /404handler.php

You might try:
ErrorDocument 404 /index.php
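
A note on the two forms above, as far as I understand Apache's documented ErrorDocument behaviour: a local path is served internally and the client still gets the 404 status, whereas a full URL makes Apache answer with a redirect, so the client never receives the original 404. A minimal sketch (the handler name is the one from this post; the URL is a placeholder):

```apache
# Local path: Apache serves this document internally and the client
# still receives the 404 status code.
ErrorDocument 404 /404handler.php

# Full URL: Apache sends a redirect to that address instead, so the
# client does not see the original 404 status.
# ErrorDocument 404 http://www.yoursite.com/
```

Which one fits depends on whether the old URLs should keep reporting "not found" or simply move visitors elsewhere.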
Zoo71

ASKER

Hi, sorry, I don't understand.  How does an ErrorDocument 404 /index.php redirect old URLs from specific directories to the home page?

You mean redirect all 404 errors to the home page?  Could that cause a duplicate content issue?
SOLUTION
Ray Paseur

This solution is only available to members.
Zoo71

ASKER

Hi

Thank you for the explanation, it does make sense.  But creating a single 404 handler is way beyond my abilities, so instead I was going to manually redirect what I can and send the rest to the home page, using the code below for each variant.

RewriteCond %{REQUEST_URI} 37-XXXXXXXX-uk/availability-in_stock
RewriteRule . http://www.domain.com [L]

But it still carries the page name across, which I do not want, as that just creates new pages.  Is there a way to prevent that?
A redirect is not what you want.

Create an HTML page, Error404.html, with a meta redirect to the home page.
http://www.w3.org/TR/WCAG-TECHS/H76.html

A delay of 0 means the user will not see the message.

The page would have content along these lines:  The page you are trying to access is no longer available.  Please make sure your link is to http://www.yourdomain.com.
You will be automatically redirected in 5 seconds. Or you can <a href="http://www.yourdomain.com">click here</a>

You then would use the examples above
ERRORDOCUMENT 404 /Error404.html
@arnold: I think that's a pretty good solution, but I still think it would be wise to send the "moved permanently" header.  You do not have to send the "location" header.  I believe that at some level the meta-refresh tag is deprecated, but all browsers currently support it and probably always will.  This is what one might look like:

<meta http-equiv="refresh" content="3;URL='/'" />    

That would be saying, "Wait three seconds (so the client can read the message on the screen) then redirect to the home page."
Ray,
To whom do you think the header is being sent and how is that interpreted/treated?
A site that has 2000 bad links ...

I thought a conditional redirect might be a way to handle it, but that would have to be based on a root of references that no longer exist.

I.e. /some path/something that is no longer available.
The problem lies with pages that are all over the place, in which case custom error handling is the only way to go.  Note that users with IE and friendly error messages enabled might not see anything, given that any error handling still responds to the browser with the HTTP status of 404.
Zoo71

ASKER

Hi Guys

This has totally confused me over the best method to use.  To tell the truth I thought it would be a case of tweaking the code I used.
You can continue using your redirect method by processing the log and extracting the common patterns from the 404 entries, but before implementing the patterns you have to make sure they do not match any valid URLs.

The custom error document is the straightforward approach, and that is what it is there for.
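
A minimal sketch of what a pattern-based rule could look like once the log has been reviewed. The first prefix is the one from the question; `/old-module-path/` is a made-up second pattern for illustration only. This assumes the rules sit above the PrestaShop-generated block, since the dispatcher rule in that block ends with [L] and would otherwise catch the request first.

```apache
RewriteEngine On

# Anchor each pattern at the start of the path so it cannot match
# somewhere in the middle of a valid URL.
RewriteCond %{REQUEST_URI} ^/37-XXXXXXXX-uk/availability-in_stock/ [OR]
# Hypothetical second pattern taken from the 404 log
RewriteCond %{REQUEST_URI} ^/old-module-path/
RewriteRule ^ http://www.domain.com/ [R=301,L]
```

As written, any query string on the old URL is still appended to the target, and R=301 makes the redirect permanent, so it is worth testing with a temporary status first.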
Zoo71

ASKER

Thank you.  Is there a way I can stop the old page name being carried across to the new page, e.g. the "?&p=4" bit at the end?
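
For what it is worth, the usual mod_rewrite idiom for this is to end the target with a bare `?`, which replaces the incoming query string with an empty one; on Apache 2.4 and later the `QSD` flag does the same. A sketch based on the rule from the question, assuming it is placed above the PrestaShop-generated block:

```apache
RewriteEngine On

# The trailing "?" gives the target an empty query string, so "?&p=4"
# is not appended to the redirect.
RewriteRule ^37-XXXXXXXX-uk/availability-in_stock/ http://www.domain.com/? [R=301,L]

# Apache 2.4+ alternative (use one rule or the other, not both):
# RewriteRule ^37-XXXXXXXX-uk/availability-in_stock/ http://www.domain.com/ [R=301,QSD,L]
```

In a .htaccess file the RewriteRule pattern is matched without the leading slash, which is why the pattern starts at `37-`.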
ASKER CERTIFIED SOLUTION
This solution is only available to members.
Zoo71

ASKER

Thank you both very much for your advice.