New Site - Old 404's/301's

erzoolander
We recently updated our website, and now I'm addressing 301's.

A good percent of the 404's are to things that simply existed on the old server...like backups of old email promotions...old PDFs from promotions a few years ago/etc.  We just didn't migrate those files over to the new server.

...and as you might imagine, there are hundreds of these things.

What do you do for 301's when the 404's are for content that is obsolete, not really part of the site, etc?  Do I really need to go into htaccess and figure out an appropriate redirect for every single eblast that I chose not to migrate over, or PDF from some campaign a couple of years ago?

Yes, you do. Apache cannot tell whether a 404 is caused by an obsolete file or by a programmer error.

You could use a 404 manager, like this one: https://github.com/swt83/php-laravel-404
With code like this you could handle all 404 errors in one place, check a database for obsolete content, and return a 404 only when really necessary.
Lucas Bishop, Marketing Technologist
Commented:
If you are receiving 404 errors, then odds are, real people are trying to access these files. That means that not only are those pages likely ranked in Google, but there are people who are having a poor experience when clicking on the links.

If you can afford to turn visitors away, then leaving the 404's in place is fine. However, it sounds like you have a bit of an opportunity available to you that should not be ignored. Redirect these visitors to a useful place. Give them a good experience.

If the old newsletters share a common URL path, you could set up a blanket 301 redirect for them, so that any clicks into the newsletter archives are redirected to your news/blog/etc. page instead. Then you only have to set up one rule in total.

For example, if the url contains "/newsletter/*" then redirect to "/blog"

Using a simple htaccess rule:
RewriteEngine on
RewriteBase /
RewriteRule ^newsletter/(.*)$ /blog/ [R=301,L]

If someone clicks on a link to:
http://example.com/newsletter/some-old-eblast-volume1-issue2.html

They will be redirected to:
http://example.com/blog/
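
Before deploying a blanket rule, it can help to check which old URLs it would actually catch. The following is only a rough sketch in PHP, not part of the original answer: the helper name matches_newsletter_rule is made up for illustration, and it assumes the rule lives in the site root's .htaccess, where mod_rewrite matches the pattern against the path without its leading slash.

```php
<?php
// Rough stand-in for: RewriteRule ^newsletter/(.*)$ /blog/ [R=301,L]
// matches_newsletter_rule() is a hypothetical helper, not an Apache or PHP API.
function matches_newsletter_rule(string $urlPath): bool
{
    // Assumption: root .htaccess context, so the leading "/" is stripped
    // before the pattern is applied.
    $relative = ltrim($urlPath, '/');
    return preg_match('#^newsletter/(.*)$#', $relative) === 1;
}

$samples = [
    '/newsletter/some-old-eblast-volume1-issue2.html',
    '/newsletter/',
    '/newsletter',
    '/blog/newsletter/issue.html',
];

foreach ($samples as $path) {
    $result = matches_newsletter_rule($path) ? 'redirects to /blog/' : 'not matched';
    printf("%-50s %s\n", $path, $result);
}
```

Note that "/newsletter" without a trailing slash is not caught by this pattern, so it may need its own rule if old links point there.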

Also, if any old URLs still receive a significant number of 404 visits, consider retrieving what those visitors are looking for, creating a new page that holds the expected content, and redirecting the old URL to that new page.

shalomc
Commented:
If you have a large number of old links without a common path structure, then managing the list in htaccess becomes unwieldy very fast.
A solution might be to keep the list of old links and their redirects in a database, and have a piece of code deal with it. Something like the following PHP code.
Save it as 404.php and add the following to your .htaccess file:
ErrorDocument 404 /404.php

I left out the database integration as an exercise for the reader.
<?php

// One way to set values into the $redirect array
$redirect = array(
    "/something/not-existing-stuff" => array(
        "url"    => "http://cnn.com/",
        "status" => 302,
    ),
    "/otherthing/otherstuff" => array(
        "url"    => "http://yahoo.com",
        "status" => 302,
    ),
);
// Another way to set values into the $redirect array
$redirect["/yetanother/some.php"]["url"] = "http://bbc.co.uk";
$redirect["/yetanother/some.php"]["status"] = 302;
// You can also integrate with a database.

// Path of the originally requested URL, without the query string
$request_path = parse_url($_SERVER["REQUEST_URI"], PHP_URL_PATH);

if (isset($redirect[$request_path])) {
    header('Location: ' . $redirect[$request_path]["url"],
        true,
        $redirect[$request_path]["status"]);
    // Instruct browsers, proxies and CDNs not to cache
    header('Cache-Control: private, no-cache');

    exit();
}

// No redirect known: fall through to the custom 404 page below
header('Cache-Control: private, no-cache');

?>

<h1>404 - Page not found</h1>
Place your custom HTML here
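
For readers who want a starting point for the database part, here is a minimal sketch using PDO. Everything specific in it is an assumption made for illustration: the table name redirects, its columns old_path, new_url and status, the helper find_redirect, and the in-memory SQLite database used for the demo (which needs the pdo_sqlite extension). A real setup would connect to whatever database you already run.

```php
<?php
// Hypothetical schema (an assumption, not from the original answer):
//   redirects(old_path TEXT PRIMARY KEY, new_url TEXT, status INTEGER)

// Returns array('url' => ..., 'status' => ...) for a known old path, or null.
function find_redirect(PDO $db, string $requestPath): ?array
{
    $stmt = $db->prepare(
        'SELECT new_url, status FROM redirects WHERE old_path = :path'
    );
    $stmt->execute([':path' => $requestPath]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null; // unknown path: let 404.php show the normal 404 page
    }
    return ['url' => $row['new_url'], 'status' => (int) $row['status']];
}

// Demo against an in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    'CREATE TABLE redirects (
        old_path TEXT PRIMARY KEY,
        new_url  TEXT NOT NULL,
        status   INTEGER NOT NULL
    )'
);
$db->exec("INSERT INTO redirects VALUES ('/newsletter/old-eblast.html', '/blog/', 301)");

$hit  = find_redirect($db, '/newsletter/old-eblast.html');
$miss = find_redirect($db, '/never-existed.html');
```

In 404.php, the isset($redirect[$request_path]) check could then be swapped for a call to find_redirect($db, $request_path), sending the Location header only when the result is not null.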

Marketing Technologist
Commented:
Another option to address the scenario shalomc presents (a large number of old links without a common path structure) can be put together fairly easily with a spreadsheet.

1.) Export the file names of all the 404 errors from your server log
2.) Column A = "Redirect 301"
3.) Column B = The list of exported 404 URL paths, each starting with "/" (for example /blah.html)
4.) Column C = File/Directory where people should be redirected
5.) Column D =
=CONCATENATE(A2," ",B2," ",C2)


You can then use 'Column D' to fill out an htaccess file that redirects people to the appropriate locations specified in Column C.

The spreadsheet would look something like:
[Screenshot: redirect creator]
Meanwhile the htaccess file would only contain Column D:
Redirect 301 /random-oldfile.html /newfile.html
Redirect 301 /another-random-oldfile.html /newfile.html
Redirect 301 /yet-another-random-oldfile.html /newfile.html
Redirect 301 /blah.html /blog/
Redirect 301 /nascetur.html /about-us.html
Redirect 301 /id.html /about-us.html
Redirect 301 /velit.html /about-us.html
Redirect 301 /aliquam.html /about-us.html
Redirect 301 /lacinia.html /about-us.html
Redirect 301 /sit.html /about-us.html
Redirect 301 /morbi.html /about-us.html
Redirect 301 /dictum.html /about-us.html
Redirect 301 /fermentum.html /about-us.html
Redirect 301 /Fusce.html /about-us.html
Redirect 301 /Phasellus.html /about-us.html
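
If you would rather script this than use a spreadsheet, the same concatenation can be sketched in a few lines of PHP. This is only an illustration under assumptions: the function name build_redirect_lines and the sample mapping are invented here, and the old paths are assumed to contain no spaces (paths with spaces would need quoting in htaccess).

```php
<?php
// Builds "Redirect 301 <old-path> <target>" lines from an old => new mapping,
// mirroring the CONCATENATE(A2," ",B2," ",C2) formula.
// build_redirect_lines() is a hypothetical helper name.
function build_redirect_lines(array $map): array
{
    $lines = [];
    foreach ($map as $oldPath => $target) {
        // The Redirect directive expects the old URL-path to begin with a
        // slash, so add one if the exported name lacks it.
        $oldPath = '/' . ltrim((string) $oldPath, '/');
        $lines[] = "Redirect 301 {$oldPath} {$target}";
    }
    return $lines;
}

// Sample mapping (made-up entries in the style of the list above).
$lines = build_redirect_lines([
    'blah.html'      => '/blog/',
    '/nascetur.html' => '/about-us.html',
]);

echo implode("\n", $lines), "\n";
```

The output can be pasted into the htaccess file in place of Column D.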
