Solved

Rewriting URLs that end in .htm in htaccess

Posted on 2012-03-20
Last Modified: 2012-06-22
I am using the following rewriterule:

RewriteRule ^(.+)\.htm[^l]+$ $1.htm [L,NC,QSA,R=301]

I thought this did what I want: drop any extra characters after a URL that ends in .htm, unless the next character is an "l" or the start of a query string. I just learned that http://www.romancestuck.com/quotes/movie-quotes.htmCarol does not get the "Carol" part cut off and is not redirected to http://www.romancestuck.com/quotes/movie-quotes.htm. How can I change this RewriteRule so that it works?

Thanks!
Question by:webstuck5
LVL 15

Expert Comment

by:babuno5
ID: 37748483
Here you go

      RewriteRule ^(.*)\.htm(.+)$ /$1.htm [R=301,L,QSA,NC]


Hope the above helps
0
 
LVL 51

Accepted Solution

by:
Steve Bink earned 2000 total points
ID: 37748909
Here is my response from your previous question:

Because there is a lower-case "L" at the end.  Perhaps a small modification:

RewriteRule ^/?(.+)\.htm[^l].*$ $1.htm [L,NC,QSA,R=301]

The new regex looks for any string that may or may not start with a literal "/" character, followed by a sequence of one or more characters, to be later identified as group 1, followed by the literal string ".htm", followed by any character that is not "L" (case-insensitive), followed by 0 or more characters.
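To make the difference concrete, here is a minimal sketch that runs both patterns from the thread through Python's `re` module. It assumes `re.IGNORECASE` is a fair stand-in for the `[NC]` flag and that the path reaches the rule without a leading slash, as it normally does in a per-directory `.htaccess` context. The helper name `redirect_target` is hypothetical and only mimics the `$1.htm` substitution, not mod_rewrite itself.

```python
import re

# Patterns copied from the thread; re.IGNORECASE stands in for [NC].
ORIGINAL = re.compile(r"^(.+)\.htm[^l]+$", re.IGNORECASE)
REVISED = re.compile(r"^/?(.+)\.htm[^l].*$", re.IGNORECASE)


def redirect_target(pattern, path):
    """Return the '$1.htm' substitution, or None if the rule does not match."""
    m = pattern.match(path)
    return f"{m.group(1)}.htm" if m else None


path = "quotes/movie-quotes.htmCarol"

# [^l]+ must run to the end of the string, but "Carol" ends in "l": no match.
print(redirect_target(ORIGINAL, path))

# [^l] only has to accept the "C"; .* then swallows "arol".
print(redirect_target(REVISED, path))

# With case-insensitive matching, [^l] also rejects "L", so this is not matched.
print(redirect_target(REVISED, "quotes/movie-quotes.htmLove"))
```

The last call shows the remaining gap: a suffix that starts with "l" or "L" is still left alone by the revised rule.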
0
 

Author Comment

by:webstuck5
ID: 37749367
I just don't understand why the rule looks for a string that may or may not start with a literal "/" character. What does the "/" character have to do with anything?
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 37750167
I posted the reasoning for that in the other question.
0
 

Author Comment

by:webstuck5
ID: 37759414
I just noticed that someone is linking to:

http://www.romancestuck.com/quotes/movie-quotes.htmLove

I wonder if there is a better way to write this redirect so that if a URL ends in .htm or .html then everything is cut off after that except for querystrings. I am trying to set up this one redirect to handle as many possible bad URL links as possible.
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 37759625
The better question is why these requests are coming in.  

Try this:

RewriteRule ^/?(.+)\.(html?).*$ $1.$2 [L,NC,QSA,R=301]


0
 

Author Comment

by:webstuck5
ID: 37760006
Google Webmaster Tools lists all these bad URL links to my site. Most of the links are meaningless; however, Google Webmaster Tools will keep displaying these URL link errors if I don't fix or redirect them. So I am trying to set up as few redirects as possible to fix them, so that I don't have to keep seeing these error messages. Sorry to bug you again, but your last redirect didn't work the way I was hoping. It redirected:

http://www.romancestuck.com/quotes/movie-quotes.htmLove

to

http://www.romancestuck.com/quotes/movie-quotes.htmL

and showed a message that said it caused too many redirects. I was hoping that it would redirect to the actual URL of:

http://www.romancestuck.com/quotes/movie-quotes.htm
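The behaviour reported here can be reproduced outside Apache. The sketch below is a simulation only, assuming `re.IGNORECASE` approximates `[NC]`; the helpers `apply_rule` and `follow` are hypothetical names. It suggests two causes: with case-insensitive matching, `html?` captures the "L" of "Love", and because the trailing `.*` may match nothing, an already clean URL matches as well and is redirected to itself.

```python
import re

# The rule from the previous comment; re.IGNORECASE stands in for [NC].
RULE = re.compile(r"^/?(.+)\.(html?).*$", re.IGNORECASE)


def apply_rule(path):
    """Return the '$1.$2' substitution, or None if the rule does not match."""
    m = RULE.match(path)
    return f"{m.group(1)}.{m.group(2)}" if m else None


def follow(path, limit=5):
    """Follow redirects until the rule stops matching or `limit` hops are made."""
    hops = [path]
    while len(hops) <= limit:
        target = apply_rule(hops[-1])
        if target is None:
            break
        hops.append(target)
    return hops


# "l?" matches the "L" in "Love", so $2 becomes "htmL".
print(apply_rule("quotes/movie-quotes.htmLove"))

# The trailing .* can be empty, so even the correct URL redirects to itself.
print(apply_rule("quotes/movie-quotes.htm"))

# The chain never terminates on its own; it only stops at the hop limit.
print(follow("quotes/movie-quotes.htmLove"))
```

A rule that requires at least one character after `.htm` (`.+` instead of `.*`) would avoid the self-redirect.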

The only page that I have that ends in .html is the main index.html page, every other page ends in .htm

In .htaccess, I now have:

  # REDIRECT URLS ENDING IN index.html
  RewriteRule ^/*index\.html.*$ / [L,NC,QSA,R=301]

So, any redirect after that should actually only need to worry about pages that end with .htm and cut off anything after .htm. Sorry to keep bothering you but I think I am getting to a really good redirect rule that will solve a ton of bad links all at once.
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 37760934
I think you're going about this the wrong way.  Using the webmaster tools, you should be removing these bad links.  Alternatively, have Google re-index your site.  Right now, you're addressing a symptom, not the problem itself, and that rarely does what you want it to do.

RewriteCond %{REQUEST_FILENAME} !index.html$
RewriteRule ^/?(.+)\.htm.+$ /$1.htm [NC,QSA,R=301]
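As a rough check of what this pair of directives does, the sketch below simulates them in Python. Several things are assumptions: `re.IGNORECASE` stands in for `[NC]`, the `RewriteCond` is tested against the URL path rather than the real `%{REQUEST_FILENAME}` (a filesystem path on a live server), and `redirect_target` is a hypothetical helper, not part of mod_rewrite.

```python
import re

# RewriteRule pattern; re.IGNORECASE stands in for [NC].
RULE = re.compile(r"^/?(.+)\.htm.+$", re.IGNORECASE)

# RewriteCond pattern; it carries no [NC], so it stays case-sensitive.
# Assumption: tested against the URL path instead of REQUEST_FILENAME.
EXCLUDE = re.compile(r"index.html$")


def redirect_target(path):
    """Return the '/$1.htm' substitution, or None if no redirect is issued."""
    if EXCLUDE.search(path):
        return None  # the negated RewriteCond fails, so the rule is skipped
    m = RULE.match(path)
    return f"/{m.group(1)}.htm" if m else None


for path in (
    "quotes/movie-quotes.htmCarol",  # extra characters are stripped
    "quotes/movie-quotes.htmLove",   # a leading "L" is stripped as well
    "quotes/movie-quotes.htm",       # .+ needs one more character: no redirect
    "index.html",                    # protected by the RewriteCond
):
    print(path, "->", redirect_target(path))
```

Without the condition, `index.html` would match too (the final "l" satisfies `.+`) and would be redirected to `/index.htm`, which is why the exclusion is there.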


0
 

Author Comment

by:webstuck5
ID: 37761030
These are links from other sites so I can't remove them and most are things like forum posts so I can't really write the webmaster to have them update the links. The problem is that a lot of people apparently don't know how to create web links, but I don't think I can do much about that. :) So in my .htaccess, I should try:

# REDIRECT URLS ENDING IN index.html
RewriteRule ^/*index\.html.*$ / [L,NC,QSA,R=301]

RewriteCond %{REQUEST_FILENAME} !index.html$
RewriteRule ^/?(.+)\.htm.+$ /$1.htm [NC,QSA,R=301]

Correct?
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 37761243
While your explanation is true, the burden for proper linking is on the one creating the link.  We all wish people would do everything right every time, but, people being people, that doesn't happen.  You're trying to account for an infinite range of errors, and that just is not going to happen.  You'll find (eventually) that it is a monumental waste of time to try.  

The first rule is liable to create a redirection loop, depending on how your DirectoryIndex directive is set.  I would remove it, but if you are determined to keep it, make sure you use the [NS] flag to prevent the loop.  Also, the first atom should be "/?".

Otherwise, that looks correct.
0
 

Author Comment

by:webstuck5
ID: 37761304
What I am trying to do is redirect as many of these bad links as possible with as few RewriteRules as possible. For a lot of them I need to set up individual redirects. I really don't have a choice. I can ignore the bad links listed in Google Webmaster Tools, but then I have to go through them to find the ones that actually need to be fixed. I can mark them as fixed and have them come back when the Google crawler finds them again, but that seems silly. So I thought that redirecting these bad links with as few redirects as possible would be the best option.
0
 

Author Comment

by:webstuck5
ID: 37762115
I am using:

# REDIRECT URLS ENDING IN index.html
RewriteRule ^/?/*index\.html.*$ / [L,NC,NS,QSA,R=301]

# REDIRECT ANY URLS THAT DON'T END IN index.html AND HAVE EXTRA CHARACTERS (OTHER THAN A querystring) AFTER .htm
  RewriteCond %{REQUEST_FILENAME} !index.html$
  RewriteRule ^/?(.+)\.htm.+$ /$1.htm [NC,QSA,R=301]

It seems to be working well. Thanks for all your help again!
0
