How does mod rewriting work with this specific example?

aristanoble
Can you explain how mod_rewriting works using this as an example:
Rewrite:
http://www.domain.com/page.php?q=keyword1,keyword2,keywordn&page=x&param=i

To:
http://www.domain.com/page/keyword1,keyword2,keywordn?page=x&param=i

Where 'n', 'x', & 'i' may be any integer.

The purpose here is twofold: 1) I need the appropriate solution to that specific example, and 2) I want to learn how it was done. Thanks experts.
Hello,

I'll try to answer this the best I can.  mod_rewrite lets Apache map a clean-looking URL onto the real script and its query-string variables, which hides the variables in the URL and makes it more search engine friendly.  So your browser will send the request to:

http://www.domain.com/page/key1.key2/1/2/3

Behind the scenes, mod_rewrite will translate it to:

http://www.domain.com/page.php?q=key1.key2&n=1&x=2&i=3

1. So for your appropriate solution:

# for your .htaccess file
RewriteEngine On
RewriteRule ^page/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /page.php?q=$1&n=$2&x=$3&i=$4 [L]
RewriteRule ^page/?$ /page.php [L]


2. How it was done:

Above, ([^/]+) matches one or more characters of any kind except a slash, so it accepts letters as well as digits.  You can probably replace it with ([0-9]+) to accept digits only, but if a user then manually puts an alphabetic character in the URL, the rule no longer matches and the request may lead to an error.  The other reason to do it this way is that eventually you'll get a user who purposely tries to break the system.

The $1, $2, $3 and $4 are backreferences: each one is replaced by whatever the corresponding pair of parentheses captured.  The rewrite rules tell Apache to treat a phantom directory called 'page' as page.php in your site root folder, and to pass the phantom sub-directories to it as the variables.
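
To make the pieces of the pattern easier to see, here is an annotated sketch of the first rule in its digits-only form. It assumes the rule sits in the .htaccess file in the site root (where the pattern is matched against the path without its leading slash) and keeps the parameter names q, n, x and i from the example above.

```apache
# ^page/      the requested path must start with "page/"
# ([^/]+)     group 1: one or more characters that are not "/"  -> $1
# ([0-9]+)    groups 2 to 4: one or more digits                  -> $2, $3, $4
# $           nothing may follow the last segment
# [L]         stop processing further rules once this one matches
RewriteEngine On
RewriteRule ^page/([^/]+)/([0-9]+)/([0-9]+)/([0-9]+)$ /page.php?q=$1&n=$2&x=$3&i=$4 [L]
```

With this variant, a request such as /page/key1.key2/1/2/abc simply does not match the rule, so whether it ends in an error page depends on what other rules or files exist.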

Now for your page.php file, you'll need a conditional statement that forces the values from the URL to be integers.  Check the first variable (n) and repeat the check for the others.  Also, since your keywords are period delimited, be sure the user does not enter a period in the search criteria, or at least filter it out upon submit.

Hope this helps.

Author

Commented:
Thanks bigeven for your reply. I tried the solution and it didn't work on my system. Please also note that the query string would be comma-separated instead of period separated as you had mentioned.
Most Valuable Expert 2011
Top Expert 2016

Commented:
As a practical matter, the whole concept of URL-rewriting to improve SEO is history.  The search engines are smarter than the so-called SEO experts.  If you want to get a good position in a search, you need to pay the search engine for advertising.  It's that simple.  If you're unable or unwilling to pay for advertising, you're out of business, full stop.  Your site's listing will languish somewhere in the "additional pages" that nobody looks at.

Aside from paying for advertising, a well-ranked site has pages that contain laser-targeted HTML title tags, relevant and succinct meta-keywords, H1 tags that are concise, and text near the top of the page that articulates the purpose of the page.  And there are many relevant and well-respected web sites that point to your URL with keywords indicative of the purpose of the page.  Note relevant and well-respected -- the search engines have got the link-farms and the URL-squatters figured out.   If you're selling a product, it helps to have multiple online outlets.  Example: make a Google search for roll-up dog ramps and see what you get.  The patent owner is Ramp4Paws.  But look at how many vendors carry the product and link to the canonical web site!

If you're interested in mod_rewrite, make a Google search for mod rewrite.  It will turn up examples, tutorials, and other good learning resources.  However it may not teach you all the ins-and-outs of regular expressions.  RegEx is a language that is made up almost entirely of punctuation.  I cannot imagine how anyone could concoct a more difficult and complicated syntax.  There are whole books about the subject.  One of the resources I keep at my elbow is the "cheat sheet" that you can download from this URL.
http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/

Best of luck with it, ~Ray

Author

Commented:
Thanks Ray for your advice and suggestions. I'm definitely going to implement a lot of what you suggested. Currently, I only want to use rewrites because I just want pretty URLs. I personally would rather see domain.com/pages/term1,term2,term3?page=1&param=2 than domain.com/page.php?query=term1,term2,term3&page=1&param=2. I think the "domain.com/pages/term1,term2,term3" part looks neater, is easier on the eyes, and offers a bit of self-explanation. And on social media, where links are oft times shared, I think the preferred form would look more appealing.

Commented:
Yeah, I can see your point.  Make a Google search for "pretty URLs" and you'll probably find some good examples.  Best of luck with it, ~Ray
I see, sorry that didn't work for you. I misread the commas as periods.  I also mistook the 3rd query string for its own parameter.  The rewrite rule below moves page=1 and param=2 into the path as well.

# for your .htaccess file
# $1 will be all 3 search terms; use your PHP file to split them with a comma explode
# and to make sure that page and param are integers
RewriteEngine On
RewriteRule ^pages/([^/]+)/([^/]+)/([^/]+)/?$ /page.php?query=$1&page=$2&param=$3 [L]
RewriteRule ^pages/?$ /page.php [L]


This should parse a URL like this:

domain.com/pages/term1,term2,term3/1/2

Also, be sure that your PHP file validates all 3 terms, because malicious users will try to break the system with XSS or SQL injection there.
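
For the format in the original question, where page and param stay in the query string, a minimal sketch could rely on the QSA (query string append) flag instead of extra path segments. This assumes the rule lives in the .htaccess file in the site root and that the script reads q, page and param; adjust the path prefix and parameter names (the later posts use pages and query) to match your setup.

```apache
RewriteEngine On
# /page/keyword1,keyword2,keywordn?page=x&param=i
#   is served by /page.php?q=keyword1,keyword2,keywordn&page=x&param=i
# ([^/]+)  captures the whole comma-separated keyword list as $1
# QSA      appends the original query string (page=x&param=i) to the new one
# L        stops processing further rules once this one matches
RewriteRule ^page/([^/]+)/?$ /page.php?q=$1 [QSA,L]
```

If the rule seems to be ignored, one thing worth checking is whether MultiViews is enabled, since content negotiation can map /page/... to page.php before the rule runs; Options -MultiViews turns it off.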
