Search Engine Friendly (SEO) URLs and Special Characters (&$£!"[:]~#?) - URL Length & Other General Advice

I am looking for information and general advice on using special characters in the URLs of my pages, and on what that means for getting picked up by Google and search engines in general.  In particular, I would like confirmation of whether any of the following characters will prevent search engines from spidering my site (or incur any penalties).

£
$
~
#
[ ]
( )
:
;
?
| (pipe)

Some URLs may have accented characters, e.g. /capcom-gámé-3-(ps2).html - good or bad?

From what I have seen on our website, the characters : [ ] ( ) | ? have already crept into the URLs.  I understand that crawlers often stop reading a URL at the ? sign, but what about the others?

Which of the following is better:

1 - site.com/directory/photography/digital-cameras/sony/cybershot-t100
2 - site.com/directory/photography/digital-cameras/sony/cybershot-t100.html
3 - site.com/directory/photography-digital-cameras-sony-cybershot-t100.html

Lastly, is there any limit on the length of a URL beyond which it will not appear in the search results?  I like to include full descriptive paths in the URL to help people travel down the 'breadcrumbs', but is there a general guideline on where to draw the line?

I understand there are many questions here, but I am looking for general advice across the range, and hopefully some confirmation regarding the use of special characters and which ones to avoid.

Thanks!
thyros Asked:
teaperson Commented:
Part of the answer to this question lies in the audience of your website: if you are aiming for an English-speaking audience, avoid any character that can't be typed on an English-language keyboard, because provincial English speakers won't be able to type in your URL.

Regardless, I would also avoid ?, #, and : because they have specific meanings in URLs (? starts the query string, # starts the fragment, and : separates the scheme and the port), and they may confuse your users' browsers or even your own webserver.  The other characters, including accented ones, are probably tolerated, but will make for weird-looking URLs.
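
To see why ? and # are risky, here is a minimal sketch using Python's standard urllib.parse module. The URL and file names are made up for illustration. It shows how a parser cuts a URL at those characters, and how the other special characters end up percent-encoded if they are escaped properly.

```python
from urllib.parse import quote, urlsplit

# Hypothetical URL where '?' and '#' were meant to be part of the file name.
raw = "http://site.com/games/what-is-this?-(ps2)#1.html"
parts = urlsplit(raw)

print(parts.path)      # /games/what-is-this
print(parts.query)     # -(ps2)
print(parts.fragment)  # 1.html

# Accented letters and parentheses survive, but only in percent-encoded form.
# (\u00e1 is a-acute, \u00e9 is e-acute; quote() leaves '/' unescaped by default.)
encoded = quote("/capcom-g\u00e1m\u00e9-3-(ps2).html")
print(encoded)         # /capcom-g%C3%A1m%C3%A9-3-%28ps2%29.html
```

The page the server sees is no longer the one you intended once a ? or # slips into the path, while the encoded form of the other characters still works but is not something anyone would want to type.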

There are two approaches for dealing with these: either squish out any illegal characters, or map them to a plain equivalent.  So cafè could become caf (squished) or cafe (mapped).
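
Both approaches can be sketched in a few lines of Python. The function names squish and map_to_ascii are hypothetical, and treating only a-z, 0-9 and the dash as "legal" is an assumption made for this example, not a rule any search engine publishes.

```python
import re
import unicodedata


def squish(text):
    """Drop every character that is not a-z, 0-9 or '-'."""
    return re.sub(r"[^a-z0-9-]", "", text.lower())


def map_to_ascii(text):
    """Map accented letters to their base letter, then collapse
    every remaining run of other characters into a single '-'."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


print(squish("caf\u00e8"))        # caf
print(map_to_ascii("caf\u00e8"))  # cafe
print(map_to_ascii("capcom-g\u00e1m\u00e9-3-(ps2)"))  # capcom-game-3-ps2
```

Mapping usually gives the more readable result, but two different titles can collapse to the same slug, so a real site would still need some way to keep URLs unique.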

Your 3 options for having subdirectories or dashes in the URLs are probably roughly equivalent - search engines will use either the / or the - as a delimiter between the words they see in the URL.  There is probably a cost to having too many keywords in the URL: if a user searches on "sony cybershot", that only matches 2 of 8 words in the URL, which will give a lower relevancy boost than matching 2 of 4 words in site.com/sony-cybershot-t100.html.   Plus the URL just gets long and ugly and hard for anyone to type in.
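
The "2 of 8" versus "2 of 4" counts can be reproduced with a toy calculation. This is only a sketch of the arithmetic: splitting the path on /, - and . is an assumption made here, and real search engines do not publish how they tokenize or weight URLs.

```python
import re


def url_words(path):
    """Split a URL path into lowercase words on '/', '-' and '.'."""
    return [w for w in re.split(r"[/\-.]+", path.lower()) if w]


def match_ratio(query, path):
    """Return (query words found in the path, total words in the path)."""
    words = url_words(path)
    hits = sum(1 for w in query.lower().split() if w in words)
    return hits, len(words)


long_path = "/directory/photography-digital-cameras-sony-cybershot-t100.html"
short_path = "/sony-cybershot-t100.html"

print(match_ratio("sony cybershot", long_path))   # (2, 8)
print(match_ratio("sony cybershot", short_path))  # (2, 4)
```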

I don't think there's any absolute limit on the number of characters in a URL, but anything that is longer than the URL window at the top of the browser is going to be user-unfriendly.  My educated guess is that anything that long would also be viewed by Google as a sign of spam.
 
Loganathan Natarajan (LAMP Developer) Commented:
ok...
 
Serena Hsi (Marketing Consultant) Commented:
Use a robots.txt file for the most part. Read more here:
http://en.wikipedia.org/wiki/Robots.txt

Also, use a photo program such as Photoshop (paid), jAlbum (free), or GIMP (free) to watermark all your images.

http://jalbum.net/
http://www.adobe.com/products/photoshop/index.html
http://www.gimp.org/
 
Serena Hsi (Marketing Consultant) Commented:
I'd say that, for simplicity, don't add special characters to your URLs.
 
thyros (Author) Commented:
I don't see what watermarking images has to do with this question.

Anyhow, as for your second comment, I can't really avoid some of the special characters like | [ ] etc., because they are used by the script/engine powering part of our website database, so we have to use some kind of separator.

I was hoping to get a definitive guide on what characters are rejected outright by popular search engines and which ones are just tolerated, but this question seems like it is going to die soon.
 
thyros (Author) Commented:
Thanks for your help, this sounds reasonable.
Question has a verified solution.
