Solved

Search Engine Friendly (SEO) URLs and Special Characters (&$£!"[:]~#?) - URL Length & Other General Advice

Posted on 2007-11-14
Last Modified: 2013-12-09
I am looking for information and general advice on using special characters in the URLs of my pages, and on what the implications are for getting picked up by Google and other search engines. In particular, I would like confirmation of whether any of the following characters will prevent search engines from spidering my site (or incur any penalties).

£
$
~
#
[ ]
( )
:
;
?
| (pipe)

Some URLs may have accented characters, e.g. /capcom-gámé-3-(ps2).html - good or bad?

From what I have seen on our website, the characters : [ ] ( ) | ? have already crept into the URLs. I am aware that crawlers usually stop reading at the ? sign, but what about the others?

Which of the following is better?

1 - site.com/directory/photography/digital-cameras/sony/cybershot-t100
2 - site.com/directory/photography/digital-cameras/sony/cybershot-t100.html
3 - site.com/directory/photography-digital-cameras-sony-cybershot-t100.html

Lastly, is there any limit on URL length beyond which a page is prevented from appearing in the search results? I like to include full descriptive paths in the URL to help people travel down the 'breadcrumbs', but is there a general guideline on where to draw the line?

I understand there are many questions here, but I am looking for general advice across the range, and hopefully some confirmation regarding the use of special characters and which ones to avoid.

Thanks!
Question by:thyros
7 Comments
 
LVL 36

Expert Comment

by:Loganathan Natarajan
ID: 20281598
ok...
 
LVL 19

Expert Comment

by:Serena Hsi
ID: 20359425
Use a robots.txt file for the most part. Read more here:
http://en.wikipedia.org/wiki/Robots.txt

And use a photo program like Photoshop (paid), jAlbum (free) or GIMP (free) to watermark all your images.

http://jalbum.net/
http://www.adobe.com/products/photoshop/index.html
http://www.gimp.org/
 
LVL 19

Expert Comment

by:Serena Hsi
ID: 20359435
I'd say that for simplicity, don't add special characters to your web URLs.

 

Author Comment

by:thyros
ID: 20602521
I don't see what watermarking images has to do with this question.

Anyhow, as for your second comment, I don't really have much choice about some of the special characters like | and [ ], because they are part of the script/engine powering part of our website database, so we have to use some kind of separator.

I was hoping to get a definitive guide on what characters are rejected outright by popular search engines and which ones are just tolerated, but this question seems like it is going to die soon.
 
LVL 2

Accepted Solution

by:teaperson (earned 500 total points)
ID: 20668984
Part of the answer to this question lies in the audience of your website: if you are aiming for an English-speaking audience, avoid any character that can't be typed on an English-language keyboard, because provincial English-speakers won't be able to type in your URL.

Regardless, I would also avoid ?, #, and : because they have specific meanings in URLs (? starts the query string, # starts the fragment, and : separates the scheme and the port from the rest), and may confuse your users' browsers or even your own webserver. The other characters, including accented ones, are probably tolerated OK, but will make for weird-looking URLs.
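
To illustrate the point about ?, # and : having their own meanings, here is a minimal Python sketch using the standard urllib.parse module. The example URL and the escape_path helper are made up for illustration; this only shows how a generic URL parser reads such characters, not how any particular search engine treats them.

```python
from urllib.parse import quote, urlsplit

# Hypothetical product URL with '?', ':' and '#' left unescaped in the name.
raw = "http://site.com/games/what?-now:#1-(ps2).html"
parts = urlsplit(raw)

# The path stops at '?', and everything after '#' becomes the fragment,
# which browsers do not send to the server at all.
print(parts.path)      # /games/what
print(parts.query)     # -now:
print(parts.fragment)  # 1-(ps2).html


def escape_path(path):
    """Percent-encode everything except letters, digits, '_.-~' and '/'."""
    return quote(path, safe="/")


# Accented letters, parentheses, pipes and brackets survive only in escaped form.
print(escape_path("/capcom-g\u00e1m\u00e9-3-(ps2).html"))
print(escape_path("/sony|cybershot[t100]"))
```

So a name containing ? or # is silently cut short unless it is escaped, and the escaped form of the other characters is what makes the URL look weird.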

There are two approaches for dealing with these: either squish out any illegal characters, or map them to a close legal equivalent. So cafè could become caf (squished) or cafe (mapped).
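
The two approaches could be sketched roughly as follows in Python. The function names squish_slug and map_slug are hypothetical, and turning every other separator (spaces, pipes, brackets) into a hyphen is an assumption of this sketch rather than anything your script/engine requires.

```python
import re
import unicodedata


def _hyphenate(ascii_text):
    """Lowercase, then turn each run of non-alphanumerics into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def squish_slug(text):
    """Squish: simply drop every non-ASCII character."""
    ascii_only = text.encode("ascii", "ignore").decode("ascii")
    return _hyphenate(ascii_only)


def map_slug(text):
    """Map: split accented letters into base letter + accent, then drop the accent."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return _hyphenate(ascii_only)


print(squish_slug("caf\u00e8"))                    # caf
print(map_slug("caf\u00e8"))                       # cafe
print(map_slug("Capcom G\u00e1m\u00e9 3 (PS2)"))   # capcom-game-3-ps2
```

Characters with no base letter, such as £, are dropped by both versions, so you would need an explicit lookup table if you want them mapped to something like "gbp".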

Your 3 options for having subdirectories or dashes in the URLs are probably roughly equivalent - search engines will use either the / or the - as a delimiter between the words they see in the URL. There is probably a cost to having too many keywords in the URL: if a user searches on "sony cybershot", that only matches 2 of 8 words in one of your long URLs, which will give a lower relevancy boost than matching 2 of 4 words in site.com/sony-cybershot-t100.html. Plus the URL just gets long and ugly and hard for anyone to type in.
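
The "2 of 8" versus "2 of 4" arithmetic can be reproduced with a toy word count like the one below. This is only an illustration of the counting, not a real ranking formula; the helper names are hypothetical, and treating the .html extension as a word is an assumption made so that the totals match the figures above.

```python
import re
from urllib.parse import urlsplit


def url_words(url):
    """Split the path (not the host name) into words on '/', '-' and '.'."""
    path = urlsplit("//" + url).path
    return [word for word in re.split(r"[/\-.]+", path.lower()) if word]


def match_counts(query, url):
    """Return (number of query words found in the URL, total words in the URL)."""
    words = url_words(url)
    hits = sum(1 for term in query.lower().split() if term in words)
    return hits, len(words)


long_url = "site.com/directory/photography-digital-cameras-sony-cybershot-t100.html"
short_url = "site.com/sony-cybershot-t100.html"

print(match_counts("sony cybershot", long_url))   # (2, 8)
print(match_counts("sony cybershot", short_url))  # (2, 4)
```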

I don't think there's any absolute limit on the number of characters in a URL, but anything that is longer than the URL window at the top of the browser is going to be user-unfriendly.  My educated guess is that anything that long would also be viewed by Google as a sign of spam.
 

Author Closing Comment

by:thyros
ID: 31409179
Thanks for your help, this sounds reasonable.

Question has a verified solution.
