

Google Rank and Character Set.

Posted on 2006-11-06
Medium Priority
Last Modified: 2013-12-03

Quick question about the relationship between character sets and your ranking in Google.

I just finished converting my site from Shift_JIS to UTF-8. There were some other minor changes, such as rearranging the heading, but I generally took care to keep the keywords prominent and in the correct tags.

Google respidered the site yesterday, and my long-term ranking in the top 5 disappeared. Now I'm not visible on any page for that keyword.

I'm wondering if anybody knows of any way the character set could affect the page rank.

Things that have crossed my mind:
1. When the pages changed from Shift_JIS to UTF-8, the file size increased on average by about 15-20%.
-- Could this file size increase affect the keyword density?

2. Shift_JIS is a Japanese character encoding and the site is in Japanese. The keyword that is important to me is (in Japanese) 'natural stones', which is contained in the phrase 'natural stones store'. We use this phrase for all our link text.
-- Could changing the character set to UTF-8 have caused Google to read the Japanese text differently, so that Google now thinks 'natural stones store' is a single word?
(We are usually #1 for 'natural stones store', and this is still the case even after the change to UTF-8.)

3. Doing 'link:sitename.com' (in Google) returns that we have 12 links... Weird, since I thought that we had about 1200.
-- I had never checked this before, so I don't know if that is a result of the change or not.
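
On point 1, a size increase is expected from the conversion alone: most Japanese characters take 2 bytes in Shift_JIS and 3 bytes in UTF-8, while ASCII markup takes 1 byte in both. A minimal Python sketch of this, using a made-up sample string (not the site's actual text) and a hypothetical helper name:

```python
from typing import Tuple


def encoded_sizes(text: str) -> Tuple[int, int]:
    """Return (Shift_JIS byte count, UTF-8 byte count) for the same text."""
    return len(text.encode("shift_jis")), len(text.encode("utf-8"))


# Hypothetical sample: eight Japanese characters (kanji, hiragana, katakana).
japanese = "日本語のテキスト"
# ASCII-only markup, which costs the same in both encodings.
markup = "<h1></h1>"

print(encoded_sizes(japanese))           # 2 bytes vs 3 bytes per character
print(encoded_sizes(markup))             # identical sizes
print(encoded_sizes(markup + japanese))  # a mixed page grows by less than 50%
```

Because the markup does not grow, a whole-page increase of 15-20% is plausible for a page that mixes HTML with Japanese text. The extra bytes encode the same characters, so the text content itself is unchanged.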

any ideas?


Question by:ussher

Author Comment

ID: 17887222

OK, here is an interesting twist.

When searching on my desired keyword in Google Japan, my index page is nowhere to be found (all the results on Google page 1 are in Japanese).

Under the Google search box there is a radio button option to 'display only Japanese pages', and when this button is selected my site returns to #4.


When switching between 'display all results' and 'display only Japanese pages', the only difference is that my site pops in at #4. All the others seem to stay the same.

Any ideas?

LVL 33

Expert Comment

ID: 17889726
A change like this can impact your rankings if you declare a character set and then don't follow it - something to take a look at.  Basically, if your site doesn't follow the declaration, it is considered technically flawed, which is a ranking factor.  Also, any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed.
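
One rough way to check whether a page follows its own declaration is to read the charset from the page and attempt a strict decode with it. The sketch below is only an illustration under stated assumptions: the helper names and the simple regular expression are hypothetical, and it ignores any charset sent in HTTP headers. A failed decode proves a mismatch, but a successful one does not prove the declaration is right.

```python
import re
from typing import Optional

# Matches charset=... in a meta tag; a deliberately simple pattern for illustration.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE)


def declared_charset(raw: bytes) -> Optional[str]:
    """Return the charset named near the top of the page, or None if absent."""
    match = META_CHARSET.search(raw[:2048])
    return match.group(1).decode("ascii").lower() if match else None


def follows_declaration(raw: bytes) -> bool:
    """True if the page bytes decode cleanly with the charset the page declares."""
    name = declared_charset(raw)
    if name is None:
        return False
    try:
        raw.decode(name)  # strict decoding raises on invalid byte sequences
    except (UnicodeDecodeError, LookupError):
        return False
    return True


page = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><h1>\u5929</h1>'
print(follows_declaration(page.encode("utf-8")))      # bytes match the declaration
print(follows_declaration(page.encode("shift_jis")))  # declared UTF-8, saved as Shift_JIS
```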

Additionally, there seem to be some general SEO misperceptions here, so let's clear those up ...

"... i generally took care to keep the focus on keeping the Keywords prominent and in the correct tags."

Which tags?  If you are referring to meta tags, you should be aware that they are not typically a ranking factor.  The only meta tag that is useful is the description meta tag because search engines will sometimes use its contents as a description of your site in their results pages.  There is nothing you can put in any of the tags, however, that will help your rankings.  You can, however, harm your rankings by stuffing the tags full of keywords, in which case the search engines will identify you as a keyword spammer.

The title element, on the other hand, is not a meta tag (although some people think of it as one).  It is an important ranking factor, so you should choose your page titles carefully and include your keywords.  In terms of page content, you should provide effective, readable content in which your keywords occur naturally.

"1. ... could this file size increase affect the keyword density?"

Here's everything you need to know about keyword density -> "If you ever find yourself reading an article on the importance of keyword density in SEO you can safely assume that the author doesn’t know what they are talking about. Keyword density is not a phrase or even a concept that is used by search engineers working for Google, Yahoo or MSN, it is simply a fiction invented by the lower echelons of the SEO community." (source - Michael Duz www.seo-blog.com/keyword-density.php)

If you want a more technical explanation, read 'The Keyword Density of Non-Sense' by Dr. Edel Garcia.  It explains how search engines view text content.

"3. doing 'link:sitename.com'(in google) returns that we have 12 links.....Weird...since i thought that we had about 1200.
-- I had never checked this before so i don't know if that is a result of the change or not."

Google only returns a sample of the links they know about.  Also, when it comes to incoming links, quality is more important than quantity.  If you have a lot of links from link farms & low-end pages and reciprocal linking schemes (link exchanges), Google may not know about them and it wouldn't help you if they did.

Author Comment

ID: 17895216
Cheers humeniuk,

I did not know that about keyword density, thanks.

I suspect I have found the source of the problem.
My site is: http://sena-cos.com

When I use the 'keyword density tool' from here

with the title from the <h1> tags (it's in Japanese so will not render properly on Experts Exchange)
<h1>XXXXX SenaCos</h1>

The keyword I want to rank highly for is the first three characters
'XXXXX senacos'
'XXX'                   = 'natural stones'
      'XX'               = 'store'
             'senacos' = (our company name)

1. Enter the site name into the 'keyword density tool'
2. Press submit
3. Change the Firefox browser encoding to UTF-8 (VIEW -> CHARACTER ENCODING -> UTF8)

Under the title 'Page elements', the tool
lists the things that its spider found on the page, and the first character is a '?'.

The rest of the characters are OK and look correct, except that every instance of the first character is unreadable.
So to this site my h1 tags would read
<h1>?XXXX senacos</h1>

The character in question is:
UTF-8 bytes:   E5 A4 A9
HTML code:     &#x5929;

If I change my source code and replace the first character with its HTML code equivalent, then the character renders correctly using the same process.
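
The relationship between the two codes above can be reproduced in Python. This is only an illustrative sketch (to_ncr is a hypothetical helper name): it shows that E5 A4 A9 is the UTF-8 byte sequence for the code point U+5929, and it builds the hexadecimal numeric character reference for each non-ASCII character.

```python
import html


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal numeric character reference."""
    return "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text)


ch = "\u5929"  # the character discussed above

print(ch.encode("utf-8").hex(" ").upper())  # its UTF-8 bytes as hex
print(to_ncr(ch + " SenaCos"))              # ASCII characters are left untouched
print(html.unescape(to_ncr(ch)) == ch)      # the reference decodes back to the character
```

A numeric reference is plain ASCII, so it survives even when a tool misreads the page's encoding, which would be consistent with the behaviour described above.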

I suspect this could be the reason that my page ranks #4 on Google when the Google setting is 'Display only Japanese pages' and is not visible when the setting is international.

Google search on international setting:

Google search on Japanese-only pages:

Any ideas on this?

LVL 33

Expert Comment

ID: 17903590
Hmmm - it sounds like you're on the right track.  Google recommends looking at your page with a text browser such as Lynx to get an idea of what their spiders see (you can also use a simulator like this - www.delorie.com/web/lynxview.html).  That is how they see pages - not necessarily the same as how IE or Firefox will render them.  It tends to uncover the kinds of problems that you have uncovered here.
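
If no text browser is at hand, a very rough text-only view can be approximated with Python's standard library. This is a sketch under assumptions, not a reproduction of how Lynx or Google's spider works; the class and function names are hypothetical, and the encoding has to be supplied by the caller.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content, skipping anything inside script and style elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def text_view(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode the page bytes (bad bytes become U+FFFD) and return its text, one chunk per line."""
    parser = TextOnly()
    parser.feed(raw.decode(encoding, errors="replace"))
    parser.close()
    return "\n".join(parser.chunks)


sample = "<html><head><style>h1{}</style></head><body><h1>\u5929 SenaCos</h1></body></html>"
print(text_view(sample.encode("utf-8")))      # readable heading text
print(text_view(sample.encode("shift_jis")))  # same page misread as UTF-8
```

Decoding with the wrong encoding makes the unreadable characters show up directly in the text output, which is the kind of problem a text-only view is meant to expose.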

I haven't worked with Japanese characters myself, though, so I can't say too much about this specific situation.

Author Comment

ID: 17903961
Hi humeniuk,

Great idea with the text browser.  I'll spend a bit more time looking for one that can work with CJK (Chinese, Japanese, Korean) displays.  Unfortunately, Lynx apparently doesn't work very well with CJK in UTF-8 mode.

The link to delorie.com above just returned all unreadable characters as well.

I have found that the character I'm chasing, E5 A4 A9, is being rendered as EF BF BD in the 'keyword density tool'. A little investigation shows that EF BF BD is the UTF-8 encoding of U+FFFD, the Unicode replacement character, which a decoder substitutes for bytes it cannot interpret. It is not the byte order mark (BOM) sometimes attached to UTF-8 documents, which is EF BB BF.
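
The byte sequences mentioned here can be checked in Python: EF BF BD is U+FFFD (the Unicode replacement character), while the UTF-8 byte order mark is EF BB BF. The sketch also shows one way the replacement character can arise, namely Shift_JIS bytes being read as UTF-8. That scenario is an assumption for illustration only, not a diagnosis of what the tool actually did.

```python
ch = "\u5929"  # the character being chased in this thread

# Encode as Shift_JIS, then (wrongly) decode as UTF-8, substituting U+FFFD for bad bytes.
misread = ch.encode("shift_jis").decode("utf-8", errors="replace")

print(misread.encode("utf-8").hex(" ").upper())   # begins with EF BF BD
print("\ufffd".encode("utf-8").hex(" ").upper())  # replacement character
print("\ufeff".encode("utf-8").hex(" ").upper())  # byte order mark
```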

Here is another weird thing.

If I go through a proxy (proxify.com), look up Google Japan (google.co.jp), and enter my keyword there, then I DO appear at the #4 spot.
Here is that link with the keyword entered:

Any thoughts welcomed.
LVL 33

Accepted Solution

humeniuk earned 1500 total points
ID: 17908008
It is certainly an odd situation.  I know that Google has worked at dealing with circumstances like this, but I'm not sure quite what to suggest.  There may be some benefit to contacting Google directly or posting in the appropriate Google Group to see what kind of insight you can get there.

Author Comment

ID: 17935590
An update on this.

I did post to the Google newsgroup 'Google Webmaster Help' but got no reply.  However, my site has appeared in google.co.jp, so I'm very happy.  I don't know why, as I have not altered anything.

Thank you humeniuk for your thoughts and ideas.
LVL 33

Expert Comment

ID: 17936091
As noted in my first post, "... any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed."  I would say there is a good chance that was the case here.

Author Comment

ID: 17936528
OK smart guy.  Explain the reason why it was visible by proxy... ;)

Thanks for your help humeniuk. Honestly much appreciated.
LVL 33

Expert Comment

ID: 17947374
"OK smart guy.  Explain the reason why it was visible by proxy... ;)"

Different data centers can sometimes give different results during the update period.

Or ...

One of the early developments in Google's move towards personalized search results is to factor in the geographic location of the person sending the search query.  I regularly received different search results than clients in the US and England (I'm in Canada).

By virtue of the fact that it simulates a different IP, a proxy inherently simulates a different geographic location, hence the possibility of different results.

As Marshall McLuhan said, "If you don't like that idea, I have others."   ;)
