ussher

Google Rank and Character Set.

Hi,

Quick question about the relationship between character sets and your ranking in Google.

I just finished converting my site from Shift_JIS to UTF-8.  There were some other minor changes, such as a rearrangement of the heading, but I generally took care to keep the focus on keeping the Keywords prominent and in the correct tags.

Google re-spidered the site yesterday and my long-term ranking in the top 5 disappeared; now I'm not visible on any page for that keyword.

I'm wondering if anybody knows anything about the character set affecting the page rank.

Things that have crossed my mind:
1. When the pages changed from Shift_JIS to UTF-8, the file size increased on average by about 15-20%.
-- could this file size increase affect the keyword density?

2. Shift_JIS is a Japanese character encoding and the site is in Japanese.  The keyword that is important to me is (in Japanese) 'natural stones', which is contained in the phrase 'natural stones store'.  We use this phrase for all our link text.
-- Could changing the character set to UTF-8 have caused Google to read the Japanese text differently, so that Google now thinks 'natural stones store' is a single word?
(We are usually #1 for 'natural stones store' and this is still the case even after the change to UTF-8.)

3. Doing 'link:sitename.com' (in Google) returns that we have 12 links... Weird, since I thought that we had about 1200.
-- I had never checked this before so I don't know if that is a result of the change or not.
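
On point 1, a size increase is what the two encodings would predict: Shift_JIS stores each kanji in two bytes, while UTF-8 uses three, and ASCII markup takes one byte in both. A minimal Python sketch, using the keyword 天然石 (decoded from the search URLs posted later in this thread; the helper name `encoded_sizes` is made up for illustration), shows that only the byte count changes, not the number of characters:

```python
def encoded_sizes(text: str) -> dict:
    """Return the character count and the byte length in each encoding."""
    return {
        "characters": len(text),
        "shift_jis_bytes": len(text.encode("shift_jis")),
        "utf8_bytes": len(text.encode("utf-8")),
    }


keyword = "\u5929\u7136\u77f3"  # 天然石
sizes = encoded_sizes(keyword)
print(sizes)
# {'characters': 3, 'shift_jis_bytes': 6, 'utf8_bytes': 9}
```

Because the text itself is unchanged, any measure based on words or characters should come out the same before and after the conversion.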


any ideas?

Cheers

Michael
ussher

ASKER

Hi,

Ok, here is an interesting twist.

When searching on my desired keyword in Google Japan, my index page is nowhere to be found (all the results on page 1 are in Japanese).

Under the Google search box there is a radio button option to 'display only Japanese pages', and when this button is checked my site returns to #4.

Weird.

When switching between 'display all results' and 'display only Japanese pages', the only difference is that my site pops in at #4.  All the others seem to stay the same.

Any ideas?

A change like this can impact your rankings if you declare a character set and then don't follow it - something to take a look at.  Basically, if your site doesn't follow the declaration, it is considered technically flawed, which is a ranking factor.  Also, any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed.
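
One rough way to test whether a page follows its declared character set is to decode the raw bytes strictly with that encoding. The sketch below is an illustration only; the function name `follows_declared_charset` and the sample markup are assumptions, not anything taken from the site in question:

```python
def follows_declared_charset(raw: bytes, declared: str) -> bool:
    """True if every byte sequence in `raw` is valid in the declared encoding."""
    try:
        raw.decode(declared, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


sample = "<h1>\u5929\u7136\u77f3</h1>"   # sample markup containing 天然石
utf8_page = sample.encode("utf-8")       # file actually saved as UTF-8
sjis_page = sample.encode("shift_jis")   # file still saved as Shift_JIS

print(follows_declared_charset(utf8_page, "utf-8"))  # True
print(follows_declared_charset(sjis_page, "utf-8"))  # False
```

A strict decode that succeeds is not proof that the declaration is right (some single-byte encodings accept any input), but a failure is a clear sign that the file and its declaration disagree.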

Additionally, there seem to be some general SEO misperceptions here, so let's clear those up ...

"... i generally took care to keep the focus on keeping the Keywords prominent and in the correct tags."

Which tags?  If you are referring to meta tags, you should be aware that they are not typically a ranking factor.  The only meta tag that is useful is the description meta tag because search engines will sometimes use its contents as a description of your site in their results pages.  There is nothing you can put in any of the tags, however, that will help your rankings.  You can, however, harm your rankings by stuffing the tags full of keywords, in which case the search engines will identify you as a keyword spammer.

The title element, on the other hand, is not a meta tag (although some people think of it as one).  It is an important ranking factor, so you should choose your page titles carefully and include your keywords.  In terms of page content, you should provide effective, readable content in which your keywords occur naturally.

"1. ... could this file size increase affect the keyword density?"

Here's everything you need to know about keyword density -> "If you ever find yourself reading an article on the importance of keyword density in SEO you can safely assume that the author doesn’t know what they are talking about. Keyword density is not a phrase or even a concept that is used by search engineers working for Google, Yahoo or MSN, it is simply a fiction invented by the lower echelons of the SEO community." (source - Michael Duz www.seo-blog.com/keyword-density.php)

If you want a more technical explanation, read 'The Keyword Density of Non-Sense' by Dr. Edel Garcia.  It explains how search engines view text content.

"3. doing 'link:sitename.com'(in google) returns that we have 12 links.....Weird...since i thought that we had about 1200.
-- I had never checked this before so I don't know if that is a result of the change or not."

Google only returns a sample of the links they know about.  Also, when it comes to incoming links, quality is more important than quantity.  If you have a lot of links from link farms & low-end pages and reciprocal linking schemes (link exchanges), Google may not know about them and it wouldn't help you if they did.
ussher

ASKER

cheers humeniuk,

I did not know about the keyword density, thanks.

I suspect I have found the source of the problem.
My site is: http://sena-cos.com

When I use the 'keyword density tool' from here
http://www.ranks.nl/cgi-bin/ranksnl/spider/spider.cgi?lang=

with the title from the <h1> tags (it's in Japanese so it will not render properly on Experts Exchange)
<h1>XXXXX SenaCos</h1>

The keyword I want to rank highly for is the first three characters
'XXXXX senacos'
'XXX'                   = 'natural stones'
      'XX'               = 'store'
             'senacos' = (our company name)

1. Enter site name into 'keyword density tool'
2. Press submit
3. Change Firefox browser encoding to UTF-8 (VIEW -> CHARACTER ENCODING -> UTF8)

Under the title 'Page elements', the tool lists the things that its spider found on the page, and the first character is a '?'.

The rest of the characters look correct, but every instance of the first character is unreadable,
so to this site my h1 tags would read
<h1>?XXXX senacos</h1>

The character in question is:
UTF-8 bytes:   E5 A4 A9
html code:     &#x5929;

If I change my source code and replace the first character with its html code equivalent, then the character renders correctly using the same process.
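
The two representations quoted above can be cross-checked with a few lines of Python: E5 A4 A9 is a UTF-8 byte sequence, and &#x5929; (with its terminating semicolon) is a numeric character reference. This is only a sanity check of the values, not a diagnosis of the tool's behaviour:

```python
import html

utf8_bytes = bytes.fromhex("E5A4A9")
from_bytes = utf8_bytes.decode("utf-8")
from_reference = html.unescape("&#x5929;")

print(hex(ord(from_bytes)))          # 0x5929
print(from_bytes == from_reference)  # True: both spell the same character

# The reference form consists of ASCII characters only, which may be why
# it comes through intact when the raw UTF-8 bytes do not.
print("&#x5929;".isascii())          # True
```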

I suspect this could be the reason that my page ranks #4 on Google when the setting is "Display only Japanese pages" and is not visible when it is on international.

Google search on international setting:
http://www.google.co.jp/search?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=

Google search on Japanese-only pages:
http://www.google.co.jp/search?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=lang_ja
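
For readers who cannot read the percent-encoded query strings, Python's standard `urllib.parse` can unpack them. The sketch below uses the second URL as posted; the first URL differs only in that its `lr` parameter is empty:

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://www.google.co.jp/search?hl=ja"
       "&q=%E5%A4%A9%E7%84%B6%E7%9F%B3"
       "&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=lang_ja")

# keep_blank_values=True so an empty "lr=" would still show up in the result
params = parse_qs(urlsplit(url).query, keep_blank_values=True)

print(params["q"][0])   # 天然石 (the search keyword, decoded as UTF-8)
print(params["lr"][0])  # lang_ja (the language restriction)
```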

Any ideas on this?
Hmmm - it sounds like you're on the right track.  Google recommends looking at your page with a text browser such as Lynx to get an idea on what their spiders see (you can also use a simulator like this - www.delorie.com/web/lynxview.html).  That is how they see pages - not necessarily the same as how IE or Firefox will render them.  It tends to uncover the kinds of problems that you have uncovered here.

I haven't worked with Japanese characters myself, though, so I can't say too much about this specific situation.
ussher

ASKER

Hi humeniuk,

Great idea with the text browser.  I'll spend a bit more time looking for one that can work with CJK (Chinese, Japanese, Korean) displays.  Unfortunately, Lynx apparently doesn't work very well with CJK in UTF-8 mode.

The link to delorie.com above just returned all unreadable characters as well.

I have found that the character I'm chasing, E5 A4 A9, is being rendered as EF BF BD in the 'keyword density tool'.  With a little investigation I found that
EF BF BD
is the UTF-8 encoding of U+FFFD, the Unicode replacement character that decoders substitute for bytes they cannot interpret.  It is not the byte order mark (BOM) sometimes attached to UTF-8 documents, which is EF BB BF.
(http://smontagu.damowmow.com/utf8test.html)
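
These byte values can be checked with Python's standard library. The sketch below decodes EF BF BD and looks up its Unicode name, prints the UTF-8 byte order mark for comparison, and shows one way a lenient decoder ends up emitting that sequence. The truncated input is a made-up illustration, not a diagnosis of what the tool actually did:

```python
import codecs
import unicodedata

seen_in_tool = bytes.fromhex("EFBFBD").decode("utf-8")
print(f"U+{ord(seen_in_tool):04X}", unicodedata.name(seen_in_tool))

# The UTF-8 byte order mark, for comparison with EF BF BD
print(codecs.BOM_UTF8.hex())

# Hypothetical damage: the last byte of E5 A4 A9 goes missing in transit.
truncated = bytes.fromhex("E5A4")
lenient = truncated.decode("utf-8", errors="replace")
print("\ufffd" in lenient)  # the decoder substituted U+FFFD for the broken bytes
```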




Here is another weird thing.

If I go through a proxy (proxify.com), open Google Japan (google.co.jp), and enter my desired keyword there, then I DO appear at the #4 spot.
Here is that link with the keyword entered:
https://proxify.com/p/011010A1000110/687474703a2f2f7777772e676f6f676c652e636f2e6a702f736561726368?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=

any thoughts welcomed.
ussher

ASKER

An update on this.

I did post to the Google newsgroup 'Google Webmaster Help' but got no reply.  However, my site has appeared in google.co.jp so I'm very happy.  I don't know why, as I have not altered anything.

Thank you humeniuk for your thoughts and ideas.
As noted in my first post, "... any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed."  I would say there is a good chance that was the case here.
ussher

ASKER

OK smart guy.  Explain the reason why it was visible by proxy... ;)

Thanks for your help humeniuk. Honestly much appreciated.
"OK smart guy.  Explain the reason why it was visible by proxy... ;)"

Different data centers can sometimes give different results during the update period.

Or ...

One of the early developments in Google's move towards personalized search results is to factor in the geographic location of the person sending the search query.  I regularly received different search results than clients in the US and England (I'm in Canada).

By virtue of the fact that it simulates a different IP, a proxy inherently simulates a different geographic location, hence the possibility of different results.

As Marshall McLuhan said, "If you don't like that idea, I have others."   ;)