Solved

Google Rank and Character Set.

Posted on 2006-11-06
Last Modified: 2013-12-03
Hi,

Quick question about the relationship between character sets and your ranking in Google.

I just finished converting my site from Shift_JIS to UTF-8.  There were some minor changes besides this, including a rearrangement of the heading, but I generally took care to keep the keywords prominent and in the correct tags.

Google respidered the site yesterday, and my long-term ranking in the top 5 disappeared; now I'm not visible on any page for that keyword.

I'm wondering if anybody knows of any way the character set could affect a page's ranking.

Things that have crossed my mind:
1. When the pages changed from Shift_JIS to UTF-8, the file size increased on average by about 15-20%.
-- Could this file size increase affect the keyword density?

2. Shift_JIS is a Japanese character encoding and the site is in Japanese.  The keyword that is important to me is (in Japanese) 'natural stones', which is contained in the phrase 'natural stones store'.  We use this phrase for all our link text.
-- Could changing the character set to UTF-8 have caused Google to read the Japanese text differently, so that Google now thinks 'natural stones store' is a single word?
(We are usually #1 for 'natural stones store', and this is still the case even after the change to UTF-8.)

3. Doing 'link:sitename.com' (in Google) returns that we have 12 links. Weird, since I thought that we had about 1200.
-- I had never checked this before, so I don't know if that is a result of the change or not.
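
On point 1, the size increase by itself is expected: Shift_JIS stores most Japanese characters in 2 bytes, UTF-8 stores the same characters in 3 bytes, and ASCII markup takes 1 byte in both. A minimal Python sketch, assuming the keyword is 天然石 (decoded from the search URL quoted later in this thread) and using a made-up HTML fragment:

```python
# Compare the encoded size of the same text in Shift_JIS and UTF-8.
# The keyword and the HTML fragment are illustrative assumptions.
keyword = "天然石"
fragment = '<h1 class="title">天然石 SenaCos</h1>'


def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (Shift_JIS byte count, UTF-8 byte count) for text."""
    return len(text.encode("shift_jis")), len(text.encode("utf-8"))


sjis, utf8 = encoded_sizes(keyword)
print(f"keyword: {sjis} bytes in Shift_JIS, {utf8} bytes in UTF-8")

sjis, utf8 = encoded_sizes(fragment)
print(f"fragment: {sjis} -> {utf8} bytes ({(utf8 - sjis) / sjis:.0%} larger)")
```

The number of characters on the page is the same in both encodings; only the byte count changes. The overall percentage depends on how much of the file is Japanese text versus ASCII markup.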


any ideas?

Cheers

Michael
Question by:ussher
 
LVL 1

Author Comment

by:ussher
ID: 17887222
Hi,

OK, here is an interesting twist.

When searching on my desired keyword in Google Japan, my index page is nowhere to be found (all the results on Google page 1 are in Japanese).

Under the Google search box there is a radio button option to 'display only japanese pages', and when this button is checked my site returns to #4.

Weird.

When switching between 'display all results' and 'display only japanese pages', the only difference is that my site pops in at #4.  All the others seem to stay the same.

Any ideas?

0
 
LVL 33

Expert Comment

by:humeniuk
ID: 17889726
A change like this can impact your rankings if you declare a character set and then don't follow it - something to take a look at.  Basically, if your site doesn't follow the declaration, it is considered technically flawed, which is a ranking factor.  Also, any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed.
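
One way to check the first point is to test whether the bytes a page serves actually decode under the charset it declares. A minimal Python sketch follows; the function name and the sample bytes are hypothetical, and it only checks decodability, not how any search engine evaluates the page:

```python
def follows_declared_charset(raw: bytes, declared: str) -> bool:
    """Return True if raw decodes without errors under the declared charset."""
    try:
        raw.decode(declared)  # strict mode raises on any invalid byte sequence
        return True
    except UnicodeDecodeError:
        return False


# Sample bytes: Japanese text saved as Shift_JIS (an assumption for illustration).
page_bytes = "天然石".encode("shift_jis")

print(follows_declared_charset(page_bytes, "shift_jis"))  # declaration matches the bytes
print(follows_declared_charset(page_bytes, "utf-8"))      # declaration does not match
```

A clean decode is necessary but not sufficient: some mismatched charsets decode without errors and simply produce the wrong characters, so it is still worth looking at the decoded text.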

Additionally, there seem to be some general SEO misperceptions here, so let's clear those up ...

"... i generally took care to keep the focus on keeping the Keywords prominent and in the correct tags."

Which tags?  If you are referring to meta tags, you should be aware that they are not typically a ranking factor.  The only meta tag that is useful is the description meta tag because search engines will sometimes use its contents as a description of your site in their results pages.  There is nothing you can put in any of the tags, however, that will help your rankings.  You can, however, harm your rankings by stuffing the tags full of keywords, in which case the search engines will identify you as a keyword spammer.

The title element, on the other hand, is not a meta tag (although some people think of it as one).  It is an important ranking factor, so you should choose your page titles carefully and include your keywords.  In terms of page content, you should provide effective, readable content in which your keywords occur naturally.

"1. ... could this file size increase affect the keyword density?"

Here's everything you need to know about keyword density -> "If you ever find yourself reading an article on the importance of keyword density in SEO you can safely assume that the author doesn’t know what they are talking about. Keyword density is not a phrase or even a concept that is used by search engineers working for Google, Yahoo or MSN, it is simply a fiction invented by the lower echelons of the SEO community." (source - Michael Duz www.seo-blog.com/keyword-density.php)

If you want a more technical explanation, read 'The Keyword Density of Non-Sense' by Dr. Edel Garcia.  It explains how search engines view text content.

"3. doing 'link:sitename.com'(in google) returns that we have 12 links.....Weird...since i thought that we had about 1200.
-- I had never checked this before so i don't know if that is a result of ther change or not."

Google only returns a sample of the links they know about.  Also, when it comes to incoming links, quality is more important than quantity.  If you have a lot of links from link farms & low-end pages and reciprocal linking schemes (link exchanges), Google may not know about them and it wouldn't help you if they did.
0
 
LVL 1

Author Comment

by:ussher
ID: 17895216
cheers humeniuk,

I did not know that about keyword density, thanks.

I suspect I have found the source of the problem.
My site is: http://sena-cos.com

When I use the 'keyword density tool' from here
http://www.ranks.nl/cgi-bin/ranksnl/spider/spider.cgi?lang=

with the title from the <h1> tags (it's in Japanese, so it will not render properly on Experts Exchange)
<h1>XXXXX SenaCos</h1>

The keyword I want to rank highly for is the first three characters:
'XXXXX senacos'
'XXX'                   = 'natural stones'
      'XX'               = 'store'
             'senacos' = (our company name)

1. Enter the site name into the 'keyword density tool'.
2. Press submit.
3. Change the Firefox browser encoding to UTF-8 (VIEW -> CHARACTER ENCODING -> UTF-8).

Under the title 'Page elements', the tool lists the things that its spider found on the page, and the first character is a '?'.

The rest of the characters look correct, but every instance of the first character is unreadable.
So to this site my h1 tags would read
<h1>?XXXX senacos</h1>

The character in question is:
UTF-8 bytes:   E5 A4 A9 (Unicode code point U+5929)
HTML code:     &#x5929;

If I change my source code and replace the first character with its HTML code equivalent, then the character renders correctly using the same process.
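
For reference, the relationship between the code point, its UTF-8 bytes, and the HTML numeric character reference can be checked in Python. This is only a sketch of the three notations; the variable names are arbitrary:

```python
import html

ch = "\u5929"  # the character in question (天)

code_point = f"U+{ord(ch):04X}"                   # Unicode code point
utf8_bytes = ch.encode("utf-8").hex(" ").upper()  # bytes as stored in a UTF-8 file
char_ref = f"&#x{ord(ch):X};"                     # HTML reference, with trailing semicolon

print(code_point, "|", utf8_bytes, "|", char_ref)

# The reference is plain ASCII, so it survives any encoding mix-up
# and decodes back to the same character.
print(html.unescape(char_ref) == ch)
```

Because the character reference is pure ASCII, it reads the same whether a tool treats the page as Shift_JIS or UTF-8, which is consistent with the workaround rendering correctly.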

I suspect this could be the reason that my page ranks #4 on Google when the Google setting is on "Display only japanese pages" and is not visible when it is on the international setting.

Google search on international setting:
http://www.google.co.jp/search?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=

Google search on japanese only pages
http://www.google.co.jp/search?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=lang_ja
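
The two URLs above differ only in the `lr` parameter. A short Python sketch that decodes the query strings makes this visible (the helper name `query_params` is made up for this example):

```python
from urllib.parse import parse_qs, urlsplit

international = (
    "http://www.google.co.jp/search?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3"
    "&btnG=Google+%E6%A4%9C%E7%B4%A2&lr="
)
japanese_only = international + "lang_ja"


def query_params(url: str) -> dict[str, list[str]]:
    """Decode a URL's query string, keeping empty values such as 'lr='."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


for label, url in (("international", international), ("japanese only", japanese_only)):
    params = query_params(url)
    print(label, "| q =", params["q"][0], "| lr =", repr(params["lr"][0]))
```

The `q` value is the percent-encoded UTF-8 form of the keyword, so both searches send exactly the same query text; only the language restriction changes.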

Any ideas on this?
0
 
LVL 33

Expert Comment

by:humeniuk
ID: 17903590
Hmmm - it sounds like you're on the right track.  Google recommends looking at your page with a text browser such as Lynx to get an idea on what their spiders see (you can also use a simulator like this - www.delorie.com/web/lynxview.html).  That is how they see pages - not necessarily the same as how IE or Firefox will render them.  It tends to uncover the kinds of problems that you have uncovered here.
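
If a text browser is not at hand, a rough approximation of a text-only view can be sketched in Python with the standard library's HTML parser. This is an illustration only, not how Lynx or any search engine spider works; the class name, the function name, and the sample page are all made up:

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect the text nodes of a page, ignoring tags, scripts and styles."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def text_view(raw: bytes, charset: str) -> str:
    """Decode raw page bytes with the given charset and return only the text."""
    parser = TextOnly()
    parser.feed(raw.decode(charset, errors="replace"))
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample page, saved as UTF-8.
sample = (
    "<html><head><title>天然石 SenaCos</title><style>h1{color:red}</style></head>"
    "<body><h1>天然石 SenaCos</h1></body></html>"
).encode("utf-8")

print(text_view(sample, "utf-8"))      # decoded with the right charset
print(text_view(sample, "shift_jis"))  # decoded with the wrong charset
```

Comparing the two outputs shows what a client sees when it guesses the wrong charset: the ASCII parts survive and the Japanese text does not.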

I haven't worked with Japanese characters myself, though, so I can't say too much about this specific situation.
0
 
LVL 1

Author Comment

by:ussher
ID: 17903961
Hi humeniuk,

Great idea with the text browser.  I'll spend a bit more time looking for one that can work with CJK (Chinese, Japanese, Korean) displays.  Unfortunately, Lynx apparently doesn't work very well with CJK in UTF-8 mode.

The link to delorie.com above just returned all unreadable characters as well.

I have found that the character I'm chasing, E5 A4 A9, is being rendered as EF BF BD in the 'keyword density tool'.  With a little investigation I found that
EF BF BD
is the UTF-8 encoding of U+FFFD, the Unicode replacement character, which a decoder substitutes for bytes it cannot interpret.  It should not be confused with the BOM sometimes attached to UTF-8 documents, which is EF BB BF.
(http://smontagu.damowmow.com/utf8test.html)
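
The byte sequences mentioned above can be checked in Python. The sketch below prints the UTF-8 bytes of U+FFFD (the Unicode replacement character) and of the BOM, then shows one way U+FFFD can appear: decoding bytes with the wrong charset while substituting for errors. The Shift_JIS sample is an assumption for illustration; it does not establish what the 'keyword density tool' actually did:

```python
import codecs

replacement = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

print(replacement.encode("utf-8").hex(" ").upper())  # UTF-8 bytes of U+FFFD
print(codecs.BOM_UTF8.hex(" ").upper())              # UTF-8 bytes of the BOM

# Bytes that are not valid UTF-8 (here: Japanese text saved as Shift_JIS)
# are replaced with U+FFFD when decoded leniently as UTF-8.
sjis_bytes = "天然石".encode("shift_jis")
decoded = sjis_bytes.decode("utf-8", errors="replace")
print(replacement in decoded)
```

Seeing EF BF BD in output therefore usually means some step could not decode the original bytes and wrote the replacement character in their place.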




Here is another weird thing.

If I go through a proxy (proxify.com), look up Google Japan (google.co.jp), and enter my keyword there, then I DO appear at the #4 spot.
Here is that link with the keyword entered:
https://proxify.com/p/011010A1000110/687474703a2f2f7777772e676f6f676c652e636f2e6a702f736561726368?hl=ja&q=%E5%A4%A9%E7%84%B6%E7%9F%B3&btnG=Google+%E6%A4%9C%E7%B4%A2&lr=

any thoughts welcomed.
0
 
LVL 33

Accepted Solution

by:
humeniuk earned 500 total points
ID: 17908008
It is certainly an odd situation.  I know that Google has worked at dealing with circumstances like this, but I'm not sure quite what to suggest.  There may be some benefit to contacting Google directly or posting in the appropriate Google Group to see what kind of insight you can get there.
0
 
LVL 1

Author Comment

by:ussher
ID: 17935590
An update on this.

I did post to the Google group 'Google Webmaster Help' but got no reply.  However, my site has appeared in google.co.jp, so I'm very happy.  I don't know why, as I have not altered anything.

Thank you humeniuk for your thoughts and ideas.
0
 
LVL 33

Expert Comment

by:humeniuk
ID: 17936091
As noted in my first post, "... any kind of fundamental change can have a temporary impact and it is often just a matter of waiting for things to be reindexed."  I would say there is a good chance that was the case here.
0
 
LVL 1

Author Comment

by:ussher
ID: 17936528
OK smart guy.  Explain the reason why it was visible by proxy... ;)

Thanks for your help humeniuk. Honestly much appreciated.
0
 
LVL 33

Expert Comment

by:humeniuk
ID: 17947374
"OK smart guy.  Explain the reason why it was visible by proxy... ;)"

Different data centers can sometimes give different results during the update period.

Or ...

One of the early developments in Google's move towards personalized search results is to factor in the geographic location of the person sending the search query.  I regularly received different search results than clients in the US and England (I'm in Canada).

By virtue of the fact that it simulates a different IP, a proxy inherently simulates a different geographic location, hence the possibility of different results.

As Marshall McLuhan said, "If you don't like that idea, I have others."   ;)
0
