FocIS

fonts converting to chinese seemingly random

Here's a strange one that I don't expect many answers to, but I'm throwing it out there.

I have a client who sends email from Outlook 2010 (POP3 to their ISP). When they send a message, everything appears normal. But when anyone replies, every apostrophe-m in the original message the client typed is replaced with the Chinese characters 鈥檓

The emails don't leave that way, but they come back that way. The text the replier types is affected in the same way. It's not just one sender; it seems to be any sender who replies.

So yes, a weird one. Hopefully someone has run across this in their vast experience and has a suggestion.

By the way, if you do a Google search for:  鈥檓
you will see tons and tons of examples where I'm was replaced with I鈥檓
Gary

What character encoding do you have set on the email?
FocIS

ASKER

"Automatically select encoding for outgoing messages" is checked,
and the selected encoding is Western European (ISO).

I should mention that every other character in the emails appears perfectly fine.
I would set a meta tag to specify UTF-8.
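For what it's worth, here is a minimal Python sketch of what "specifying UTF-8" can look like for an HTML email: the charset is declared both in the HTML meta tag and in the MIME Content-Type header. Outlook handles this through its own encoding settings, so the `build_html_message` helper, the subject line, and the sample text are all hypothetical and only illustrate the idea.

```python
from email.message import EmailMessage
from html import escape


def build_html_message(body_text: str) -> EmailMessage:
    """Build an HTML email that declares UTF-8 in two places (illustrative only)."""
    html = (
        "<html><head>"
        # Declaration 1: the meta tag inside the HTML body itself.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body><p>" + escape(body_text) + "</p></body></html>"
    )
    msg = EmailMessage()
    msg["Subject"] = "Encoding test"  # hypothetical subject
    # Declaration 2: the MIME header, set via the charset argument.
    msg.set_content(html, subtype="html", charset="utf-8")
    return msg


msg = build_html_message("I\u2019m using a curly apostrophe")
print(msg["Content-Type"])
print(msg.get_content_charset())
```

If the two declarations disagree, or a mail client along the way ignores them, the receiving side may still guess the wrong encoding.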
Without any traceability it's hard to know where the characters are getting converted, but the issue you have is usually associated with the character encoding.
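One way to get a little traceability is to work backwards from the garbled text: re-encode it with a suspected "wrong" code page and check whether the resulting bytes are valid UTF-8. The sketch below does that in Python for the sample from the question. The `find_misdecoding` helper and the short list of candidate codecs are assumptions made for illustration; it says nothing about which program in the chain made the mistake.

```python
def find_misdecoding(garbled, wrong_codecs=("gbk", "cp1252", "latin-1")):
    """Return (codec, recovered_text) pairs for which re-encoding the garbled
    text with `codec` produces bytes that decode cleanly as UTF-8."""
    hits = []
    for codec in wrong_codecs:
        try:
            raw = garbled.encode(codec)                 # undo the suspected wrong decode
            hits.append((codec, raw.decode("utf-8")))   # redo the decode as UTF-8
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this codec cannot explain the garbled text
    return hits


for codec, text in find_misdecoding("I鈥檓"):
    # Show the recovered code points so the apostrophe variant is visible.
    print(codec, [f"U+{ord(ch):04X}" for ch in text])
```

For the sample text, only GBK (a Simplified Chinese code page) fits, and the recovered text is an `I`, the character U+2019, and an `m`. That is consistent with UTF-8 bytes being read as GBK somewhere between sender and reply.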
I agree with Gary that using UTF-8 encoding should avoid the problem, and that the underlying problem is associated with character encoding.

I suspect (but can't prove from the example texts) that the root cause may be that the 'apostrophe' character being used in the original source document:

Is not the standard ASCII apostrophe character (decimal code 39, hexadecimal 27).
Is instead (perhaps courtesy of something like Word?) the 'right single quotation mark', sometimes referred to as one of the 'curly quote' characters. This is Unicode code-point U+2019, which the Windows ANSI code-set (Windows-1252) maps to decimal code 146, hexadecimal 92; in Unicode and other code-sets that position is reserved for one of the (little used) C1 control-code characters.
Not sure whether or not E-E will display the following correctly, but here goes:

ASCII/Unicode code-point U+0027 is character '
Unicode code-point U+2019 is character ’
... and in the Western European (ISO) coded character set (otherwise known as ISO 8859-1), hexadecimal 92 is not a graphic character; as in the Unicode super-set, it is reserved for one of the C1 control-code characters.
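The difference between the two apostrophe characters can be checked with a short Python sketch that prints the bytes each one produces in a few encodings. The `describe` helper is a hypothetical name used here for illustration, and Python's `latin-1` and `cp1252` codecs stand in for "Western European (ISO)" and "Windows ANSI" respectively.

```python
import unicodedata


def describe(ch):
    """Map codec name -> hex bytes for `ch`, or None if the codec cannot encode it."""
    rows = {}
    for codec in ("ascii", "latin-1", "cp1252", "utf-8"):
        try:
            rows[codec] = ch.encode(codec).hex()
        except UnicodeEncodeError:
            rows[codec] = None
    return rows


for ch in ("\u0027", "\u2019"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {describe(ch)}")

# The same byte, 0x92, read under the two 8-bit code sets:
as_iso = b"\x92".decode("latin-1")   # a C1 control character
as_ansi = b"\x92".decode("cp1252")   # the right single quotation mark
print(f"latin-1: U+{ord(as_iso):04X} (category {unicodedata.category(as_iso)})")
print(f"cp1252:  U+{ord(as_ansi):04X}")
```

The plain apostrophe is the single byte 27 everywhere, so it survives any of these encodings. The curly one has no ISO 8859-1 representation at all, is byte 92 in Windows-1252, and is the three bytes E2 80 99 in UTF-8, which is why it is the character that breaks when two systems disagree about the encoding.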
ASKER CERTIFIED SOLUTION
DansDadUK
... and I've just come across a rather good article, entitled "Unicode, PHP, and Character Collisions" and written by Ray Paseur, which provides some more background on such character encoding problems.