Solved

Fonts converting to Chinese, seemingly at random

Posted on 2014-12-21
Last Modified: 2015-03-03
Here's a strange one that I don't expect many answers to, but I'm throwing it out there.

I have a client who uses Outlook 2010 over POP3 to their ISP. When they send an email, everything appears normal. But when anyone replies, every instance of apostrophe-m in the original message the client typed is replaced with the Chinese characters 鈥檓

The emails don't leave that way, but they come back that way. The text the other person types in their reply is affected too. It isn't just one sender; it seems to happen with anyone who replies.

So yes, a weird one. Hopefully someone has run across this in their vast experience and has a suggestion.

By the way, if you Google search for:  鈥檓
you will see tons and tons of examples where I'm was replaced with I鈥檓
Question by:FocIS
 
LVL 58

Expert Comment

by:Gary
ID: 40511706
What character encoding do you have set on the email?
 
LVL 2

Author Comment

by:FocIS
ID: 40512104
"Automatically select encoding for outgoing messages" is checked.
The selected encoding is Western European (ISO).

I should mention that every other character in the emails appears perfectly fine.
 
LVL 58

Expert Comment

by:Gary
ID: 40512108
I would set a meta tag to specify UTF-8.
Without any traceability it's hard to know where the characters are getting converted, but this kind of issue is usually caused by a character encoding mismatch.
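
As a rough illustration of what "specifying UTF-8" means for an HTML email, the sketch below builds a message with Python's standard `email` package and declares UTF-8 both in the HTML meta tag and in the MIME Content-Type header. The subject and body text are made up for the example, and Outlook sets these through its own options rather than through code like this.

```python
from email.message import EmailMessage

# Hypothetical HTML body containing a curly apostrophe (U+2019).
html_body = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "</head><body><p>I\u2019m on my way.</p></body></html>"
)

msg = EmailMessage()
msg["Subject"] = "Encoding test"  # hypothetical subject line
# Declare UTF-8 in the MIME header as well as in the HTML meta tag,
# so a receiving client does not have to guess the encoding.
msg.set_content(html_body, subtype="html", charset="utf-8")

print(msg["Content-Type"])
print(msg.get_content_charset())
```

If the header and the actual bytes agree, a replying client has no reason to guess at the encoding of the quoted text.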
 
LVL 16

Expert Comment

by:DansDadUK
ID: 40512850
I agree with Gary that using UTF-8 encoding should avoid the problem, and that the underlying problem is associated with character encoding.

I suspect (but can't prove from the example texts) that the root cause may be that the 'apostrophe' character being used in the original source document:

Is not the standard ASCII apostrophe character; this is decimal code 39, hexadecimal 27.
Is instead (perhaps courtesy of something like Word?) the 'right single quotation mark', sometimes referred to as one of the 'curly quote' characters. This is Unicode code-point U+2019, but in the Windows ANSI code-set (codepage 1252) it is mapped to decimal code 146, hexadecimal 92. In Unicode and ISO 8859-1, that value is reserved for one of the (little used) C1 control-code characters.
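
The code values quoted above can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative.

```python
import unicodedata

ascii_apostrophe = "'"        # the straight apostrophe typed on the keyboard
curly_apostrophe = "\u2019"   # what Word-style "smart quotes" substitute

# Print each character's code point and official Unicode name.
for ch in (ascii_apostrophe, curly_apostrophe):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# In Windows codepage 1252 the curly apostrophe is the single byte 0x92.
cp1252_byte = curly_apostrophe.encode("cp1252")
print(cp1252_byte.hex(), cp1252_byte[0])
```

The two characters look alike on screen but are unrelated code points, which is why only the curly one is at risk when the encoding is guessed wrongly.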

 
LVL 16

Expert Comment

by:DansDadUK
ID: 40512863
Not sure whether or not E-E will display the following correctly, but here goes:

ASCII/Unicode code-point U+0027 is character '
Unicode code-point U+2019 is character ’
 
LVL 16

Expert Comment

by:DansDadUK
ID: 40512874
... and in the Western European (ISO) coded character set (otherwise known as ISO 8859-1), hexadecimal 92 is not a graphic character; as in the Unicode superset, it is reserved for one of the C1 control-code characters.
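
A short Python check of that point, as a sketch only: the same byte 0x92 means different things in ISO 8859-1 and in codepage 1252, and ISO 8859-1 has no way to represent the curly apostrophe at all.

```python
import unicodedata

# Byte 0x92 decoded as ISO 8859-1 lands on U+0092, a C1 control character.
as_latin1 = b"\x92".decode("iso-8859-1")
latin1_category = unicodedata.category(as_latin1)  # "Cc" means control

# The same byte decoded as Windows-1252 is the curly apostrophe U+2019.
as_cp1252 = b"\x92".decode("cp1252")

# ISO 8859-1 has no slot for U+2019, so encoding it fails outright.
try:
    "\u2019".encode("iso-8859-1")
    latin1_can_encode = True
except UnicodeEncodeError:
    latin1_can_encode = False

print(f"U+{ord(as_latin1):04X}", latin1_category)
print(f"U+{ord(as_cp1252):04X}", latin1_can_encode)
```

So a message that contains a curly apostrophe cannot truly go out as Western European (ISO); the client has to fall back to some other encoding for it.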
 
LVL 16

Accepted Solution

by:
DansDadUK earned 500 total points
ID: 40514669
A few more diagnostics:

Saving this web page, then viewing it within a hexadecimal editor shows that the several instances of the characters 鈥檓 are each represented by the hexadecimal code e988a5e6aa93.

Looking at this in more detail:

hexadecimal e988a5 is the UTF-8 representation of the 16-bit Unicode value U+9225, which is the character 鈥
hexadecimal e6aa93 is the UTF-8 representation of the 16-bit Unicode value U+6a93, which is the character 檓
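
The same breakdown can be reproduced without a hex editor. A minimal Python sketch, taking the hex string from the dump described above:

```python
# Bytes found in the saved page for the two Chinese characters.
raw = bytes.fromhex("e988a5e6aa93")

# Decode them as UTF-8 and list the resulting code points.
text = raw.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in text]

print(text, code_points)
```

This only confirms how the page stores the characters today; it does not show which step in the mail path produced them.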

Note that the most-significant byte of the first 16-bit value (U+9225) is 0x92, which (as mentioned earlier) is the code-point associated with the 'right single quotation mark' in the Windows ANSI coded character set (codepage 1252).
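
One reading that would produce exactly these two characters is offered here as an assumption, since the thread never captures the raw message bytes: the curly apostrophe and the following "m" were encoded as UTF-8 and then decoded as GBK (a Simplified Chinese encoding) somewhere on the return path. The Python sketch below reproduces that round trip; the sample text is made up.

```python
original = "I\u2019m"  # "I", curly apostrophe, "m"

# UTF-8 turns the curly apostrophe into three bytes: e2 80 99.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as GBK pairs them up differently:
# (e2 80) becomes one character and (99 6d) becomes another.
misread = utf8_bytes.decode("gbk")

# The damage is reversible as long as no bytes were lost.
recovered = misread.encode("gbk").decode("utf-8")

print(utf8_bytes.hex(" "))
print(misread)
print(recovered == original)
```

Under this reading the trailing "m" disappears into the second Chinese character, which would explain why only apostrophe-m is affected while the rest of the text looks fine. Either way, straight apostrophes are plain ASCII and survive any of these encodings unchanged.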

Your easiest way of checking whether the above has anything to do with your symptoms is to switch off use of smart (curly) quotes in the text editor (probably Word) used by your Outlook user.
See support page for details.
 
LVL 16

Expert Comment

by:DansDadUK
ID: 40641836
... and I've just come across this rather good article, entitled Unicode, PHP, and Character Collisions, written by Ray Paseur, which provides some more background on such character encoding problems.