What is the difference between Western European (ISO) and Unicode (UTF-8) character types?

Posted on 2016-11-18
Medium Priority
Last Modified: 2016-11-23
I see that almost all character types are set to Western European (ISO), both in our on-prem Exchange and in O365 Exchange Online. Is there a reason for this? What would the implications be if we changed it to Unicode (UTF-8)?
LVL 97

Expert Comment

by:John Hurst
ID: 41893485
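It depends on which characters you use. ISO 8859-1 is a single-byte encoding (256 characters), whereas UTF-8 is a variable-length, multi-byte encoding. If you use only characters that UTF-8 also encodes as a single byte (the ASCII range), you should not see a difference.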

Author Comment

by:Health Payment Systems
ID: 41893634
I guess I don't understand what you mean by single-byte and multi-byte, or which characters we use. The reason I'm asking is that we have an application that sends out emails. The wording in the subject line contains the trademark symbol, which gets messed up for some mail domains. It was suggested that changing the character type on our Exchange server to Unicode (UTF-8) might fix that. We just don't know whether that would also break anything.
LVL 97

Assisted Solution

by:John Hurst
John Hurst earned 1000 total points
ID: 41893641
A UTF-8 character can take more than one byte (up to four). So long as you use only characters from the first 128 code points (the ASCII range), which are encoded identically in both character sets, you should not have any issue.
LVL 16

Accepted Solution

DansDadUK earned 1000 total points
ID: 41894167
The ISO 8859-1 "Western European" coded character set does not include the Trademark symbol:

Print of grid showing ISO 8859-1 character set
The printed grid above shows the characters in that character set (using the Courier typeface).
The four-character values shown at the top of each cell are the Unicode code-points; note that these are exactly the same (after excluding the leading "00" characters) as the character codes in that character set; i.e. ISO 8859-1 is an exact subset of Unicode.
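The subset relationship described above can be checked with a short Python sketch. It assumes Python's built-in "iso-8859-1" codec is a faithful model of the character set; the variable names are only for illustration.

```python
# Every byte value 0..255 decodes under ISO 8859-1 to the Unicode
# code point with the same number.
all_match = all(bytes([b]).decode("iso-8859-1") == chr(b) for b in range(256))
print(all_match)

# The Trade Mark Sign (U+2122) lies outside that range, so the codec
# cannot encode it.
try:
    "\u2122".encode("iso-8859-1")
    trademark_encodable = True
except UnicodeEncodeError:
    trademark_encodable = False
print(trademark_encodable)
```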

To display/print the Trade Mark Sign (Unicode code-point U+2122), you'd have to select a different character set; this would be one of:
  • A different 8-bit coded character set (which then means that some of the ISO 8859-1 characters would probably not be available, or at least not where standard systems expect them).
  • Unicode; this reserves a unique code-point for all of the characters in all of the languages currently used in the world (and some older ones as well).
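To make the two options concrete, here is a small Python sketch. Windows-1252 is my own choice of "a different 8-bit set" for illustration; it is not named in the question, and your systems may use something else.

```python
tm = "\u2122"  # Trade Mark Sign

# Option 1: a different 8-bit set. Windows-1252 places the sign at byte 0x99.
cp1252_bytes = tm.encode("cp1252")

# A receiver that assumes ISO 8859-1 reads that same byte as a C1 control
# character (U+0099), not as the Trade Mark Sign.
as_latin1 = cp1252_bytes.decode("iso-8859-1")

# Option 2: Unicode, here serialised as UTF-8 (three bytes for this character).
utf8_bytes = tm.encode("utf-8")

print(cp1252_bytes, repr(as_latin1), utf8_bytes)
```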

There are several different ways of representing Unicode code-points (which can range from U+0000 to U+10FFFF).
To avoid all characters having to be encoded using two (or more) bytes, Unicode is most commonly encoded using the UTF-8 transformation format.
With UTF-8, each of the first 128 characters is encoded using a single byte (which means that the UTF-8 value is the same as the ASCII value), but all other characters require two or more bytes.
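As a rough illustration of the variable length, this Python sketch prints how many bytes UTF-8 uses for a few sample characters. The samples are my own picks, one from each length class.

```python
samples = [
    "A",           # U+0041, ASCII range
    "\u00e9",      # U+00E9, e with acute: one byte in ISO 8859-1
    "\u2122",      # U+2122, Trade Mark Sign
    "\U0001F600",  # U+1F600, outside the 16-bit range
]

utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, n in zip(samples, utf8_lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s) in UTF-8")
```

Note that U+00E9 is a single byte in ISO 8859-1 but two bytes in UTF-8, so only the ASCII range is byte-for-byte identical between the two.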

So, for example:

'Latin Capital Letter A', at U+0041, is encoded in UTF-8 as a single-byte (decimal) 65, or (hexadecimal) A1 value; this is the same as the ASCII character code value.
'Trade Mark Sign', at U+2122, is encoded in UTF-8 as the three bytes (hexadecimal) E2 84 A2; if these bytes are decoded by a system expecting ISO 8859-1, they will probably show as something like 'â¢' (or 'â„¢' on a system that assumes Windows-1252).
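The byte values, and the garbling that results from decoding with the wrong character set, can be reproduced in Python. This is only a sketch; the Windows-1252 line is an extra assumption about what the receiving system might use.

```python
a_hex = "A".encode("utf-8").hex()   # hex of the single UTF-8 byte for 'A'

tm_bytes = "\u2122".encode("utf-8")
tm_hex = tm_bytes.hex()             # hex of the three UTF-8 bytes

# Decoding the UTF-8 bytes with the wrong character set yields three
# unrelated characters instead of one Trade Mark Sign.
misread_latin1 = tm_bytes.decode("iso-8859-1")
misread_cp1252 = tm_bytes.decode("cp1252")

print(a_hex, tm_hex, repr(misread_latin1), repr(misread_cp1252))
```

Under ISO 8859-1 the middle byte (0x84) is an invisible control character, which is why only two visible characters tend to appear.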

Whatever coded character set you choose, you have to ensure that each end of a 'transaction' (and every point in between) knows which coded character set is in use.
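For an email subject line, the usual way of telling the other end which character set is in use is an RFC 2047 "encoded-word", which names the charset inside the header itself. The sketch below uses Python's standard email.header module; the subject text is hypothetical, and I don't know how the application in the question actually builds its messages.

```python
from email.header import Header, decode_header, make_header

subject = "Widget\u2122 order update"  # hypothetical subject line

# Encode the subject as an RFC 2047 encoded-word that declares UTF-8.
encoded = Header(subject, "utf-8").encode()
print(encoded)

# A receiver that honours the declared charset recovers the original text.
decoded = str(make_header(decode_header(encoded)))
print(decoded == subject)
```

The encoded header is pure ASCII, so it survives transport through systems that only handle 7-bit header data.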

To avoid the possibility (in the Western world) of having to use multiple 8-bit character sets, it is best to choose UTF-8, since this is becoming the de-facto 'lingua franca'.
Many East Asian languages have to use multi-byte character sets (e.g. Shift-JIS, GBK, etc.), because of the number of 'characters' required, but these characters can also be encoded using Unicode/UTF-8.
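A quick Python sketch of that last point, using the standard "shift_jis" and "gbk" codecs. The sample strings are my own; the same text can be encoded either in the legacy multi-byte set or in UTF-8, with different byte counts.

```python
japanese = "\u65e5\u672c\u8a9e"  # three kanji
chinese = "\u4e2d\u6587"         # two hanzi

sjis_len = len(japanese.encode("shift_jis"))      # two bytes per kanji
gbk_len = len(chinese.encode("gbk"))              # two bytes per hanzi
utf8_len_ja = len(japanese.encode("utf-8"))       # three bytes per kanji
utf8_len_zh = len(chinese.encode("utf-8"))        # three bytes per hanzi

# ASCII letters stay single-byte in Shift-JIS as well.
ascii_in_sjis = "A".encode("shift_jis")

print(sjis_len, gbk_len, utf8_len_ja, utf8_len_zh, ascii_in_sjis)
```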

Author Closing Comment

by:Health Payment Systems
ID: 41899251
Thank you DansDad and John. This helped us decide what needed to be done.
LVL 97

Expert Comment

by:John Hurst
ID: 41899258
You are very welcome and I was happy to help.
LVL 16

Expert Comment

ID: 41899329
I've just noticed a 'typo' in my reply above; I stated:

'Latin Capital Letter A', at U+0041, is encoded in UTF-8 as a single-byte (decimal) 65, or (hexadecimal) A1 value; this is the same as the ASCII character code value.

This should have read:

'Latin Capital Letter A', at U+0041, is encoded in UTF-8 as a single-byte (decimal) 65, or (hexadecimal) 41 value; this is the same as the ASCII character code value.


Question has a verified solution.
