Solved

How to read Unicode / Extended characters on a Mac?

Posted on 2013-11-12
Last Modified: 2014-01-07
We have a CSV file that is downloaded from users who entered information via iPhones and iPads all over the country. Recently, someone entered their name with a non-ASCII character: an e with a grave (downward-sloping) accent over it.

When we open it up in Excel, it displays as a capital A with umlauts over it and a squiggle. (Both Windows and Mac versions of Excel have this problem.)

On Windows, when I open up the CSV file in Notepad++, the character displays properly. On Mac, if I open it up in TextEdit, it displays as the square root sign followed by the registered trademark symbol.

The hex values for the character are shown below:
Character Hex Values
How do I get Mac to interpret this correctly?

Here's the goal: if we can SEE what the character is supposed to be, then we can re-type the character in Excel so that it prints properly. Ideally, we'd like to use the characters as they were originally encoded, but we'd settle for a hack where we can view the original in ANY program and then re-type it in Excel.
Question by:DrDamnit
15 Comments
 
LVL 53

Expert Comment

by:strung
ID: 39642432
The problem is likely that your default font does not include Unicode characters. Try changing the font to Lucida Sans Unicode.
 
LVL 53

Expert Comment

by:strung
ID: 39642441
See this page for further information on Unicode and how to view it. http://symbolcodes.tlt.psu.edu/web/unicode.html
 
LVL 32

Author Comment

by:DrDamnit
ID: 39642702
Changing the font just makes the name in question slightly bigger, but still incorrect. The link you provided gives me the "how it works" theory, not the "how to make it work" application I need.

I know how it works, as demonstrated by the fact that I posted the hex values of this two-byte single character. The question is: how do I force the Mac to interpret the two bytes as a single character?

 
LVL 53

Expert Comment

by:strung
ID: 39642832
I am puzzled.

The hex for é is E9 and for è is E8

certainly not c3 or A8

See Hex to Text converter:  http://www.string-functions.com/hex-string.aspx
Text to Hex converter:  http://www.string-functions.com/string-hex.aspx
 
LVL 53

Expert Comment

by:strung
ID: 39642838
C3 = Ã
A8 =  ¨
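
For reference, the byte pair C3 A8 is how UTF-8 encodes è (code point U+00E8, which is where the single-byte value E8 comes from). The following is a minimal Python sketch, assuming the bytes in the file are exactly C3 A8. It decodes the same two bytes under three encodings a viewer might assume and names the resulting characters, which appears to account for each symptom reported in the question.

```python
import unicodedata

# The two bytes from the question, assumed here to be exactly C3 A8.
raw = bytes([0xC3, 0xA8])

def describe(data, encoding):
    """Decode data with the given encoding and name each resulting character."""
    text = data.decode(encoding)
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# utf-8 yields one character; cp1252 (Windows) and mac_roman (classic Mac)
# each yield two unrelated characters from the same bytes.
for encoding in ("utf-8", "cp1252", "mac_roman"):
    print(encoding, describe(raw, encoding))
```

Read as Windows-1252, the bytes come out as a capital A with tilde followed by a diaeresis, which matches the Excel display. Read as Mac OS Roman, they come out as the square root sign followed by the registered sign, which matches TextEdit.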
 
LVL 53

Expert Comment

by:strung
ID: 39642847
Can you post a portion of the csv file as an attachment?
 
LVL 53

Expert Comment

by:strung
ID: 39642909
This long-winded page seems to suggest the problem may be the encoding that was used to send the csv document:

http://www.cs.tut.fi/~jkorpela/chars.html#encinfo

It is also possible (per the same page) that the mime settings on your web server may have something to do with the problem.
 
LVL 53

Expert Comment

by:strung
ID: 39642933
I think I am getting closer to the source of the problem. See this page:

http://www.joelonsoftware.com/articles/Unicode.html

which suggests that the e-mails sent by your users should have had a header specifying which Unicode encoding was being used. For the standard UTF-8, it should look like this:

Content-Type: text/plain; charset="UTF-8"

You might check to see if there is such a header in the e-mail. If the header specifies a different Unicode encoding, say, UTF-7, we have to figure out at what point the conversion needs to be made.
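
Checking a header value like the one above can be scripted. This is a small Python sketch using the standard library's `email.message.Message`; the function name `charset_from_content_type` is made up for illustration, and the second header value is only an example of a response that declares no charset.

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

# The header quoted above declares a charset; a bare "text/csv" does not.
print(charset_from_content_type('text/plain; charset="UTF-8"'))
print(charset_from_content_type("text/csv"))
```

A result of `None` means the receiving program has to guess the encoding, which is the situation the symptoms in this thread point to.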
 
LVL 53

Expert Comment

by:strung
ID: 39642950
Aha! If you open TextEdit and go to File > Open, navigate to your csv file, but before you open it, click on the text encoding box, which is, by default, set to Automatic. Click on Automatic and choose a different encoding to see if you can find one that works.

My guess is that the user who sent the file either sent it without a Unicode header (as per my previous message) or the header got stripped out.
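
Trying encodings one at a time, as suggested for TextEdit above, can also be done in a script. The sketch below is only an illustration: the candidate list, the function names, and the sample bytes are assumptions, and the real file would be read with something like `open(path, "rb").read()`.

```python
def try_encodings(data, encodings=("utf-8", "cp1252", "mac_roman", "latin-1")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for encoding in encodings:
        try:
            results[encoding] = data.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results

def non_ascii_lines(text):
    """Return (line number, line) pairs that contain non-ASCII characters."""
    return [(n, line) for n, line in enumerate(text.splitlines(), start=1)
            if not line.isascii()]

# Hypothetical sample standing in for the bytes of the downloaded CSV.
sample = b"id,name\n1,Ir\xc3\xa8ne\n"
for encoding, text in try_encodings(sample).items():
    if text is None:
        print(encoding, "-> not valid in this encoding")
    else:
        # ascii() keeps the output printable on any console.
        print(encoding, "->", [ascii(line) for _, line in non_ascii_lines(text)])
```

Mac OS Roman and Latin-1 assign a character to every byte value, so they never fail and prove little. A strict UTF-8 decode that succeeds on data containing bytes above 7F is much stronger evidence that the file really is UTF-8.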
 
LVL 53

Accepted Solution

by:
strung earned 2000 total points
ID: 39642953
You could also try the freeware TextWrangler instead of TextEdit to see if it is able to select the right encoding automatically.
 
LVL 53

Expert Comment

by:strung
ID: 39642966
<link removed - GaryC123>

Apparently it may misinterpret a comma that is part of a unicode character as a field delimiter.  One suggested solution is to use tabs instead of commas to delimit fields
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645709
Problem character is in B3.
001894.zip
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645714
Some background:

The text is submitted from an iPhone app to the server (data is submitted using an HTTP POST operation, which is saved in MySQL). Later, the CSVs are downloaded directly from the server (we use a SQL query and a PHP script to build the file). The file submitted above is EXACTLY what is downloaded in its entirety.

The CSV type needs no content header, although, I agree it could possibly fix an encoding issue by forcing it to be viewed properly. But, in this case, I don't see how to do that reliably.
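
Since CSV has no field for declaring an encoding, one commonly used way to signal UTF-8 inside the file itself is to prefix it with a byte order mark (the bytes EF BB BF). Excel on Windows generally treats a CSV that starts with this mark as UTF-8; Mac versions of Excel have behaved differently across releases, so this is something to test rather than rely on. The sketch below is Python for illustration only (the server script in this thread is PHP, where the same three bytes could be written first), and both the function name and the sample bytes are hypothetical.

```python
import codecs

def with_utf8_bom(data):
    """Return UTF-8 bytes prefixed with a byte order mark, validating them first."""
    data.decode("utf-8")  # raises UnicodeDecodeError if the input is not valid UTF-8
    if data.startswith(codecs.BOM_UTF8):
        return data  # already marked, leave unchanged
    return codecs.BOM_UTF8 + data

# Hypothetical sample standing in for the exported CSV bytes.
sample = b"id,name\n1,Ir\xc3\xa8ne\n"
marked = with_utf8_bom(sample)
print(marked[:3].hex())
```

The validation step is deliberate: adding the mark to bytes that are not actually UTF-8 would make the display problem worse rather than better.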
 
LVL 53

Expert Comment

by:strung
ID: 39646100
Go to Firefox Preferences > Advanced > Network > Settings

Set proxy settings.

See screenshots attached
Screen-Shot-2013-11-13-at-3.24.1.pdf
Screen-Shot-2013-11-13-at-3.24.2.pdf
 
LVL 32

Author Closing Comment

by:DrDamnit
ID: 39762916
I can't remember how we solved this, but it was with some other program. :-)