Solved

How to read Unicode / Extended characters on a Mac?

Posted on 2013-11-12
Last Modified: 2014-01-07
We have a CSV file built from information that users entered via iPhones and iPads all over the country. Recently, someone entered their name with a non-ASCII character: an e with a downward-sloping accent over it.

When we open it up in Excel, it displays as a capital A with umlauts over it and a squiggle. (Both the Windows and Mac versions of Excel have this problem.)

On Windows, when I open up the CSV file in Notepad++, the character displays properly. On Mac, if I open it up in TextEdit, it displays as the square root sign followed by the registered trademark symbol.

The hex values for the character are shown below:
Character Hex Values
How do I get Mac to interpret this correctly?

Here's the goal: if we can SEE what the character is supposed to be, then we can re-type the character in Excel so that it prints properly. Ideally, we'd like to use the characters as they were originally encoded, but we'd settle for a hack where we can view the original in ANY program and then re-type it in Excel.
Question by:DrDamnit
 
LVL 53

Expert Comment

by:strung
ID: 39642432
The problem is likely that your default font does not include Unicode characters. Try changing the font to Lucida Sans Unicode.
0
 
LVL 53

Expert Comment

by:strung
ID: 39642441
See this page for further information on Unicode and how to view it. http://symbolcodes.tlt.psu.edu/web/unicode.html
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 39642702
Changing the font just makes the name in question slightly bigger, but still incorrect. The link you provided gives me the "how it works" theory, not the "how to make it work" application I need.

I know how it works, as demonstrated by the fact that I posted the hex values of this two-byte single character. The question is how to force the Mac to interpret the two bytes as a single character.
0
 
LVL 53

Expert Comment

by:strung
ID: 39642832
I am puzzled.

The hex for é is E9 and for è is E8

certainly not c3 or A8

See Hex to Text converter:  http://www.string-functions.com/hex-string.aspx
Text to Hex converter:  http://www.string-functions.com/string-hex.aspx
0
 
LVL 53

Expert Comment

by:strung
ID: 39642838
C3 = Ã
A8 =  ¨
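
For what it's worth, E8 is the single-byte Latin-1 value (and the Unicode code point) for è, while C3 A8 is how UTF-8 serializes that same code point. A minimal Python sketch, assuming the file really contains the bytes C3 A8 as quoted in this thread, shows how those same two bytes can produce each of the displays reported above depending on the encoding the program assumes:

```python
# The two bytes quoted in this thread (assumed to be exactly what is in the file).
raw = bytes([0xC3, 0xA8])

as_utf8 = raw.decode("utf-8")            # one character: e with a grave accent
as_cp1252 = raw.decode("cp1252")         # Windows code page: two characters
as_mac_roman = raw.decode("mac_roman")   # classic Mac encoding: two characters

print(repr(as_utf8), repr(as_cp1252), repr(as_mac_roman))

# E8 is the same character in Latin-1, which is where the single-byte value comes from.
assert bytes([0xE8]).decode("latin-1") == as_utf8
```

If that assumption holds, the file itself is probably fine and the viewer is simply guessing the wrong encoding.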
0
 
LVL 53

Expert Comment

by:strung
ID: 39642847
Can you post a portion of the csv file as an attachment?
0
 
LVL 53

Expert Comment

by:strung
ID: 39642909
This long-winded page seems to suggest the problem may be the encoding that was used to send the CSV document:

http://www.cs.tut.fi/~jkorpela/chars.html#encinfo

It is also possible (per the same page) that the mime settings on your web server may have something to do with the problem.
0
 
LVL 53

Expert Comment

by:strung
ID: 39642933
I think I am getting closer to the source of the problem. See this page:

http://www.joelonsoftware.com/articles/Unicode.html

which suggests that the e-mails sent by your users should have had a header specifying which Unicode encoding was being used. For the standard UTF-8, it should look like this:

Content-Type: text/plain; charset="UTF-8"

You might check to see if there is such a header in the e-mail. If the header specifies a different Unicode encoding, say, UTF-7, we have to figure out at what point the conversion needs to be made.
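
As a rough illustration of checking for such a header, here is a small Python sketch using the standard library's `email` package. The message text is a made-up sample; only the Content-Type line is taken from the post above.

```python
from email import message_from_string

# Hypothetical message text; only the Content-Type line comes from the post above.
sample = 'Content-Type: text/plain; charset="UTF-8"\n\nbody text\n'

msg = message_from_string(sample)
declared = msg.get_content_charset()  # None when no charset parameter is present

print(declared)
```

A result of `None` would mean the message declares no charset, leaving the receiving program to guess.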
0
 
LVL 53

Expert Comment

by:strung
ID: 39642950
Aha! Open TextEdit, go to File > Open, and navigate to your CSV file, but before you open it, click on the text encoding box, which is set to Automatic by default. Click on Automatic and choose a different encoding to see if you can find one that works.

My guess is that the user who sent the file either sent it without a Unicode header (as per my previous message) or the header got stripped out.
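
The same trial-and-error can be scripted. The following Python sketch decodes the raw bytes with several candidate encodings and reports what each one yields; the helper name `try_encodings`, the candidate list, and the sample row are all illustrative assumptions, not anything taken from the actual file.

```python
def try_encodings(raw, candidates):
    """Decode raw bytes with each candidate; map name -> text, or None on failure."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Hypothetical sample row; in practice read the bytes with open(path, "rb").read().
sample = "Name,City\nH\u00e9l\u00e8ne,Boston\n".encode("utf-8")
for name, text in try_encodings(sample, ["ascii", "utf-8", "cp1252", "mac_roman"]).items():
    print(name, repr(text))
```

Note that single-byte encodings such as cp1252 and mac_roman will often decode without error and still show the wrong characters, so the output needs to be checked by eye.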
0
 
LVL 53

Accepted Solution

by:
strung earned 500 total points
ID: 39642953
You could also try the freeware TextWrangler instead of TextEdit to see if it is able to select the right encoding automatically.
0
 
LVL 53

Expert Comment

by:strung
ID: 39642966
<link removed - GaryC123>

Apparently it may misinterpret a comma that is part of a Unicode character as a field delimiter. One suggested solution is to use tabs instead of commas to delimit fields.
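
A minimal Python sketch of the tab-delimited idea, using the standard `csv` module, is below. The helper name `commas_to_tabs` and the sample data are hypothetical. One caveat: in UTF-8 every byte of a multi-byte character has its high bit set, so none of them can equal the comma byte 2C; the concern is more plausible for other encodings such as UTF-16.

```python
import csv
import io

def commas_to_tabs(csv_text):
    """Re-emit comma-delimited text as tab-delimited text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(row)
    return out.getvalue()

# Hypothetical sample data; the quoted field contains a comma that must survive.
sample = 'Name,City\n"Smith, H\u00e9l\u00e8ne",Boston\n'
print(commas_to_tabs(sample))
```

Parsing with `csv.reader` rather than replacing commas directly keeps commas inside quoted fields intact.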
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645709
Problem character is in B3.
001894.zip
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645714
Some background:

The text is submitted from an iPhone app to the server (data is submitted using an HTTP POST operation, which is saved in MySQL). Later, the CSVs are downloaded directly from the server (we use a SQL query and a PHP script to build the file). The file submitted above is EXACTLY what is downloaded in its entirety.

The CSV type needs no content header, although, I agree it could possibly fix an encoding issue by forcing it to be viewed properly. But, in this case, I don't see how to do that reliably.
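
One commonly suggested way to mark a CSV file as UTF-8 without any external header is to prepend a UTF-8 byte order mark (the bytes EF BB BF), which some spreadsheet programs treat as a hint about the encoding. Whether the Excel versions in question honor it is an assumption to test, not something confirmed in this thread, and the sketch also assumes the export already emits UTF-8. It is written in Python for illustration even though the actual export script is PHP; the helper name `with_utf8_bom` is hypothetical.

```python
import codecs

def with_utf8_bom(utf8_bytes):
    """Prefix UTF-8 bytes with a byte order mark unless one is already present."""
    if utf8_bytes.startswith(codecs.BOM_UTF8):
        return utf8_bytes
    return codecs.BOM_UTF8 + utf8_bytes

# Hypothetical row standing in for the downloaded file's contents.
sample = "Name,City\nH\u00e8l,Boston\n".encode("utf-8")
marked = with_utf8_bom(sample)
print(marked[:3].hex())
```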
0
 
LVL 53

Expert Comment

by:strung
ID: 39646100
Go to Firefox Preferences > Advanced > Network > Settings

Set proxy settings.

See screenshots attached
Screen-Shot-2013-11-13-at-3.24.1.pdf
Screen-Shot-2013-11-13-at-3.24.2.pdf
0
 
LVL 32

Author Closing Comment

by:DrDamnit
ID: 39762916
I can't remember how we solved this, but it was with some other program. :-)
0
