How to read Unicode / Extended characters on a Mac?

We have a CSV file that is downloaded from users who entered information via iPhones and iPads all over the country. Recently, someone entered their name with a non-ASCII character: an e with a grave (downward-sloping) accent over it.

When we open it in Excel, it displays as a capital A with a squiggle over it followed by an umlaut mark. (Both the Windows and Mac versions of Excel have this problem.)

On Windows, when I open the CSV file in Notepad++, the character displays properly. On Mac, if I open it in TextEdit, it displays as the square root sign followed by the registered trademark symbol.

The hex values for the character are shown below:
Character Hex Values
How do I get Mac to interpret this correctly?

Here's the goal: if we can SEE what the character is supposed to be, then we can re-type the character in Excel so that it prints properly. Ideally, we'd like to use the characters as they were originally encoded, but we'd settle for a hack where we can view the original in ANY program and then re-type it in Excel.
DrDamnitAsked:
 
strungCommented:
You could also try the freeware TextWrangler instead of TextEdit to see if it is able to select the right encoding automatically.
 
strungCommented:
The problem is likely that your default font does not include Unicode characters. Try changing the font to Lucida Sans Unicode.
 
strungCommented:
See this page for further information on Unicode and how to view it. http://symbolcodes.tlt.psu.edu/web/unicode.html
 
DrDamnitAuthor Commented:
Changing the font just makes the name in question slightly bigger, but still incorrect. The link you provided gives me the "how it works" theory, not the "how to make it work" application I need.

I know how it works, as demonstrated by the fact that I posted the hex values of this two-byte single character. The question is how to force the Mac to interpret the two bytes as a single character.
 
strungCommented:
I am puzzled.

The hex for é is E9 and for è is E8

certainly not C3 or A8.

See Hex to Text converter:  http://www.string-functions.com/hex-string.aspx
Text to Hex converter:  http://www.string-functions.com/string-hex.aspx
 
strungCommented:
C3 = Ã
A8 =  ¨
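
Those two glyphs are what the bytes look like when each one is read on its own as Windows-1252. As a rough illustration (Python is just a convenient choice here, nothing in the thread uses it), the same two bytes, C3 A8, can be decoded under a few encodings to compare the results. The variable names are made up for this sketch.

```python
# The two bytes from the CSV, as reported in this thread.
raw = bytes([0xC3, 0xA8])

# Decode the same bytes under three different encodings.
as_utf8 = raw.decode("utf-8")            # one character: e with a grave accent
as_cp1252 = raw.decode("cp1252")         # two characters: A-tilde, diaeresis
as_mac_roman = raw.decode("mac_roman")   # two characters: square root, registered sign

for name, text in [("utf-8", as_utf8),
                   ("cp1252", as_cp1252),
                   ("mac_roman", as_mac_roman)]:
    print(name, repr(text), len(text))

# E8 is the same letter only in a single-byte encoding such as Latin-1.
latin1_e_grave = bytes([0xE8]).decode("latin-1")
print(latin1_e_grave == as_utf8)
```

Read this way, E8 (Latin-1) and C3 A8 (UTF-8) are two encodings of the same letter, and the Excel and TextEdit displays match the Windows-1252 and Mac Roman readings of the UTF-8 bytes.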
 
strungCommented:
Can you post a portion of the csv file as an attachment?
 
strungCommented:
This long-winded page seems to suggest the problem may be the encoding that was used to send the CSV document:

http://www.cs.tut.fi/~jkorpela/chars.html#encinfo

It is also possible (per the same page) that the MIME settings on your web server have something to do with the problem.
 
strungCommented:
I think I am getting closer to the source of the problem. See this page:

http://www.joelonsoftware.com/articles/Unicode.html

which suggests that the e-mails sent by your users should have had a header specifying which Unicode encoding was being used. For the standard UTF-8, it should look like this:

Content-Type: text/plain; charset="UTF-8"

You might check to see if there is such a header in the e-mail. If the header specifies a different Unicode encoding, say, UTF-7, we have to figure out at what point the conversion needs to be made.
 
strungCommented:
Aha! Open TextEdit, go to File > Open, and navigate to your CSV file. Before you open it, click on the text encoding box, which is set to Automatic by default, and choose a different encoding to see if you can find one that works.

My guess is that the user who sent the file either sent it without a Unicode header (as per my previous message) or the header got stripped out.
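
The same trial-and-error can be scripted instead of clicked through. The following is only a sketch: `preview_encodings` is a hypothetical helper, the list of candidate encodings is my own guess at likely suspects, and `sample` is made-up data standing in for the real file (which you would read with `open(path, "rb").read()`).

```python
def preview_encodings(data, encodings=("utf-8", "cp1252", "mac_roman", "latin-1")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            # The bytes are not a valid sequence in this encoding.
            results[enc] = None
    return results


# Hypothetical stand-in for the CSV contents; not the poster's actual file.
sample = b"Name\r\nH\xc3\xa9l\xc3\xa8ne\r\n"

previews = preview_encodings(sample)
for enc, text in previews.items():
    print(enc, "->", repr(text))
```

Whichever encoding prints the name the way the user typed it is the one to pick in the TextEdit encoding box. Note that the single-byte encodings never fail to decode, so a result that merely decodes is not proof that the encoding is right; it has to be judged by eye.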
 
strungCommented:
<link removed - GaryC123>

Apparently it may misinterpret a comma that is part of a Unicode character as a field delimiter. One suggested solution is to use tabs instead of commas to delimit fields.
 
DrDamnitAuthor Commented:
Problem character is in B3.
001894.zip
 
DrDamnitAuthor Commented:
Some background:

The text is submitted from an iPhone app to the server (the data is sent with an HTTP POST and saved in MySQL). Later, the CSVs are downloaded directly from the server (we use a SQL query and a PHP script to build the file). The file attached above is EXACTLY what is downloaded, in its entirety.

The CSV type needs no content header, although I agree a header could possibly fix an encoding issue by forcing the file to be viewed properly. In this case, though, I don't see how to do that reliably.
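
One commonly suggested way to mark the encoding inside the file itself is to start it with a UTF-8 byte order mark (the bytes EF BB BF), which Excel is widely reported to use as a hint that a CSV is UTF-8. The sketch below shows the idea in Python rather than the PHP the server actually uses; `build_csv_bytes` and the sample rows are hypothetical, and it assumes the text coming out of the database is already correct UTF-8.

```python
import csv
import io


def build_csv_bytes(rows):
    """Serialize rows to CSV bytes in UTF-8 with a leading byte order mark."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends the BOM bytes EF BB BF before the UTF-8 text.
    return buffer.getvalue().encode("utf-8-sig")


# Hypothetical sample data; not taken from the attached file.
rows = [["id", "name"], ["1", "H\u00e9l\u00e8ne"]]
data = build_csv_bytes(rows)
print(data[:3].hex(), len(data))
```

In a PHP script the equivalent would be writing those three bytes before the first row. Whether that is enough depends on the Excel version and on how the file is opened, so it is worth testing on both Windows and Mac before relying on it.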
 
strungCommented:
Go to Firefox Preferences > Advanced > Network > Settings

Set proxy settings.

See screenshots attached
Screen-Shot-2013-11-13-at-3.24.1.pdf
Screen-Shot-2013-11-13-at-3.24.2.pdf
 
DrDamnitAuthor Commented:
I can't remember how we solved this, but it was with some other program. :-)
Question has a verified solution.
