Solved

How to read Unicode / Extended characters on a Mac?

Posted on 2013-11-12
Last Modified: 2014-01-07
We have a CSV file that is downloaded from users who entered information via iPhones and iPads all over the country. Recently, someone entered their name with a non-ASCII character, which is an e with a downwards accent over it.

When we open it up in Excel, it displays as a capital A with umlauts over it and a squiggle. (Both the Windows and Mac versions of Excel have this problem.)

On Windows, when I open up the CSV file in Notepad++, the character displays properly. On Mac, if I open it up in TextEdit, it displays as the square root sign followed by the registered trademark symbol.

The hex values for the character are shown below:
Character Hex Values
How do I get Mac to interpret this correctly?

Here's the goal: if we can SEE what the character is supposed to be, then we can re-type the character in Excel so that it prints properly. Ideally, we'd like to use the characters as they were originally encoded, but we'd settle for a hack where we can view the original in ANY program and then re-type it in Excel.
Question by:DrDamnit
15 Comments
 
LVL 53

Expert Comment

by:strung
ID: 39642432
The problem is likely that your default font does not include Unicode characters. Try changing the font to Lucida Sans Unicode.
 
LVL 53

Expert Comment

by:strung
ID: 39642441
See this page for further information on Unicode and how to view it. http://symbolcodes.tlt.psu.edu/web/unicode.html
 
LVL 32

Author Comment

by:DrDamnit
ID: 39642702
Changing the font just makes the name in question slightly bigger, but still incorrect. The link you provided gives me the "how it works" theory, not the "how to make it work" application I need.

I know how it works, as demonstrated by the fact that I posted the hex values of this two-byte single character. The question is how to force the Mac to interpret the two bytes as a single character.
 
LVL 53

Expert Comment

by:strung
ID: 39642832
I am puzzled.

The hex for é is E9 and for è is E8

certainly not c3 or A8

See Hex to Text converter:  http://www.string-functions.com/hex-string.aspx
Text to Hex converter:  http://www.string-functions.com/string-hex.aspx
 
LVL 53

Expert Comment

by:strung
ID: 39642838
C3 = Ã
A8 =  ¨
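For anyone comparing the values in this thread: E8 is the Unicode code point (and Latin-1 byte) for è, while C3 A8 is how that same character is stored in UTF-8. Below is a minimal Python sketch, assuming the two bytes in the file are exactly C3 A8 as quoted above. It decodes them three ways; the Windows-1252 and Mac Roman results match the garbled pairs described for Excel and TextEdit.

```python
# The two bytes quoted in this thread: C3 A8.
raw = bytes([0xC3, 0xA8])

# Decoded as UTF-8, the two bytes form ONE character: è (U+00E8).
as_utf8 = raw.decode("utf-8")

# Decoded as Windows-1252, each byte becomes its own character: Ã then ¨.
as_cp1252 = raw.decode("cp1252")

# Decoded as Mac Roman, each byte becomes its own character: √ then ®.
as_mac_roman = raw.decode("mac_roman")

print(as_utf8, as_cp1252, as_mac_roman)
print(hex(ord(as_utf8)))  # the code point, which is where E8 comes from
```

So the file is not damaged. It only needs to be opened as UTF-8.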
 
LVL 53

Expert Comment

by:strung
ID: 39642847
Can you post a portion of the CSV file as an attachment?
 
LVL 53

Expert Comment

by:strung
ID: 39642909
This long-winded page seems to suggest the problem may be the encoding that was used to send the CSV document:

http://www.cs.tut.fi/~jkorpela/chars.html#encinfo

It is also possible (per the same page) that the MIME settings on your web server may have something to do with the problem.
 
LVL 53

Expert Comment

by:strung
ID: 39642933
I think I am getting closer to the source of the problem. See this page:

http://www.joelonsoftware.com/articles/Unicode.html

which suggests that the e-mails sent by your users should have had a header specifying which Unicode encoding was being used. For the standard UTF-8, it should look like this:

Content-Type: text/plain; charset="UTF-8"

You might check to see if there is such a header in the e-mail. If the header specifies a different Unicode encoding, say, UTF-7, we have to figure out at what point the conversion needs to be made.
 
LVL 53

Expert Comment

by:strung
ID: 39642950
Aha! Open TextEdit, go to File > Open, and navigate to your CSV file. Before you open it, click on the text encoding box, which is set to Automatic by default, and choose a different encoding to see if you can find one that works.

My guess is that the user who sent the file either sent it without a Unicode header (as per my previous message) or the header got stripped out.
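The same trial-and-error can be done outside TextEdit. This is a rough Python sketch, not a definitive tool: `try_encodings` is a hypothetical helper, the list of candidate encodings is an assumption, and the sample bytes are made up for illustration rather than taken from the attached file.

```python
def try_encodings(data, encodings=("utf-8", "cp1252", "mac_roman")):
    """Decode the same bytes with each candidate encoding.

    Returns a dict mapping encoding name to the decoded text,
    or to None when the bytes are not valid in that encoding.
    """
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical CSV content containing UTF-8 encoded accented letters.
sample = b"name,city\nH\xc3\xa9l\xc3\xa8ne,Paris\n"

for name, text in try_encodings(sample).items():
    print(name, repr(text))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes in. The encoding whose output shows the name correctly is the one to pick in the TextEdit dialog.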
 
LVL 53

Accepted Solution

by:
strung earned 500 total points
ID: 39642953
You could also try the freeware TextWrangler instead of TextEdit to see if it is able to select the right encoding automatically.
 
LVL 53

Expert Comment

by:strung
ID: 39642966
<link removed - GaryC123>

Apparently it may misinterpret a comma that is part of a Unicode character as a field delimiter. One suggested solution is to use tabs instead of commas to delimit fields.
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645709
Problem character is in B3.
001894.zip
 
LVL 32

Author Comment

by:DrDamnit
ID: 39645714
Some background:

The text is submitted from an iPhone app to the server (data is submitted using an HTTP POST operation and saved in MySQL). Later, the CSVs are downloaded directly from the server (we use a SQL query and a PHP script to build the file). The file attached above is EXACTLY what is downloaded, in its entirety.

The CSV type needs no content header, although, I agree it could possibly fix an encoding issue by forcing it to be viewed properly. But, in this case, I don't see how to do that reliably.
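One commonly used way to mark a CSV as UTF-8 without any header is to start the file with a UTF-8 byte order mark (the bytes EF BB BF). The sketch below is in Python rather than the PHP used on the server. `with_utf8_bom` is a hypothetical helper, the sample row is invented, and whether a given Excel version honours the mark is an assumption to verify, not a guarantee.

```python
import codecs


def with_utf8_bom(data: bytes) -> bytes:
    """Return the CSV bytes prefixed with a UTF-8 BOM (EF BB BF).

    Raises UnicodeDecodeError if the input is not valid UTF-8,
    so a file in some other encoding is not mislabelled.
    """
    data.decode("utf-8")  # validation only; the result is discarded
    if data.startswith(codecs.BOM_UTF8):
        return data  # already marked, leave unchanged
    return codecs.BOM_UTF8 + data


# Hypothetical CSV content, encoded the way the server already sends it.
csv_bytes = "id,name\n1,Hélène\n".encode("utf-8")
marked = with_utf8_bom(csv_bytes)
print(marked[:3].hex())  # efbbbf
```

In the download script the equivalent step would be to emit those three bytes before the first row of the file.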
 
LVL 53

Expert Comment

by:strung
ID: 39646100
Go to Firefox Preferences > Advanced > Network > Settings

Set proxy settings.

See screenshots attached
Screen-Shot-2013-11-13-at-3.24.1.pdf
Screen-Shot-2013-11-13-at-3.24.2.pdf
0
 
LVL 32

Author Closing Comment

by:DrDamnit
ID: 39762916
I can't remember how we solved this, but it was with some other program. :-)
Question has a verified solution.