Solved

PHP displaying UTF-8 encoded characters

Posted on 2011-09-21
Last Modified: 2012-05-12
So, this is probably a simple question, but I must be missing something. I successfully save UTF-8 encoded Chinese characters to a MySQL database.

For example, they end up looking like this in the field (this is random text taken from a Google search, so I do not know what it means):
&#27721;&#35821;/&#28450;&#35486;


If I simply display it, it works fine. However, all my form values get htmlspecialchars treatment, and when this is done it ends up changing the & to &amp; and displays the text as above and not as its corresponding Chinese character. There doesn't seem to be an additional step in any of the instructions I can find on dealing with these characters, so I'm curious whether I'm missing something simple.

I can of course "fix" it by replacing &amp;# with &# after the htmlspecialchars call, but I'd prefer to just know what I'm doing wrong. Thanks!
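
The workaround described in the question could look roughly like the sketch below. The function name `escape_keep_numeric_entities` is hypothetical and not from the thread. It escapes the value with htmlspecialchars() and then restores only well-formed numeric character references, so a stray "&amp;#" that is not part of a real reference stays escaped.

```php
<?php
// Hypothetical helper (name not from the thread): escape a value for HTML
// output, then undo the escaping only for well-formed numeric references.
function escape_keep_numeric_entities(string $value): string
{
    $escaped = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

    // "&amp;#27721;" or "&amp;#x6C49;" becomes "&#27721;" / "&#x6C49;" again;
    // everything else stays escaped.
    return preg_replace('/&amp;(#\d+|#x[0-9a-fA-F]+);/', '&$1;', $escaped);
}

echo escape_keep_numeric_entities('&#27721; <b>'), "\n"; // &#27721; &lt;b&gt;
```

A pattern match is used here instead of a plain string replace so that only complete references such as &#27721; are restored.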
Question by:WhistlingMtn
8 Comments
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573365
I believe the &amp; is actually the '&' itself...

Have you tried just using: &27721;&35821; ?
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573397
Yeah, I just tried that within my Joomla installation.  The text editor is filtering the & and changing it to &amp;.

The way you supplied the &#27721 into the content is how you can trick the system...

This is what I got using &27721;

&27721;

This is what I got using &#27721:
¿

The issue here is if you are using a text editor or not...as it's filtering your code and changing it on you...you can try turning it off and see what happens.

Looks like you are doing it right.
 

Author Comment

by:WhistlingMtn
ID: 36573413
Well, &amp; is the encoded version of &.

I don't have a choice on what they're ending up as, they're getting encoded by MySQL to UTF-8. The problem would still be the same though;
&#27721; and &amp;#27721; are not the same thing

<input type="text" value="&#27721;" /> Displays the Chinese Character
<input type="text" value="&amp;#27721;" /> Displays the literal "&#27721;" text


I can pick out the &amp;# and convert it back to &#, but having viewed examples online I didn't see anyone else requiring this, they just got their encoded text, htmlspecialchars, and display. Maybe I just misunderstood them.
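
For reference, a minimal sketch of the behaviour being described, and of the fourth argument to htmlspecialchars() (`double_encode`), which tells PHP not to re-encode entities that are already present. The sample string is made up for illustration.

```php
<?php
// Sample value (made up): a numeric character reference as stored in the
// database, plus a bare ampersand and some markup.
$stored = '&#27721; & <b>';

// Default behaviour: every "&" is escaped, including the one that starts
// the "&#27721;" reference.
$default = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing entities are left alone, while the bare
// "&" and the angle brackets are still escaped.
$kept = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8', false);

echo $default, "\n"; // &amp;#27721; &amp; &lt;b&gt;
echo $kept, "\n";    // &#27721; &amp; &lt;b&gt;
```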

 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573419
Whoa...even here they are using a text filter...it changed the character to an upside down question mark...here's a screen shot of what it looks like:

[Image: char]
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573426
Ah, I see what you mean...
 

Author Comment

by:WhistlingMtn
ID: 36573430
I may just replace all &amp; back to &, since it's not a dangerous character in a text field anyway. Just perplexed as to why I'm having to do this when the dozens of threads online make no mention of it.
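
An alternative to converting &amp; back afterwards, sketched under the assumption that the stored values really contain numeric character references (as in the examples in this thread) and that the page is served as UTF-8: decode the references to real characters first, then escape as usual. The helper name `escape_decoded` is hypothetical.

```php
<?php
// Hypothetical helper (name not from the thread): turn stored references
// into real UTF-8 characters first, then escape normally for output.
function escape_decoded(string $stored): string
{
    // "&#27721;" becomes the actual character U+6C49.
    $utf8 = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

    // Real characters contain no "&", so nothing gets double-encoded,
    // and any markup is still escaped.
    return htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');
}

echo escape_decoded('&#27721;&#35821; <b>'), "\n";
```

With this approach any markup in the stored text is still escaped on output, because the escaping runs after the decoding.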
 
LVL 109

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 36595241
I am wondering about this part: all my form values get htmlspecialchars treatment -- why?  The usual place one might use htmlspecialchars() is to prevent user-supplied text from containing HTML markup in a message board or guest book.  Thus it would not apply to all form values, but would be used on external text before displaying the text output to the browser.  In any case, there are only five translations performed by the function, so you might try performing four of them yourself in a local function.
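
A minimal sketch of such a local function, assuming the four translations kept are the ones for <, >, double quote and single quote, and that & is deliberately left untouched. The function name `escape_without_ampersand` is hypothetical.

```php
<?php
// Hypothetical local function (name not from the thread): performs the
// translations of htmlspecialchars() with ENT_QUOTES except the one for "&".
function escape_without_ampersand(string $value): string
{
    // strtr() with an array replaces each key once and does not rescan
    // its own output, so the "&" in "&lt;" is never touched again.
    return strtr($value, [
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#039;',
    ]);
}

echo escape_without_ampersand('<input value="&#27721;">'), "\n";
// &lt;input value=&quot;&#27721;&quot;&gt;
```

One consequence of leaving & alone is that text a user typed literally as an entity, such as &lt;, will be shown as the character it names rather than as typed.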
 

Author Closing Comment

by:WhistlingMtn
ID: 36595259
Yea I should have closed the question, this was basically my solution.
