Solved

PHP displaying UTF-8 encoded characters

Posted on 2011-09-21
Last Modified: 2012-05-12
So, this is probably a simple question, but I must be missing something. I successfully save UTF-8 encoded Chinese characters to a MySQL database.

For example, they end up looking like this in the field (this is random text taken from a Google search, so I do not know what it means):
&#27721;&#35821;/&#28450;&#35486;

If I simply display it, it works fine. However, all my form values get htmlspecialchars treatment, and when this is done it ends up changing the & to &amp; and displays the text as above and not as its corresponding Chinese character. There doesn't seem to be an additional step in any of the instructions I can find on dealing with these characters, so I'm curious if I'm missing something simple.

I can of course "fix" it by replacing &amp;# with &# after the htmlspecialchars call, but I'd prefer to just know what I'm doing wrong. Thanks!
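A possible alternative to the search-and-replace workaround, sketched under the assumption that PHP 5.2.3 or later is available: htmlspecialchars() accepts a fourth argument, double_encode, and passing false tells it to leave existing entities alone. The sample string and variable names below are made up for illustration.

```php
<?php
// Hypothetical stored value: numeric character references plus some markup.
$stored = '&#27721;&#35821; <b>& more</b>';

// Default behaviour: every & is escaped, so the references get double-encoded.
$default = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing entities are kept, everything else is still escaped.
$kept = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8', false);

echo $default, "\n";
echo $kept, "\n";
```

With the flag set to false, the browser should render the references as Chinese characters while the <b> tags stay escaped.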
Question by:WhistlingMtn
8 Comments
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573365
I believe the &amp; is actually the '&' itself...

Have you tried just using: &27721;&35821;

?
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573397
Yeah, I just tried that within my Joomla installation. The text editor is filtering the & and changing it to &amp;.

The way you supplied the &#27721 into the content is how you can trick the system...

This is what I got using &27721;

&27721;

This is what I got using &#27721:
¿

The issue here is if you are using a text editor or not...as it's filtering your code and changing it on you...you can try turning it off and see what happens.

Looks like you are doing it right.
 

Author Comment

by:WhistlingMtn
ID: 36573413
Well, &amp; is the encoded version of &

I don't have a choice on what they're ending up as; they're getting encoded by MySQL to UTF-8. The problem would still be the same though:
&#27721; and &amp;#27721; are not the same thing

<input type="text" value="&#27721;" /> Displays the Chinese Character
<input type="text" value="&amp;#27721;" /> Displays the literal "&#27721;" text

I can pick out the &amp;# and convert it back to &#, but having viewed examples online I didn't see anyone else requiring this, they just got their encoded text, htmlspecialchars, and display. Maybe I just misunderstood them.
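The online examples may differ because they start from raw UTF-8 text rather than numeric references. A minimal sketch of that route, assuming PHP 7 or later and a page served as UTF-8: decode the references first, then escape. The helper name escapeForAttribute is hypothetical.

```php
<?php
// Hypothetical helper: turn stored references into raw UTF-8, then escape for an attribute.
function escapeForAttribute(string $stored): string
{
    // Converts numeric references such as &#27721; into the actual UTF-8 characters.
    $raw = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Raw UTF-8 characters pass through unchanged; only & " ' < > are escaped.
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

echo '<input type="text" value="' . escapeForAttribute('&#27721;&#35821; "quoted"') . '" />', "\n";
```

This only displays correctly if the page itself declares UTF-8, for example with a Content-Type header that includes charset=utf-8.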
 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573419
Whoa...even here they are using a text filter...it changed the character to an upside down question mark...here's a screen shot of what it looks like:

 
LVL 13

Expert Comment

by:Andrew Derse
ID: 36573426
Ah, I see what you mean...
 

Author Comment

by:WhistlingMtn
ID: 36573430
I may just replace all &amp; back to &, since it's not a dangerous character in a text field anyway. Just perplexed as to why I'm having to do this when the dozens of threads online make no mention of it.
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 36595241
I am wondering about this part: all my form values get htmlspecialchars treatment -- why?  The usual place one might use htmlspecialchars() is to prevent user-supplied text from containing HTML markup in a message board or guest book.  Thus it would not apply to all form values, but would be used on external text before displaying the text output to the browser.  In any case, there are only five translations performed by the function, so you might try performing four of them yourself in a local function.
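One way to read the "four of them" suggestion, as a sketch only: escape the two quote characters and the two angle brackets, and leave the ampersand alone so numeric references survive. The function name escapeExceptAmpersand is hypothetical, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical local function: four of the five htmlspecialchars() translations,
// deliberately skipping & so references like &#27721; are not double-encoded.
function escapeExceptAmpersand(string $text): string
{
    // strtr() with an array does not re-scan text it has already replaced.
    return strtr($text, [
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#039;',
    ]);
}

echo '<input type="text" value="' . escapeExceptAmpersand('&#27721; <b>"x"</b>') . '" />', "\n";
```

The trade-off is that a user can type any entity and have it rendered as a character. Entities cannot open tags or close attributes, but a literal & typed by the user will no longer be escaped.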
 

Author Closing Comment

by:WhistlingMtn
ID: 36595259
Yea I should have closed the question, this was basically my solution.