json_encode fails with special characters

I've got the following piece of code:

		$query = "SELECT * 
				  FROM user
				  ORDER BY userFullName ASC";
		$result = mysqli_query($con, $query);
		$rows = array();
		while ($r = mysqli_fetch_assoc($result)) {
			$rows[] = $r;
		}
		header('Content-Type: application/json');
        echo json_encode($rows);

One of the users has a special character in his name (é), and when that record is included, json_encode fails. How can I get around this and still get valid output?
MrChrisDavids Asked:

Chris Stanyon (WebDev) Commented:
Can you explain what you mean by 'fails'? Do you get an error message (you may need to turn on error_reporting)? Do you know what encoding your database is set to? (It should be UTF-8!)

As an aside, you don't need to loop through the result set just to build an array:

header('Content-Type: application/json');
$result = mysqli_query($con, $query);
$rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
echo json_encode($rows);

Marco Gasi (Freelancer) Commented:
First, you must be sure all your scripts are UTF-8 encoded; second, you must be sure all your database columns are encoded as UTF-8; third, you have to put this line in your HTML head section:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"> 

This should fix the problem, if I'm not forgetting something :-)
MrChrisDavids (Author) Commented:
The DB's set to UTF8 encoding.

Error reporting is on but echo json_encode($rows); simply doesn't output anything at all.

If I dump the array, the content's in there.

Adding the meta tag would imply that my output is an HTML page, no? I want just JSON, hence the header line within the script. Or am I going about it wrong?
Marco Gasi (Freelancer) Commented:
The meta tag is only needed if you display the result in an HTML page. But there's one thing I don't understand: you say that the script doesn't output anything, but then you say that if you dump the array (I suppose using var_dump or print_r) you see the content there. How are you trying to output your results?
MrChrisDavids (Author) Commented:
Sorry, I guess I was unclear on that part. If I dump the array with var_dump($rows); I get all the content that is supposed to be JSON encoded. Then when echo json_encode($rows); runs, nothing happens.

The output's supposed to go to the page - but as a JSON page, not HTML.
Chris Stanyon (WebDev) Commented:
Include the encoding in your header:

header('Content-Type: application/json; charset=utf-8');

Marco Gasi (Freelancer) Commented:
If you run PHP 5.4+ you can try this:

echo json_encode($rows, JSON_UNESCAPED_UNICODE);
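
A minimal sketch of what the flag changes, using a hypothetical sample row rather than the real table. JSON_UNESCAPED_UNICODE only affects how valid UTF-8 input is written out; it does not repair input that is not valid UTF-8, so the last call below still returns false.

```php
<?php
// Hypothetical sample row; é is written as its two UTF-8 bytes (C3 A9).
$rows = array(array('userFullName' => "Christian Wir\xC3\xA9n"));

// Default: non-ASCII characters are escaped as \uXXXX sequences.
$escaped = json_encode($rows);
echo $escaped, "\n";   // [{"userFullName":"Christian Wir\u00e9n"}]

// With the flag (PHP 5.4+): the UTF-8 bytes are emitted unchanged.
$unescaped = json_encode($rows, JSON_UNESCAPED_UNICODE);
echo $unescaped, "\n";

// Input that is not valid UTF-8 (é as the single byte E9) fails either way.
$invalid = json_encode(array("Wir\xE9n"), JSON_UNESCAPED_UNICODE);
var_dump($invalid);    // bool(false)
```

Both successful outputs are valid JSON and decode to the same value; the flag is a matter of output size and readability.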

Marco Gasi (Freelancer) Commented:
Otherwise, you can use this function:
function jsonRemoveUnicodeSequences( $struct )
{
	return preg_replace( "/\\\\u([a-f0-9]{4})/e", "iconv('UCS-4LE','UTF-8',pack('V', hexdec('U$1')))", json_encode( $struct ) );
}
echo jsonRemoveUnicodeSequences($rows);

Ray Paseur Commented:
This is a common issue.  JSON requires UTF-8.  An explanation and the solutions are in this article.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html
MrChrisDavids (Author) Commented:
Adding charset to the content-type didn't work.

I'm running PHP 5.6.3, but JSON_UNESCAPED_UNICODE didn't work.

Using the function returned the following:
<b>Deprecated</b>:  preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in <b>D:\sites\incident\www\data\getusers.php</b> on line <b>34</b><br />

I tried reading the article but couldn't find an immediately applicable solution. Maybe I missed it?
Chris Stanyon (WebDev) Commented:
As the error states, the /e switch of preg_replace is deprecated.
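
For reference, a sketch of the same helper rewritten with preg_replace_callback, which is what the warning recommends (the /e modifier was deprecated in PHP 5.5 and removed in PHP 7). The function name is kept from the earlier comment; the sample input is hypothetical. This only post-processes JSON that json_encode already produced, so it does not help when json_encode itself returns false, and it does not handle surrogate pairs for characters outside the Basic Multilingual Plane.

```php
<?php
// Turn \uXXXX escapes in encoded JSON back into raw UTF-8 characters.
function jsonRemoveUnicodeSequences($struct)
{
    return preg_replace_callback(
        '/\\\\u([a-f0-9]{4})/',   // a literal backslash, "u", four hex digits
        function ($m) {
            // Pack the code point as 32-bit little-endian, then convert to UTF-8.
            return iconv('UCS-4LE', 'UTF-8', pack('V', hexdec($m[1])));
        },
        json_encode($struct)
    );
}

// Hypothetical usage with a valid UTF-8 name.
echo jsonRemoveUnicodeSequences(array('userFullName' => "Christian Wir\xC3\xA9n")), "\n";
```

On PHP 5.4 and later, json_encode($struct, JSON_UNESCAPED_UNICODE) gives the same kind of output without the regular expression.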

Try to var_dump the $rows array, then assign the result of json_encode to a variable and var_dump that as well:

var_dump($rows);
$json  = json_encode($rows);
var_dump($json);

See what that turns up
Ray Paseur Commented:
You probably would want to read the entire article.  It's not a one-size-fits-all situation.  You risk losing information if you do the conversion wrong.  Your choices about how to do the conversion will depend on your particular circumstances.  The article describes your choices, what the JSON standard requires, how to find offending characters, what will happen to your data base records, etc.  The programming and processes are fairly-well vetted by the DC PHP community.  Here's the slide deck if you want the TL;DR version.  The article has cut-and-paste code examples.
http://www.slideshare.net/RayPaseur/unicode-php-and-character-set-collisions
MrChrisDavids (Author) Commented:
Dumping the array, json etc.. :)

        var_dump($rows);
        $json = json_encode($rows);
        var_dump($json);

Results in ..

array(3) {
  [0]=>
  array(3) {
    ["userId"]=>
    string(1) "1"
    ["userName"]=>
    string(6) "chrwir"
    ["userFullName"]=>
    string(15) "Christian Wirén"
  }
  [1]=>
  array(3) {
    ["userId"]=>
    string(1) "3"
    ["userName"]=>
    string(6) "marjoh"
    ["userFullName"]=>
    string(16) "Martin Johansson"
  }
  [2]=>
  array(3) {
    ["userId"]=>
    string(1) "2"
    ["userName"]=>
    string(6) "maxjon"
    ["userFullName"]=>
    string(11) "Max Jonborn"
  }
}
bool(false)

Chris Stanyon (WebDev) Commented:
Strange - it works perfectly for me with your data.

Add a call to json_last_error() after your json_encode call, and then take a look at the man page to see if that gives any further clues:

var_dump($rows);
$json  = json_encode($rows);
var_dump($json);
echo json_last_error();

Man Page: http://php.net/manual/en/function.json-last-error.php
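
A small sketch of how the failure can be reported instead of silently echoing nothing. The sample row is hypothetical and deliberately contains é as a single ISO-8859-1 byte; json_last_error_msg() requires PHP 5.5 or later.

```php
<?php
// Hypothetical row: "\xE9" is é as one ISO-8859-1 byte, which is not valid UTF-8.
$rows = array(array('userFullName' => "Christian Wir\xE9n"));

$json = json_encode($rows);

if ($json === false) {
    // json_last_error() returns a numeric code; JSON_ERROR_UTF8 has the value 5.
    // json_last_error_msg() returns the matching human-readable text.
    echo 'JSON error (', json_last_error(), '): ', json_last_error_msg(), "\n";
} else {
    echo $json, "\n";
}
```

Checking the return value with === false matters here, because echoing false prints an empty string, which looks like "no output at all".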
Ray Paseur Commented:
What character set was in use at the time this string was created?

Christian Wirén

I ask because it appears to not be valid UTF-8.  You can see the hexdump output here.  JSON requires valid UTF-8.
http://iconoun.com/demo/temp_mrchrisdavids.php
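
A sketch of a local check along the same lines, for anyone who cannot open the demo link. The two strings are hypothetical reconstructions of the name, and isValidUtf8 is an illustrative helper, not a built-in. Note that the var_dump earlier in the thread reported string(15) for this name, which is the byte count for the single-byte form; the UTF-8 form is 16 bytes.

```php
<?php
// Two hypothetical byte-level spellings of the same name.
$latin1 = "Christian Wir\xE9n";      // é as one ISO-8859-1 byte (E9)
$utf8   = "Christian Wir\xC3\xA9n";  // é as two UTF-8 bytes (C3 A9)

// Illustrative helper: preg_match() with the /u modifier returns false
// when the subject is not valid UTF-8, and 1 when the empty pattern matches.
function isValidUtf8($s)
{
    return preg_match('//u', $s) === 1;
}

foreach (array($latin1, $utf8) as $s) {
    printf(
        "%d bytes  %s  valid UTF-8: %s\n",
        strlen($s),
        bin2hex($s),
        isValidUtf8($s) ? 'yes' : 'no'
    );
}
```

Running the same check on a value fetched from the database shows which form is actually arriving in PHP, regardless of how the table is declared.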
MrChrisDavids (Author) Commented:
The encoding for the table is utf8_general_ci.

I've gone as far as deleting the table and re-creating it (just to make sure it has the correct encoding), but I'm still getting the same error: JSON error (5): Malformed UTF-8 characters (which I found using the suggested json_last_error).
Ray Paseur Commented:
It's not enough to delete and recreate the table.  You must use the correct encoding to create the data and to store it in the table.  You must use the correct encoding to retrieve the data from the table and create the JSON string.  It's an end-to-end process with more than one step -- there is not any such thing as a single solution.

PHP was written without any thought to multi-byte characters.  Obviously the world is simultaneously smaller and larger today than it was 20 years ago.  I wish there was some kind of "magic wand" solution, but there is not - your character encoding must be correct from the time the data is created in your program, through the time it is stored in your database, through the time it is retrieved from the data base, through the time it is used to create your JSON string, through the time the JSON is sent in a response to a request. In some cases it is necessary to change your programming so that your scripts handle the character encoding correctly.

And as if to "pile on" your version of PHP is in play!  Data created in one version may be incompatible with another version.  Sorry it's not simple, but I tried to make everything as clear as possible in the article.
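
A hedged sketch of the retrieval step only, not a complete end-to-end fix. One common cause of this symptom is a connection character set that differs from the table's, which mysqli_set_charset() addresses; the thread does not confirm that this is the cause here. The fallback function below is hypothetical and ASSUMES that any string failing UTF-8 validation is ISO-8859-1. If that assumption is wrong, the conversion will corrupt characters, which is the information-loss risk mentioned above.

```php
<?php
// Retrieval side (shown as a comment because it needs a live connection):
//   mysqli_set_charset($con, 'utf8');   // ask MySQL to send UTF-8 on this connection

// Hypothetical fallback: re-encode strings that are not valid UTF-8,
// ASSUMING they are ISO-8859-1. Verify this against the real data first.
function toUtf8Rows(array $rows)
{
    array_walk_recursive($rows, function (&$value) {
        if (is_string($value) && preg_match('//u', $value) !== 1) {
            $value = iconv('ISO-8859-1', 'UTF-8', $value);
        }
    });
    return $rows;
}

// Hypothetical row with é stored as a single ISO-8859-1 byte.
$rows = array(array('userId' => '1', 'userFullName' => "Christian Wir\xE9n"));

$json = json_encode(toUtf8Rows($rows));
echo ($json === false) ? json_last_error_msg() : $json, "\n";
```

Fixing the connection character set is generally preferable to converting after the fact, since it keeps every later query consistent; the fallback is only a stopgap for data that is already arriving in the wrong encoding.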