json_encode fails with special characters

I've got the following piece of code:

		$query = "SELECT * 
				  FROM user
				  ORDER BY userFullName ASC";
		$result = mysqli_query($con, $query);
		$rows = array();
		while ($r = mysqli_fetch_assoc($result)) {
			$rows[] = $r;
		}
		header('Content-Type: application/json');
        echo json_encode($rows);

One of the users has a special character in his name ( é ), and when that's in there, json_encode fails. How can one get around this and get good output anyway?
MrChrisDavids Asked:

Ray Paseur Commented:
It's not enough to delete and recreate the table.  You must use the correct encoding to create the data and to store it in the table.  You must use the correct encoding to retrieve the data from the table and create the JSON string.  It's an end-to-end process with more than one step -- there is not any such thing as a single solution.

PHP was written without any thought to multi-byte characters.  Obviously the world is simultaneously smaller and larger today than it was 20 years ago.  I wish there was some kind of "magic wand" solution, but there is not - your character encoding must be correct from the time the data is created in your program, through the time it is stored in your database, through the time it is retrieved from the database, through the time it is used to create your JSON string, through the time the JSON is sent in a response to a request. In some cases it is necessary to change your programming so that your scripts handle the character encoding correctly.

And as if to "pile on" your version of PHP is in play!  Data created in one version may be incompatible with another version.  Sorry it's not simple, but I tried to make everything as clear as possible in the article.
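
For the retrieval step, one piece that is easy to miss is the character set of the database connection itself, which is separate from the table's collation. The following is only a minimal sketch, assuming an open mysqli connection and the user table from the question. The helper name fetchUsersAsJson is hypothetical, and whether 'utf8' or 'utf8mb4' is the right choice depends on the server and the data.

```php
<?php
// Hypothetical helper: $con is assumed to be an open mysqli connection.
function fetchUsersAsJson($con)
{
    // Ask MySQL to deliver result strings as UTF-8 on this connection.
    // 'utf8' matches the utf8_general_ci table in the thread (assumption).
    if (!mysqli_set_charset($con, 'utf8')) {
        return false;
    }

    $result = mysqli_query($con, "SELECT * FROM user ORDER BY userFullName ASC");
    if ($result === false) {
        return false;
    }

    $rows = array();
    while ($r = mysqli_fetch_assoc($result)) {
        $rows[] = $r;
    }

    // Returns false if any string is still not valid UTF-8.
    return json_encode($rows);
}
```

This only helps if the bytes stored in the table are correct in the first place; data written through a mis-configured connection may still need repair.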
 
Chris Stanyon Commented:
Can you explain what you mean by 'fails'? Do you get an error message? (You may need to turn on error_reporting.) Do you know what encoding your database is set to? (It should be UTF-8.)

As an aside, you don't need to loop through the result set just to build an array:

header('Content-Type: application/json');
$result = mysqli_query($con, $query);
$rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
echo json_encode($rows);

Marco Gasi (Freelancer) Commented:
First, you must be sure all your scripts are utf-8 encoded; second, you must be sure to have all your database columns encoded with utf-8; third, you have to put this line in your html head section:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"> 

This should fix the problem, if I'm not forgetting something :-)
 
MrChrisDavids (Author) Commented:
The DB's set to UTF8 encoding.

Error reporting is on but echo json_encode($rows); simply doesn't output anything at all.

If I dump the array, the content's in there.

Adding the meta tag would imply that my output's an HTML page, no? I want just JSON, thus the header line within the script. Or am I going about it wrong?
 
Marco Gasi (Freelancer) Commented:
The meta tag is only needed if you display the output in an HTML page. But there's one thing I don't understand: you say that the script doesn't output anything, but then you say that if you dump the array (I suppose using var_dump or print_r) you see the content there. How are you trying to output your results?
 
MrChrisDavids (Author) Commented:
Sorry - I guess I was unclear on that part. If I dump out the array with var_dump($rows); I get all the content which is supposed to be JSON encoded. Then when echo json_encode($rows); runs, nothing happens.

The output's supposed to go to the page - but as a JSON page, not HTML.
 
Chris Stanyon Commented:
Include the encoding in your header:

header('Content-Type: application/json; charset=utf-8');

Marco Gasi (Freelancer) Commented:
If you run PHP 5.4+, you can try this:

echo json_encode($rows, JSON_UNESCAPED_UNICODE);
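
It may help to see what this flag does and does not change. The sketch below, with made-up sample strings, suggests that JSON_UNESCAPED_UNICODE only affects how valid UTF-8 is written out; it is not expected to repair input that is not valid UTF-8 to begin with.

```php
<?php
// Sample values are assumptions for illustration only.
$validName   = "Wir\xC3\xA9n"; // é as two UTF-8 bytes
$invalidName = "Wir\xE9n";     // é as a single ISO-8859-1 byte

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($validName);

// With the flag: the UTF-8 bytes are written as-is.
$unescaped = json_encode($validName, JSON_UNESCAPED_UNICODE);

// Invalid UTF-8 still fails, flag or no flag (false on PHP 5.5+).
$failed = json_encode($invalidName, JSON_UNESCAPED_UNICODE);

var_dump($escaped, $unescaped, $failed);
```

Either of the first two forms is valid JSON; the flag is a matter of readability and size, not correctness.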

Marco Gasi (Freelancer) Commented:
Otherwise, you can use this function:
function jsonRemoveUnicodeSequences( $struct )
{
	return preg_replace( "/\\\\u([a-f0-9]{4})/e", "iconv('UCS-4LE','UTF-8',pack('V', hexdec('U$1')))", json_encode( $struct ) );
}
echo jsonRemoveUnicodeSequences($rows);

Ray Paseur Commented:
This is a common issue.  JSON requires UTF-8.  An explanation and the solutions are in this article.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html
 
MrChrisDavids (Author) Commented:
Adding charset to the content-type didn't work.

I'm running PHP 5.6.3, but JSON_UNESCAPED_UNICODE didn't work.

Using the function returned the following:
<b>Deprecated</b>:  preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in <b>D:\sites\incident\www\data\getusers.php</b> on line <b>34</b><br />

Tried reading the article but couldn't find any immediately applicable solution? Maybe I missed it?
 
Chris Stanyon Commented:
As the error states, the /e switch of preg_replace is deprecated.
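
For reference, the same idea can be written with preg_replace_callback, as the deprecation notice suggests. This is only a sketch: it keeps the function name from the earlier post, assumes the mbstring extension is available, and handles plain \uXXXX escapes only (surrogate pairs, escaped control characters, and escaped backslashes are not treated specially). On PHP 5.4+ the JSON_UNESCAPED_UNICODE flag is likely the simpler route.

```php
<?php
// Sketch of the /e-free variant; limited to simple \uXXXX escapes.
function jsonRemoveUnicodeSequences($struct)
{
    $json = json_encode($struct);
    if ($json === false) {
        // Encoding failed, for example because of malformed UTF-8 input.
        return false;
    }

    return preg_replace_callback(
        '/\\\\u([a-f0-9]{4})/',
        function ($match) {
            // Pack the hex digits as one UTF-16BE code unit, then convert to UTF-8.
            $utf16 = pack('n', hexdec($match[1]));
            return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
        },
        $json
    );
}

echo jsonRemoveUnicodeSequences(array('userFullName' => "Christian Wir\xC3\xA9n")), "\n";
```

Note that this does not address the empty output in this thread: if json_encode itself returns false, there is nothing for the replacement to work on.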

Try to var_dump the $rows array, and then assign the result of json_encode to a variable and var_dump that as well:

var_dump($rows);
$json  = json_encode($rows);
var_dump($json);

See what that turns up
 
Ray Paseur Commented:
You probably would want to read the entire article.  It's not a one-size-fits-all situation.  You risk losing information if you do the conversion wrong.  Your choices about how to do the conversion will depend on your particular circumstances.  The article describes your choices, what the JSON standard requires, how to find offending characters, what will happen to your data base records, etc.  The programming and processes are fairly-well vetted by the DC PHP community.  Here's the slide deck if you want the TL;DR version.  The article has cut-and-paste code examples.
http://www.slideshare.net/RayPaseur/unicode-php-and-character-set-collisions
 
MrChrisDavids (Author) Commented:
Dumping the array, json etc.. :)

        var_dump($rows);
        $json = json_encode($rows);
        var_dump($json);

Results in ..

array(3) {
  [0]=>
  array(3) {
    ["userId"]=>
    string(1) "1"
    ["userName"]=>
    string(6) "chrwir"
    ["userFullName"]=>
    string(15) "Christian Wirén"
  }
  [1]=>
  array(3) {
    ["userId"]=>
    string(1) "3"
    ["userName"]=>
    string(6) "marjoh"
    ["userFullName"]=>
    string(16) "Martin Johansson"
  }
  [2]=>
  array(3) {
    ["userId"]=>
    string(1) "2"
    ["userName"]=>
    string(6) "maxjon"
    ["userFullName"]=>
    string(11) "Max Jonborn"
  }
}
bool(false)

Chris Stanyon Commented:
Strange - it works perfectly for me with your data.

Add a call to json_last_error() after your json_encode call, and then take a look at the man page to see if that gives any further clues:

var_dump($rows);
$json  = json_encode($rows);
var_dump($json);
echo json_last_error();

Man Page: http://php.net/manual/en/function.json-last-error.php
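
On PHP 5.5 and later, json_last_error_msg() gives a readable message alongside the numeric code. A small sketch that reproduces the failure without a database, assuming the name was stored as a single ISO-8859-1 byte (the sample row is made up to mirror the dump above):

```php
<?php
// "\xE9" is é as one ISO-8859-1 byte, which is not valid UTF-8 (assumed input).
$rows = array(array('userFullName' => "Christian Wir\xE9n"));

$errorCode = null;
$errorText = null;

$json = json_encode($rows);
if ($json === false) {
    $errorCode = json_last_error();     // numeric code, compare with JSON_ERROR_* constants
    $errorText = json_last_error_msg(); // human-readable text, PHP 5.5+
    echo "JSON error ($errorCode): $errorText\n";
} else {
    echo $json, "\n";
}
```

A code equal to JSON_ERROR_UTF8 points at the input bytes rather than at json_encode or the response headers.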
 
Ray Paseur Commented:
What character set was in use at the time this string was created?

Christian Wirén

I ask because it appears to not be valid UTF-8.  You can see the hexdump output here.  JSON requires valid UTF-8.
http://iconoun.com/demo/temp_mrchrisdavids.php
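
A similar check can be run locally. The var_dump above reports string(15) for a 15-character name, and since var_dump counts bytes, that is consistent with é being stored as a single byte. The sketch below compares the two candidate byte sequences; the helper name describeString is hypothetical and the mbstring extension is assumed.

```php
<?php
// Two candidate encodings of the same name (sample data for illustration).
$latin1 = "Christian Wir\xE9n";     // é as one byte (ISO-8859-1)
$utf8   = "Christian Wir\xC3\xA9n"; // é as two bytes (UTF-8)

// Hypothetical helper: byte length, hex dump, and UTF-8 validity of a string.
function describeString($s)
{
    return array(
        'bytes'      => strlen($s),
        'hex'        => bin2hex($s),
        'valid_utf8' => mb_check_encoding($s, 'UTF-8'),
    );
}

print_r(describeString($latin1));
print_r(describeString($utf8));
```

Running describeString() on the value that actually comes back from the database shows which of the two cases applies.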
 
MrChrisDavids (Author) Commented:
The encoding for the table is utf8_general_ci.

I've gone as far as deleting the table and re-creating it (just to make sure it's the correct encoding) but I'm still getting the same error: JSON error (5): Malformed UTF-8 characters (which I found using the suggested json_last_error).
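
One possible stop-gap for this error is to convert offending strings just before encoding. This is only a sketch under a strong assumption: that every string which is not valid UTF-8 is ISO-8859-1. If that guess is wrong, characters will be converted incorrectly, so it does not replace fixing the encoding at the source. The helper name rowsToUtf8 is hypothetical and the mbstring extension is assumed.

```php
<?php
// Hypothetical helper. Assumption: invalid UTF-8 strings are ISO-8859-1.
function rowsToUtf8(array $rows, $assumedSource = 'ISO-8859-1')
{
    array_walk_recursive($rows, function (&$value) use ($assumedSource) {
        // Leave valid UTF-8 (and non-string values) untouched.
        if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            $value = mb_convert_encoding($value, 'UTF-8', $assumedSource);
        }
    });
    return $rows;
}

// Sample row mirroring the dump in the thread, with é as a single byte.
$rows = array(array('userId' => '1', 'userFullName' => "Christian Wir\xE9n"));

header('Content-Type: application/json; charset=utf-8');
$json = json_encode(rowsToUtf8($rows));
echo $json;
```

Checking validity first means rows that are already correct UTF-8 are not double-encoded.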
Question has a verified solution.