Solved

JSON String Issue

Posted on 2016-11-16
Last Modified: 2016-11-23
I have a JSON string that is picking up random quotes and strange characters, and they are breaking it.

See the example in the image: it's some dot in the name. Is there a PHP or JSON function that keeps these characters from affecting the string, something to cleanse them or escape them with slashes?
Untitled-1.png
Question by:Nathan Riley

Expert Comment

by:Ray Paseur
ID: 41890420
JSON must be UTF-8.  See if these articles can help you figure it out.  Also, please post the JSON string here, including the bad character.  I'll decode it for you, and we can see what can be done about it.
https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
https://www.experts-exchange.com/articles/22519/Understanding-JSON-in-PHP-and-JavaScript-Applications.html

Author Comment

by:Nathan Riley
ID: 41890560
Huh, looks like in the database it is:

MaeL"gorzata

So quotes are messing it up?

Accepted Solution

by:Ray Paseur (earned 500 total points)
ID: 41890592
Please clarify... There is no space or other character between Mae and L, right?

The quote mark might be the issue, but we need to see this in a little more detail.  The embedded quote mark should be escaped, according to the JSON standard.  The problem with getting "more detail" is that copy and paste comes with some assumptions; the article about character encoding explains what is happening.  What you see in the browser is encoded according to the character set of the browser.  What you see in the database is encoded according to the character set of the database.  And any data you create in PHP is encoded according to the character set in effect at the time the data was created.  To make matters worse, some text editors will coerce the data into their own encoding scheme.  If any of these character sets are mismatched (e.g., the database has UTF-8 characters, but the browser is using ISO-8859-1), the outcomes are unpredictable.
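As a minimal sketch of the escaping point, assuming the JSON is built in PHP with json_encode() rather than by hand-concatenating strings: json_encode() escapes an embedded double quote on its own, so no separate "cleansing" step is needed for that character. The $row array here is a hypothetical stand-in for a database row; the name is the value quoted earlier in this thread.

```php
<?php
// Hypothetical row data; the name is the value quoted in this thread.
$row = array('name' => 'MaeL"gorzata');

// json_encode() escapes the embedded double quote as \" for us.
$json = json_encode($row);
echo $json, PHP_EOL;

// Round trip: json_decode() restores the original value unchanged.
$back = json_decode($json, true);
var_dump($back['name'] === $row['name']);
```

If the JSON is assembled by string concatenation instead, the quote lands in the output unescaped, which would produce the kind of breakage described in the question.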

This means that you cannot depend on what-you-see-is-what-you-get if you're looking at a browser display of the data, or if you're looking at the text once you have copied it into a text editor.

TL;DR All of your character encoding schemes must be consistent from beginning to end.  And if you're using JSON anywhere along the way, that means all of your character encoding schemes must be UTF-8.
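One way to check the UTF-8 requirement in code is sketched below. The helper name encode_utf8_json() and both sample strings are assumptions made for illustration, not data from this question. The sketch relies on two documented behaviors: mb_check_encoding() reports whether a string is valid UTF-8, and json_encode() returns false when it is given malformed UTF-8.

```php
<?php
// Hypothetical helper: encode to JSON, but report bad input instead of
// quietly returning false.
function encode_utf8_json($value)
{
    $json = json_encode($value);
    if ($json === false) {
        // json_last_error_msg() describes why encoding failed.
        return 'JSON error: ' . json_last_error_msg();
    }
    return $json;
}

// Assumed sample data: the same name stored in two different encodings.
$good = "Ma\xC5\x82gorzata"; // U+0142 as the two UTF-8 bytes C5 82
$bad  = "Ma\xB3gorzata";     // the same letter as the single ISO-8859-2 byte B3

var_dump(mb_check_encoding($good, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($bad, 'UTF-8'));  // bool(false)

echo encode_utf8_json(array('name' => $good)), PHP_EOL;
echo encode_utf8_json(array('name' => $bad)), PHP_EOL;
```

Checking the return value of json_encode() at the point where the JSON is created tends to locate the mismatched character set faster than inspecting the broken output in a browser.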

If you want to pursue this further, please show us a link that will give us a way to read the information directly into a program, without going through a copy / paste or browser display.  You might be able to dump that row from the database into data with var_export().  Or you might be able to copy it into a flat file, so you can post a link here.  If we can get the information, we can show you how to break it apart into its byte-by-byte representations, and into its character-by-character representations.  Once upon a time, a byte == a character, but the world has changed and this is not true any more.
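A short sketch of the byte-versus-character difference, and of dumping a value without going through copy and paste. The $name value is an assumed example, not the actual database row. bin2hex() shows the raw bytes, and var_export() writes the value out as PHP source that can be saved to a flat file.

```php
<?php
// Assumed sample value: 10 characters, one of which takes 2 bytes in UTF-8.
$name = "Ma\xC5\x82gorzata";

$bytes = strlen($name);             // counts bytes
$chars = mb_strlen($name, 'UTF-8'); // counts characters
$hex   = bin2hex($name);            // every byte as two hex digits

echo "bytes: $bytes, characters: $chars", PHP_EOL;
echo $hex, PHP_EOL;

// Dump the value as PHP source, suitable for posting or saving to a file.
var_export(array('name' => $name));
```

When the two counts differ, the string contains at least one multi-byte character, and the hex output shows exactly which bytes are stored.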

This is the sort of script I would use to examine the character encoding and the hex byte values.
<?php // demo/hexdump_unicode_v.php
/**
 * Expand and display a string variable in hexadecimal notation
 * Note: Output will make more sense with a monospace font!
 * http://php.net/manual/en/function.mb-split.php#99851
 *
 * http://iconoun.com/demo/hexdump_unicode_v.php?q=Data:%E7%81%AB%E8%BD%A6%E7%A5%A8!
 *
 * Useful: http://www.utf8-chartable.de/unicode-utf8-table.pl?start=1536&number=1024&utf8=0x&unicodeinhtml=hex
 * Refer2: https://www.experts-exchange.com/articles/11880/Unicode-and-Character-Collisions.html
 *
 * @param string $str The variable to expand and display
 * @return void (direct browser output)
 */
error_reporting(E_ALL);

// SET UP PHP TO USE UTF-8
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');


class Letter
{
    // DECLARED EXPLICITLY; DYNAMIC PROPERTIES ARE DEPRECATED AS OF PHP 8.2
    public $chr;
    public $hex;

    public function __construct($chr)
    {
        $this->chr = $chr;
        $this->hex = array();
        $bytes     = $this->usplit($chr);
        foreach ($bytes as $byte)
        {
            $this->hex = array_merge($this->hex, $this->gethex($byte));
        }
        return $this;
    }

    public function usplit ($chr)
    {
        $arr = array(); // INITIALIZE SO AN EMPTY STRING RETURNS AN EMPTY ARRAY
        $len = strlen($chr);
        while ($len) {
            $arr[] = substr($chr, 0, 1);
            $chr   = substr($chr, 1, $len);
            $len   = strlen($chr);
        }
        return $arr;
    }

    public function gethex($chr)
    {
        // GET THE HEX NIBBLE VALUES IN AN ARRAY
        $ret = str_split(implode('', unpack('H*', $chr)));
        return $ret;
    }
}


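class Hexdump
{
    // DECLARED EXPLICITLY; DYNAMIC PROPERTIES ARE DEPRECATED AS OF PHP 8.2
    public $str;
    public $arr;
    public $len;
    public $dat = array();
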
    public function __construct($str)
    {
        $this->str = $str;
        $this->arr = $this->mb_str_split($str);
        $this->len = mb_strlen($str);
        foreach ($this->arr as $uchr)
        {
            $this->dat[] = new Letter($uchr);
        }
        return $this;
    }

    public function mb_str_split($ustr)
    {
        return preg_split('/(?<!^)(?!$)/u', $ustr);
    }

    public function render($br = PHP_EOL)
    {
        echo $br . " Pos   Chr \tHex";

        foreach ($this->dat as $poz => $chr)
        {
            echo $br;
            echo str_pad($poz, 4, ' ', STR_PAD_LEFT);
            echo '    ';
            echo $chr->chr;
            echo " \t";
            echo implode('', $chr->hex);
        }
        echo $br;
    }
}


// DEMONSTRATE IT WITH THE REQUEST ARGUMENT
echo '<meta charset="utf-8" />';
echo '<pre>';

$q = !empty($_GET['q']) ? $_GET['q'] : 'Vöila';
var_dump($q);

$y = new Hexdump($q);
$y->render();

