Link to home
Start Free TrialLog in
Avatar of Razzmataz73
Razzmataz73

asked on

Funky Characters coming back in PDO Query

Hello,
I have 2 server.
1 is a Windows server and houses our ACT! Database.  The ACT program does not let us edit any of the tables but it gives us view access so we can query it from our websites.

The 2nd one is our Linux Web server.

When I query the database (in php from the web server to the windows server) some of the items come back looking odd like:
B#¿¿,I¿+¿¿"H

It seems to only be for the ID fields (CONTACTID, etc....)

I have done some searching on the web and have tried changing the charset to the php.ini to utf8.  And adding the charset to my connection string but nothing seems to work.

I also found this EE article:
https://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html

Which I am going through but I still get the funky characters.

Any ideas?

Here is a link to my phpinfo:  http://visailing.com/admin/phpinfo.php

Here is a link to the live page:
http://www.visailing.com/clientportal/search/test2.php

Here is my code:
$searchstring = "Test";
### QUERY ACT! DATABASE ###
try {
    $hostname = "hostgoeshere";
	$dbname = "dbname";
    $username = "ACTREADER";
    $pw = "password";
    $dbh = new PDO ("dblib:host=$hostname;dbname=$dbname;charset=UTF-8","$username","$pw");
  } catch (PDOException $e) {
    echo "New  Error: " . $e->getMessage() . "\n";
    exit;
  }
 $runsql = $dbh->prepare("SELECT * FROM [dbo].[TBL_CONTACT] WHERE LASTNAME = '$searchstring'");
  $stmt = $runsql;
  $stmt->execute();
  while ($row = $stmt->fetch()) {
	   print_r ($row);
  }

Open in new window

Avatar of Scott Fell
Scott Fell
Flag of United States of America image

I think you answered your own question with Ray's article.
Here's Ray's article on character sets: https://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html  The short summary is that every page and transmission along the path must be using the same character set and declaration.
Totally agree with Scott and Dave: it look like you think reading a tutorial is enough to make the code work!

In the Ray article you say to have read there is a paragraph titled 'Using UTF-8 with PDO': reading your snippet above it's evident you're not using Ray's tecnique to deal with uitf8 and PDO:

you don't use lines 31-33:
$mbie = "utf-8";
mb_internal_encoding($mbie);
mb_http_output($mbie);

nor line 38:
$dsn = "mysql:host=$db_host;dbname=$db_name;charset=utf8mb4";
($dns i used then to create the PDO object:  $pdo = new PDO($dsn, $db_user, $db_word);)

nor line 50:

$pdo->setAttribute( PDO::MYSQL_ATTR_INIT_COMMAND, 'SET NAMES utf8mb4');

Use Ray's suggestions: they'll fix your problem wothout any doubt.

Cheers
Have you at least set the table character set to utf-8?
Your phpinfo() does not indicate any particular problem, although it shows PHP 5.3 which is a back-level release (you should be at PHP 5.4+ or PHP 5.5+).  In my php.ini there is no setting for default_charset, and to understand character encoding in PHP you need to understand something of the history of PHP.  It started out with the assumption that a byte == a character. This is a valid assumption in US ASCII, ISO-8859-1, and Windows 1252, which were the most popular character sets in the 1990's.  But times have changed and so have character set encodings.  You can no longer assume that one byte == one character; characters can be made up from more than one byte in today's most popular encoding, which is UTF-8.

So much for UTF-8 (you have the link to the article, and if you do what it says your scripts will handle UTF-8 data correctly).  My question is, "Why do you think this is UTF-8?"

It looks to me (not 100% sure, but maybe a good guess) like the ACT! data base uses those funny-character fields as relational keys. In other words, they are not expected to be printable character strings; they are used as binary keys to connect and query among the tables.  Suggest you try treating them as binary fields or multi-byte strings but not a printable representations of anything - just keys.

I think this reference might be helpful:
http://cicorp.com/act/sdk/ACT6-SDK-ChapterA.htm

To be sure, your browser is not telling you the "truth" about the character in these keys.  If you want to learn exactly what they contain on a byte-by-byte basis, you can try something like this script.

<?php // hexdump.php
error_reporting(E_ALL);
echo '<pre>';

/*
 * Expand and display a variable in hexadecimal notation
 *
 * @param string $str The variable to expand and display
 * @return none (direct output)
 */
function hexdump($str, $br=PHP_EOL)
{
    if (empty($str)) return FALSE;

    // GET THE HEX BYTE VALUES IN A STRING
    $hex = str_split(implode(NULL, unpack('H*', $str)));

    // ALLOCATE BYTES INTO HI AND LO NIBBLES
    $hi  = NULL;
    $lo  = NULL;
    $mod = 0;
    foreach ($hex as $nib)
    {
        $mod++;
        $mod = $mod % 2;
        if ($mod)
        {
            $hi .= $nib;
        }
        else
        {
            $lo .= $nib;
        }
    }

    // SHOW THE SCALE, THE STRING AND THE HEX
    $num = substr('1...5...10...15...20...25...30...35...40...45...50...55...60...65...70...75...80...85...90...95..100..105..110..115..120..125..130', 0, strlen($str));
    echo $br . $num;
    echo $br . $str;
    echo $br . $hi;
    echo $br . $lo;
    echo $br;
}

hexdump($_GET['q']);

Open in new window

Afterthought...

Upon looking at the output of this script, I see that it is fetching all of the data twice (the least efficient data retrieval possible).  The default settings for fetch_style flags in PDOStatement::fetch() are inappropriate.

You probably want to correct that; choose FETCH_OBJ or FETCH_ASSOC.  It's a common mistake that crops up in old versions of PHP code from a decade ago (like PHP3 and PHP4) and sometimes appears even today when a programmer did not understand what he was doing, but just copied some code found on the internet.  Don't copy code found on the internet!

The correct way to run queries and retrieve data can be found in the examples here:
https://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/A_11177-PHP-MySQL-Deprecated-as-of-PHP-5-5-0.html

If you're new to PHP and want to get a good foundation in the basics, this article can help you find good learning resources, and more importantly, avoid the many bad examples that litter the interwebs.
https://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11769-And-by-the-way-I-am-new-to-PHP.html

Best of luck with your project, ~Ray
Avatar of Razzmataz73
Razzmataz73

ASKER

Thank you all for your answers.
And I understand your frustration with me and these types of questions.
I have been trying to get this particular client to upgrade their PHP version for going on 2 years now and will try again with this issue as another example of why it would be extremely beneficial.

Yes,
Ray you are correct.
Based on my searches (mostly stackoverflow posts of old issues that popped up based on my query) I was guessing that it was utf-8 and I do not know that for certain.

I did not do my due diligence and was at the beginning of researching this issue when Ray's article popped up in my search results and outlined the exact type of behavior I was seeing.  I tried to implement some of the code and it didn't resolve the issue.  And I got lazy and thought hey, I have an EE account why don't I just ask them?

I apologize for that.

I am going to look more deeply into your responses now and try out your suggestions and will update this post once I have more information.

Thank you again for taking the time to review this question.
No apology needed at all!  We're just trying to help you get "up-to-speed" on this stuff.  It's a lot to learn :-)
I'm sorry: effectively, my answer has been a bit rude. This only because I was tired; i didn't sleep well and I knew I would have had a bad day!
Next time I'll be in a huff I'll avoid to post answers here :-)

Good luck with you project!
Marco
No problem Marco,
I didn't take it as rude at all you are 100% right.
I just skimmed the article and applied what I thought I needed but I didn't truly sit down and read the whole thing and try to understand the concept behind it.

You lit a fire under me to understand the concepts behind it and Ray is providing the knowledge base so I appreciate what you both are doing.

I am looking into your suggestions now.
Ray,
I have your hex script running at the top of:
http://www.visailing.com/clientportal/search/test2.php

It is still pulling back those characters.
One the database side the IDs are fomatted like "8b184563-fd5f-4579-b317-00007f642975" (I only have read access to the database, I can't make any changes to it).

Based on what you are saying it looks like it is not a charset issue and instead I need to look into how to do a php mysql query on how to do a select with a binary field.

Does that sound correct/will put me on the right path?
Or am I still not understanding and I am going in the wrong direction?
I am willing and excited to do the work/research on this, but I just want to make sure I am heading in the right direction.
Hex script?  I'm not seeing anything except the printout of the data base record.
I had implimented your script from above allong with the suggestion to use PDO::FETCH_ASSOC so it wasn't printing multiple copies.

But I probably messed that up.

This is what the test code now looks like:

error_reporting(E_ALL);
echo '<pre>';

/*
 * Expand and display a variable in hexadecimal notation
 *
 * @param string $str The variable to expand and display
 * @return none (direct output)
 */
function hexdump($str, $br=PHP_EOL)
{
    if (empty($str)) return FALSE;

    // GET THE HEX BYTE VALUES IN A STRING
    $hex = str_split(implode(NULL, unpack('H*', $str)));

    // ALLOCATE BYTES INTO HI AND LO NIBBLES
    $hi  = NULL;
    $lo  = NULL;
    $mod = 0;
    foreach ($hex as $nib)
    {
        $mod++;
        $mod = $mod % 2;
        if ($mod)
        {
            $hi .= $nib;
        }
        else
        {
            $lo .= $nib;
        }
    }

    // SHOW THE SCALE, THE STRING AND THE HEX
    $num = substr('1...5...10...15...20...25...30...35...40...45...50...55...60...65...70...75...80...85...90...95..100..105..110..115..120..125..130', 0, strlen($str));
    echo $br . $num;
    echo $br . $str;
    echo $br . $hi;
    echo $br . $lo;
    echo $br;
}

hexdump($_GET['q']);

$searchstring = "....";
### QUERY ACT! DATABASE ###
try {
    $hostname = "....";
	$dbname = "....";
    $username = "ACTREADER";
    $pw = "....";
    $dbh = new PDO ("dblib:host=$hostname;dbname=$dbname","$username","$pw");
  } catch (PDOException $e) {
    echo "New  Error: " . $e->getMessage() . "\n";
    exit;
  }
 $runsql = $dbh->prepare("SELECT * FROM [dbo].[TBL_CONTACT] WHERE LASTNAME = '$searchstring'");

  $stmt = $runsql;
  $stmt->execute();
  
/* Exercise PDOStatement::fetch styles */
print("PDO::FETCH_ASSOC: ");
print("Return next row as an array indexed by column name\n");
$result = $stmt->fetch(PDO::FETCH_ASSOC);
print_r($result);
print("\n");

Open in new window



I will keep digging.

Does looking into a hex or binary query vs charset sound like the right path though?
Don't keep digging, you're already in a hole.  Instead, stop what you're doing and take some time to learn the basics of PHP.  Learning by trial and error is like trying to learn how to bake by looking at an apple pie.  The pie won't tell you what you need to know and the PHP code won't tell you either.  The Welling-Thompson book is a good place to start.

I'll take one more crack at this.  Please try this and let's see if we can see the hexdump() output applied to the CONTACTID field.

<?php
error_reporting(E_ALL);
echo '<pre>';

/*
 * Expand and display a variable in hexadecimal notation
 *
 * @param string $str The variable to expand and display
 * @return none (direct output)
 */
function hexdump($str, $br=PHP_EOL)
{
    if (empty($str)) return FALSE;

    // GET THE HEX BYTE VALUES IN A STRING
    $hex = str_split(implode(NULL, unpack('H*', $str)));

    // ALLOCATE BYTES INTO HI AND LO NIBBLES
    $hi  = NULL;
    $lo  = NULL;
    $mod = 0;
    foreach ($hex as $nib)
    {
        $mod++;
        $mod = $mod % 2;
        if ($mod)
        {
            $hi .= $nib;
        }
        else
        {
            $lo .= $nib;
        }
    }

    // SHOW THE SCALE, THE STRING AND THE HEX
    $num = substr('1...5...10...15...20...25...30...35...40...45...50...55...60...65...70...75...80...85...90...95..100..105..110..115..120..125..130', 0, strlen($str));
    echo $br . $num;
    echo $br . $str;
    echo $br . $hi;
    echo $br . $lo;
    echo $br;
}



$searchstring = "....";
### QUERY ACT! DATABASE ###
try {
    $hostname = "....";
    $dbname = "....";
    $username = "ACTREADER";
    $pw = "....";
    $dbh = new PDO ("dblib:host=$hostname;dbname=$dbname","$username","$pw");
} 
catch (PDOException $e) {
    echo "New Error: " . $e->getMessage() . PHP_EOL;
    exit;
}
$runsql = $dbh->prepare("SELECT * FROM [dbo].[TBL_CONTACT] WHERE LASTNAME = '$searchstring'");
$runsql->execute();
  
/* Exercise PDOStatement::fetch styles */
print("PDO::FETCH_OBJECT: ");
print("Return next row as an object with properties by column name" . PHP_EOL);
$result = $stmt->fetch(PDO::FETCH_OBJECT);
print_r($result);


echo PHP_EOL;
hexdump($result->CONTACTID);

Open in new window

Ray, just as a curiosity, why do you keep recommending a book that won't be available for another 5 months?
Publication Date: September 3, 2014
The 4th edition (from 2008) is here: http://www.amazon.com/PHP-MySQL-Web-Development-4th/dp/0672329166/ref=sr_1_1?s=books&ie=UTF8&qid=1396811057&sr=1-1&keywords=PHP+and+MySQL+Web+Development+%284th+Edition%29
Dan, I think they've pushed the pub date back a few times.  Even the older version would be better than nothing!  And when the new one come out, you can get that and give the old version to someone you don't like :-)
Thank you Ray!!!!
The code you posted gave me enough to start researching and with a little tweaking I got it to work.

I am still researching why I couldn't get the above hexdump code to work for me (might be the old version of PHP, not saying that is it, just a thought/starting point idea before I look into it some more) but I got bin2hex to do it.
string bin2hex ( string $str ) as on http://www.php.net/manual/en/function.bin2hex.php.

And the code:
echo "CONTACT ID:  ".bin2hex($result["CONTACTID"]);

Now shows the correct information.

Thank you all for your help and getting me on the right path for this issue.
ASKER CERTIFIED SOLUTION
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial