using utf8_encode() on text that is already encoded

hankknight asked
Last Modified: 2012-05-06
What happens if I use utf8_encode() on text that is already in a UTF-8 encoding?  If problems could be caused, how could they be avoided?

Most Valuable Expert 2011
Author of the Year 2014

Commented:
The man page says "Encodes an ISO-8859-1 string to UTF-8", so I expect you would munge the data if you did this: the function would treat each byte of the existing UTF-8 sequences as a separate ISO-8859-1 character.  However, the user-contributed notes contain a couple of examples of how to detect whether a string is UTF-8.  You could run those checks first to know which strings to leave alone.

http://us3.php.net/manual/en/function.utf8-encode.php

HTH, ~Ray
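
One way to act on that advice is to check the bytes before encoding. The sketch below assumes the mbstring extension is loaded and uses mb_check_encoding() instead of the hand-written checks from the manual notes; toUtf8Once() is a hypothetical helper name. Treat the check as a heuristic, since some ISO-8859-1 byte sequences also happen to be valid UTF-8.

```php
<?php
// Hypothetical helper (not from the PHP manual): convert to UTF-8 only when
// the input is not already valid UTF-8. Assumes ext-mbstring is loaded and
// that non-UTF-8 input is ISO-8859-1, which is what utf8_encode() assumes too.
function toUtf8Once(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        // Already valid UTF-8 (plain ASCII also lands here): leave untouched.
        return $text;
    }
    // The same conversion utf8_encode() performs, written with mbstring.
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// Byte escapes keep the demo independent of how this file is saved.
$latin1 = "Hyv\xE4\xE4 p\xE4iv\xE4\xE4";                     // ISO-8859-1 bytes
$utf8   = "Hyv\xC3\xA4\xC3\xA4 p\xC3\xA4iv\xC3\xA4\xC3\xA4"; // UTF-8 bytes

var_dump(toUtf8Once($latin1) === $utf8); // bool(true): converted once
var_dump(toUtf8Once($utf8) === $utf8);   // bool(true): left alone
```

mb_convert_encoding() is used here because utf8_encode() is deprecated as of PHP 8.2; for ISO-8859-1 input the two produce the same bytes.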
Commented:
Hi,

It's possible to double-encode your text this way.

<?php
// Each pass reads the current bytes as ISO-8859-1 and re-encodes them as UTF-8.
$var = "Hyvää päivää";
for ($i = 0; $i < 4; $i++) {
    $var = utf8_encode($var);
    echo $var . "<br>\n";
}
?>

Returns:
Hyvää päivää!
Hyvää päivää!
Hyvää päivää!
Hyvää päivää!

(Hope this looks as it should in EE).
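
Since the rendered text depends on how the page and the source file are encoded, a hex dump shows the effect more reliably. This is a minimal sketch, assuming the mbstring extension; mb_convert_encoding() from ISO-8859-1 stands in for utf8_encode(), and the input is written as explicit UTF-8 bytes.

```php
<?php
// Start from known UTF-8 bytes so the result does not depend on how this
// file is saved: "ä" is C3 A4 in UTF-8.
$text = "\xC3\xA4";
$lengths = [];

for ($i = 0; $i < 3; $i++) {
    // Stand-in for utf8_encode(): every input byte is read as one
    // ISO-8859-1 character and written back out as UTF-8.
    $text = mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
    $lengths[] = strlen($text);
    echo bin2hex($text), "\n";
}
// Prints:
// c383c2a4
// c383c283c382c2a4
// c383c283c382c283c383c282c382c2a4
```

Every byte above 0x7F becomes two bytes on each pass, so one "ä" grows from 2 bytes to 4, 8, and then 16.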


There is no sure-fire way to salvage text whose character encoding has been mangled. The only way to avoid problems is to plan ahead and know the encoding of every string you handle. I'd say the most common mistake is double-encoding user-supplied UTF-8 strings because the database connection was never initialized as UTF-8. In MySQL, for example, you initialize it by issuing the command "SET NAMES UTF8;" after connecting.
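
As a rough illustration of setting the connection encoding from PHP, the sketch below builds a PDO DSN that carries the charset, so a separate SET NAMES statement is not needed. The host and database names are placeholders, buildMysqlDsn() is a hypothetical helper, and the utf8mb4 default is an assumption on my part (MySQL's "utf8" covers only characters of up to three bytes).

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that sets the connection charset.
// The charset DSN parameter is honoured by pdo_mysql from PHP 5.3.6 onward.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// Placeholder connection settings; replace with your own.
$dsn = buildMysqlDsn('localhost', 'example_db');
echo $dsn, "\n"; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// With a real server available, the connection would be opened like this:
// $pdo = new PDO($dsn, $user, $password);
// The mysqli equivalent is $mysqli->set_charset('utf8mb4');
```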

Most Valuable Expert 2011
Author of the Year 2014
Commented:

<?php // RAY_utf8_encode.php

echo "<pre>";

// SOME TEST DATA
$var = "Hyvää päivää";
$arr = [];
$arr[] = $var;

// ENCODE IT TO DEATH: EACH PASS ADDS ANOTHER LAYER OF UTF-8 ENCODING
for ($i = 0; $i < 4; $i++)
{
    $var = utf8_encode($var);
    $arr[] = $var;
}

// PRESENT
var_dump($arr);

// DECODE THREE TIMES: EACH PASS CONVERTS EVERY STRING FROM UTF-8 BACK TO ISO-8859-1
for ($pass = 0; $pass < 3; $pass++)
{
    echo "\n";
    foreach ($arr as $pointer => $thing)
    {
        $thing = utf8_decode($thing);
        var_dump($thing);
        $arr[$pointer] = $thing;
    }
}
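
If you need to repair text that has already been double-encoded, a guarded decode is one option. This is a heuristic sketch, not a guaranteed fix: it assumes the mbstring extension, assumes the damage came from an ISO-8859-1 to UTF-8 conversion applied to text that was already UTF-8, and undoOneLayer() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: try to strip ONE surplus layer of UTF-8 encoding.
function undoOneLayer(string $text): string
{
    // Invalid UTF-8 or pure ASCII: nothing to undo.
    if (!mb_check_encoding($text, 'UTF-8') || !preg_match('/[^\x00-\x7F]/', $text)) {
        return $text;
    }
    // An ISO-8859-1 to UTF-8 conversion can only yield code points up to
    // U+00FF, so anything higher means the text was not damaged this way.
    if (preg_match('/[^\x00-\x{FF}]/u', $text)) {
        return $text;
    }
    $candidate = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    // Accept the result only if the decoded bytes are themselves valid UTF-8.
    return mb_check_encoding($candidate, 'UTF-8') ? $candidate : $text;
}

$good   = "\xC3\xA4";         // "ä" in UTF-8
$double = "\xC3\x83\xC2\xA4"; // "ä" encoded twice

var_dump(undoOneLayer($double) === $good); // bool(true): one layer removed
var_dump(undoOneLayer($good) === $good);   // bool(true): left alone
```

Text that legitimately contains a sequence such as "Ã¤" would be altered by this function, so it is safer to run it only on data you already know was damaged.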
