• Status: Solved

Encoding and utf8_encode

I'm using PHP to read an XML data file. After reading it with fread, I use utf8_encode to encode the data. The problem is that when the data is printed in the browser there are unwanted characters,
eg: Â",  Â, ­­Ã²Ã¥Ã°Ã¨Ã®Ã°ÃÃ, Ãðîçîðöè:, é, etc

What is the correct way of handling this problem, please?
Asked by: ncw
1 Solution
 
steelseth12 Commented:
What is the encoding of the XML?
What is the encoding of the page in which you are outputting the XML?
 
ncw (Author) Commented:
I don't have much understanding of encoding, but TextPad's document properties report the code set of the raw XML data as ANSI. In IE6, under View -> Encoding, Auto-Select is ticked and Western European (Windows) is selected. If I change it to Unicode (UTF-8) then it looks a little better, but the  is replaced with a small outlined square box.

I think I need it to be compatible with the default Western European encoding.
 
ncw (Author) Commented:
The data file has come from Bulgaria and is being read in the UK. Maybe the data should be encoded in Bulgaria at source?
 
steelseth12 Commented:
The XML file should state its encoding in the XML declaration, e.g.
<?xml version="1.0" encoding="utf-8"?>

Also, in your HTML put

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

If the page already has a Content-Type meta tag, change its character set to utf-8.

If the XML is also utf-8, then that should do the trick.

If it's not, then look at <?xml version="1.0" encoding="character_set_here"?> and tell us what it is so we can convert it.
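
As a rough illustration of "look at the declaration", the declared encoding can be pulled out of the first line with a regular expression. This is only a sketch: declared_encoding is a hypothetical helper name, not a PHP built-in, and the declaration is just a hint, because a file can declare one encoding and actually be saved in another.

```php
<?php
// Hypothetical helper (not part of PHP): return the encoding named in the
// XML declaration, or null if the document does not declare one.
function declared_encoding($xml)
{
    // Match e.g. <?xml version="1.0" encoding="utf-8"?> at the very start.
    $pattern = '/^<\?xml\s[^>]*?encoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._\-]*)\1/';
    if (preg_match($pattern, $xml, $m) === 1) {
        return $m[2];
    }
    return null;
}

// Stand-in for the string returned by fread().
$sample = '<?xml version="1.0" encoding="utf-8"?><items/>';
echo declared_encoding($sample), "\n"; // utf-8
```

If the declared name and the bytes actually in the file disagree, the bytes win, so it is worth confirming the real encoding with whoever produced the file.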
 
ncw (Author) Commented:
I provided an XML template with <?xml version="1.0" encoding="UTF-8" ?> in the first line, so I expected the file to be encoded as Unicode, but I believe it is ANSI. If I save it as UTF-8 in TextPad and don't use utf8_encode, then it looks OK. So is either fread or utf8_encode failing to handle the characters?

I will ask the supplier to output with UTF-8 encoding, thanks.

 
steelseth12 Commented:
utf8_encode converts ISO-8859-1 encoded strings to UTF-8. If the data is in any other character set, then you need to use iconv to change the encoding.
http://www.php.net/manual/en/function.iconv.php
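
A minimal sketch of the iconv approach follows. It assumes the supplier's "ANSI" file is really Windows-1251 (a common Cyrillic code page); that is a guess based on the file coming from Bulgaria, so confirm the actual character set with the supplier. convert_to_utf8 is a hypothetical helper name, and the two sample bytes stand in for data returned by fread.

```php
<?php
// Assumption: the source file is Windows-1251. Replace with the real charset.
$sourceCharset = 'Windows-1251';

// Hypothetical helper (not part of PHP): convert $bytes from $from to UTF-8.
function convert_to_utf8($bytes, $from)
{
    $converted = iconv($from, 'UTF-8', $bytes);
    if ($converted === false) {
        // iconv returns false when the input is invalid for the given charset.
        throw new RuntimeException("Could not convert from $from to UTF-8");
    }
    return $converted;
}

// Stand-in for fread() output: the Cyrillic letters "Да" as Windows-1251 bytes.
// utf8_encode would treat these same bytes as ISO-8859-1 and yield "Äà".
$raw = "\xC4\xE0";

$utf8 = convert_to_utf8($raw, $sourceCharset);
echo bin2hex($utf8), "\n"; // d094d0b0
```

The converted string can then be printed on a page served as UTF-8, for example one carrying the Content-Type meta tag suggested earlier in the thread.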

 
ncw (Author) Commented:
Thanks!