Encoding and utf8_encode

I'm using PHP to read an XML data file. After reading it with fread, I pass the data through utf8_encode. The problem is that when the data is printed in the browser, there are unwanted characters,
eg: Â",  Â, ­­Ã²Ã¥Ã°Ã¨Ã®Ã°ÃÃ, Ãðîçîðöè:, é, etc

What is the correct way of handling this problem, please?
steelseth12 Commented:
utf8_encode converts ISO-8859-1 encoded strings to UTF-8. If the data is in any other character set, then you need to use iconv to change the encoding.

What's the encoding of the XML?
What's the encoding of the page you are outputting the XML to?
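
To illustrate the difference, here is a minimal sketch. It assumes the file is Windows-1251 (a common Cyrillic code page, but only a guess until the real encoding is known), and the sample bytes are made up for the example. Converting from ISO-8859-1 is what utf8_encode does, so it is written with iconv here to make the assumed source charset visible.

```php
<?php
// Hypothetical sample: the word "Пример" stored as Windows-1251 bytes.
$raw = "\xCF\xF0\xE8\xEC\xE5\xF0";

// Same conversion utf8_encode performs: every byte is read as ISO-8859-1.
// Cyrillic bytes are therefore misread as accented Latin letters.
$wrong = iconv('ISO-8859-1', 'UTF-8', $raw);

// Naming the real source charset (assumed here) gives the intended text.
$right = iconv('WINDOWS-1251', 'UTF-8', $raw);

echo $wrong, "\n"; // Ïðèìåð
echo $right, "\n"; // Пример
```

If the supplier's file turns out to use a different code page, only the first argument to iconv should need to change.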
ncw (Author) Commented:
I don't have much understanding of encoding but Textpad says the raw xml data has a code set of ANSI in the document properties. In IE6 under View -> Encoding I see Auto-Select ticked and Western European (Windows) is selected. If I change it to Unicode (UTF-8) then it looks a little better, but the  is replaced with a small outlined square box.

I think I need it to be compatible with the default Western European encoding.
ncw (Author) Commented:
The data file has come from Bulgaria and is being read in the UK. Maybe the data should be encoded in Bulgaria at source?
The XML file should have the encoding in the document declaration:
<?xml version="1.0" encoding="utf-8"?>

Also, in your HTML, put:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

If you already have a Content-Type meta tag, change the character set to utf-8.

If the XML is also UTF-8, then that should do the trick.

If it's not, then look at <?xml version="1.0" encoding="character_set_here"?> and tell us what it is so we can convert it.
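
As a rough sketch of that idea, the declared encoding can be read from the first line and passed to iconv. The function names below are hypothetical helpers written for this example, not part of PHP, and the approach assumes the declaration is truthful about the file's bytes.

```php
<?php
// Hypothetical helper: return the encoding named in the XML declaration,
// upper-cased, or null if there is no declaration or no encoding attribute.
function declared_xml_encoding(string $xml): ?string
{
    // Allow an optional UTF-8 byte order mark before the declaration.
    $pattern = '/^(?:\xEF\xBB\xBF)?<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']/i';
    if (preg_match($pattern, $xml, $m) === 1) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: convert the document to UTF-8 based on its declaration
// and rewrite the declaration to match the new bytes.
function xml_to_utf8(string $xml): string
{
    // XML without an encoding declaration is treated as UTF-8 here.
    $encoding = declared_xml_encoding($xml) ?? 'UTF-8';
    if ($encoding === 'UTF-8') {
        return $xml;
    }
    $converted = iconv($encoding, 'UTF-8', $xml);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $encoding");
    }
    return preg_replace('/(encoding\s*=\s*["\'])[^"\']+/i', '${1}UTF-8', $converted, 1);
}

// Made-up input: one Windows-1251 byte (0xCF, the letter П) inside an element.
$sample = "<?xml version=\"1.0\" encoding=\"windows-1251\"?><a>\xCF</a>";
echo xml_to_utf8($sample), "\n";
```

The output page then needs to announce the same charset, either with the meta tag above or with a Content-Type header.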
ncw (Author) Commented:
I provided an XML template with <?xml version="1.0" encoding="UTF-8" ?> in the first line, so I expected the file to be encoded as Unicode, but I believe it is ANSI. If I save it as UTF-8 in Textpad and don't use utf8_encode, then it looks OK. So either fread or utf8_encode is failing to handle the characters?

I will ask the supplier to output with utf-8 encoding, thanks.
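
Until the supplier fixes the export, one possible stopgap is to check whether the bytes are already valid UTF-8 and convert only when they are not, since the declaration in this file cannot be trusted. This is only a sketch: the function names are hypothetical, and the Windows-1251 fallback is an assumption about the Bulgarian source, not something confirmed in this thread.

```php
<?php
// Hypothetical helper: true if the string is well-formed UTF-8.
// preg_match with the /u modifier fails on invalid UTF-8 input.
function looks_like_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Hypothetical helper: leave valid UTF-8 alone, otherwise convert from an
// assumed legacy code page (Windows-1251 is a guess for this data).
function ensure_utf8(string $bytes, string $assumedCharset = 'WINDOWS-1251'): string
{
    if (looks_like_utf8($bytes)) {
        return $bytes;
    }
    $converted = iconv($assumedCharset, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $assumedCharset");
    }
    return $converted;
}

// Already UTF-8: returned unchanged, so nothing gets encoded twice.
echo ensure_utf8("Пример"), "\n"; // Пример

// Legacy bytes: converted once.
echo ensure_utf8("\xCF\xF0\xE8\xEC\xE5\xF0"), "\n"; // Пример

// What a second, unnecessary conversion does to text that is already UTF-8.
echo iconv('ISO-8859-1', 'UTF-8', "é"), "\n"; // Ã©
```

The last line shows why the file looked fine once it was saved as UTF-8 and utf8_encode was dropped: fread returns the bytes as they are, and the extra conversion was what mangled them.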
