Clean XML file of non-UTF-8 characters

I have an XML file which I am loading in PHP using:

$xml = simplexml_load_file('test.xml');

foreach ($xml->event as $event) {
    do_something();
}

The XML file starts with <?xml version="1.0" encoding="UTF-8"?>; however, it contains various non-UTF-8 characters, such as umlauts.

How can I clean up the file and remove the offending characters?

Thanks

Mike
Asked by: hungoveragain
 
Lukasz Chmielewski Commented:
 
hungoveragain (Author) Commented:
Can you please explain how I would insert that into my code?

$xml = iconv("UTF-8", "ISO-8859-1//TRANSLIT", simplexml_load_file('test.xml'));

??

Thanks

Mike
 
hernst42 Commented:
You can try something like:

$sx = simplexml_load_string(iconv('ISO-8859-1', 'UTF-8', iconv('UTF-8', "ISO-8859-1//TRANSLIT", file_get_contents('test.xml'))));
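
If the file really does contain bytes that are not valid UTF-8, one alternative to the round trip above is to check validity first and convert only when needed. This is a minimal sketch, assuming the stray bytes are ISO-8859-1 (a guess, nothing in the thread confirms it). `to_utf8` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return $raw unchanged if it is already valid UTF-8,
// otherwise reinterpret it as $fallback (assumed ISO-8859-1) and convert.
function to_utf8(string $raw, string $fallback = 'ISO-8859-1'): string
{
    // With the /u flag, preg_match() returns 1 only for valid UTF-8 input.
    if (preg_match('//u', $raw) === 1) {
        return $raw;
    }

    $converted = iconv($fallback, 'UTF-8', $raw);
    if ($converted === false) {
        throw new RuntimeException("Could not convert input from $fallback");
    }

    return $converted;
}
```

It could then be used as simplexml_load_string(to_utf8(file_get_contents('test.xml'))).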
 
hungoveragain (Author) Commented:
I can't seem to get that working either.

Here is the file

http://xml.betclick.com/odds_en.xml

I need to get that into

$xml = simplexml_load_file('http://xml.betclick.com/odds_en.xml');

However, there are multiple characters in there, such as é and ä, which make it fall over.

Thanks

Mike
 
hungoveragain (Author) Commented:
Managed to do it.

$in = file("http://xml.betclick.com/odds_en.xml");
$out = fopen("today.xml", "w");

foreach ($in as $line) {
      // Replace HTML named entities such as &eacute; or &uuml; with their base letter (e, u).
      $line = preg_replace('/&(.)(acute|cedil|circ|lig|grave|ring|tilde|uml);/', '$1', $line);
      fputs($out, $line);
}
fclose($out);

Thanks

Mike
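
The pattern above matches HTML named entities such as &eacute; and &uuml;, which XML does not define, and keeps only the base letter, so the accents are lost. If the accents should survive, a rough alternative is to decode those entities into real UTF-8 characters while leaving XML's own five entities alone. This is a sketch only. `decode_named_entities` is a hypothetical helper, not something from the thread.

```php
<?php
// Hypothetical helper: turn HTML named entities into UTF-8 characters,
// keeping the five entities that XML itself predefines.
function decode_named_entities(string $xml): string
{
    $xmlEntities = ['amp', 'lt', 'gt', 'quot', 'apos'];

    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($xmlEntities): string {
            if (in_array($m[1], $xmlEntities, true)) {
                return $m[0]; // already valid XML, leave untouched
            }
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            if ($decoded === $m[0]) {
                return $m[0]; // unknown entity name, leave as found
            }
            // Re-escape in case the entity decoded to markup such as & or <.
            return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
        },
        $xml
    ) ?? $xml; // fall back to the input if the regex engine reports an error
}
```

With that in place, the write loop might call fputs($out, decode_named_entities($line)); instead of the preg_replace line.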