R7AF

Character encoding problems

I'm working on a website with a nasty problem. Some pages use UTF-8, others ISO-8859-1, and data is entered into the database from both. It now turns out that the encoding of (at least) one of those pages was changed at some point, so the table contains UTF-8 data in some records and ISO-8859-1 data in others. The database is set to ISO-8859-1.

Is it possible to detect, per record, which encoding was used? Depending on the encoding, I could then use a PHP function like utf8_encode.
gheist

What do you mean by pages: 4 kB CPU pages, 16 kB database pages, or webpages?
R7AF (ASKER)

I mean webpages.
There is a possibility of dual interpretation here: from the page headers and from the meta tags.
R7AF (ASKER)

The data has already been entered into the database table, and it is a mix of ISO-8859-1 and UTF-8. When I publish the data on an HTML page (generated by PHP) and set that page to either ISO-8859-1 or UTF-8 (using PHP page headers or HTML meta tags), some of the records are in the wrong encoding for that page. Because I don't know which records those are, I would like to convert the text to the page's encoding when publishing it. So I would like to know whether it's possible to detect the encoding based on the text string itself.
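
For illustration, one common heuristic for this kind of mixed data can be sketched as follows. It is a rough sketch, not a guaranteed detector: it assumes each record is purely UTF-8 or purely ISO-8859-1 (no double-encoded text), and it relies on the fact that non-ASCII ISO-8859-1 text is very rarely also valid UTF-8. The function names `detect_encoding` and `to_utf8` are hypothetical. The sketch is in Python; in PHP the same validity check is usually done with `mb_check_encoding($text, 'UTF-8')`.

```python
def detect_encoding(raw: bytes) -> str:
    """Guess the encoding of one stored record (heuristic, not a proof)."""
    if raw.isascii():
        # Pure ASCII bytes are identical in UTF-8 and ISO-8859-1.
        return "ascii"
    try:
        # Strict decoding fails on any byte sequence that is not valid UTF-8.
        raw.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        # Every byte value is valid ISO-8859-1, so this is the fallback guess.
        return "iso-8859-1"


def to_utf8(raw: bytes) -> bytes:
    """Return the record as UTF-8 bytes, converting only when needed."""
    if detect_encoding(raw) == "iso-8859-1":
        return raw.decode("iso-8859-1").encode("utf-8")
    return raw


if __name__ == "__main__":
    for record in ("café".encode("utf-8"), "café".encode("iso-8859-1"), b"plain"):
        print(record, detect_encoding(record), to_utf8(record))
```

The known blind spot is a short ISO-8859-1 string whose bytes happen to form valid UTF-8 (for example the two characters "Ã©"), which would be reported as UTF-8. For normal text this is rare, but it is worth spot-checking a sample of converted records.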
ASKER CERTIFIED SOLUTION
gheist