openaccount1

Look Up Non-ASCII characters and replace with UTF-8

Hi,

We are looking for a script that can scan all of our web pages (across multiple folders) for non-ASCII characters and replace them with their UTF-8 equivalents. Our pages are supposed to be in ISO-8859-1, but we have found that the encoding is not consistent and that some pages contain other encodings. As a first step, we would like a script that scans for all non-ASCII characters and lists them in a text file.
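For illustration, the scan step could be sketched roughly as follows. The thread does not name a language, so Python is an assumption here, as are the function names (`find_non_ascii`, `scan_tree`, `write_report`) and the `*.html` / `*.htm` file patterns. The sketch works on raw bytes, so it reports every byte above 0x7F without guessing which encoding produced it.

```python
from collections import Counter
from pathlib import Path


def find_non_ascii(data: bytes) -> Counter:
    """Count every byte value above 0x7F in a raw byte string."""
    return Counter(b for b in data if b > 0x7F)


def scan_tree(root: str, patterns=("*.html", "*.htm")) -> dict:
    """Map each matching file under root (recursively) to its non-ASCII byte counts."""
    report = {}
    for pattern in patterns:  # hypothetical patterns; adjust to the real site
        for path in Path(root).rglob(pattern):
            counts = find_non_ascii(path.read_bytes())
            if counts:  # skip files that are pure ASCII
                report[str(path)] = counts
    return report


def write_report(report: dict, out_file: str) -> None:
    """Write one tab-separated line per (file, byte value) pair."""
    with open(out_file, "w", encoding="utf-8") as out:
        for name, counts in sorted(report.items()):
            for byte, n in sorted(counts.items()):
                # The ISO-8859-1 reading is shown for reference only;
                # the byte may really belong to some other encoding.
                shown = bytes([byte]).decode("latin-1")
                out.write(f"{name}\t0x{byte:02X}\t{shown!r}\t{n}\n")
```

Something like `write_report(scan_tree("/path/to/site"), "non_ascii.txt")` would then produce the text file; the path is a placeholder.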

We would then like a modified version of that script that scans all pages and converts every non-ASCII character to UTF-8, because we are moving the site to UTF-8.
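A minimal sketch of the conversion step, again in Python and under stated assumptions: a file that already decodes as valid UTF-8 is left alone, and anything else is assumed to be Windows-1252, with ISO-8859-1 as a last resort. That fallback order is a guess about the "other encodings" mentioned above, not something the thread confirms, and the names `to_utf8` and `convert_file` are hypothetical.

```python
from pathlib import Path


def to_utf8(data: bytes, fallbacks=("cp1252", "latin-1")) -> bytes:
    """Return data re-encoded as UTF-8, leaving valid UTF-8 untouched."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (pure ASCII also lands here)
    except UnicodeDecodeError:
        pass
    for enc in fallbacks:  # assumed legacy encodings, tried in order
        try:
            return data.decode(enc).encode("utf-8")
        except UnicodeDecodeError:
            continue
    raise ValueError("no fallback encoding could decode the data")


def convert_file(path: Path) -> bool:
    """Rewrite one file in place as UTF-8; return True if it changed."""
    original = path.read_bytes()
    converted = to_utf8(original)
    if converted != original:
        path.write_bytes(converted)
        return True
    return False
```

Two caveats with this approach: a file that mixes encodings internally will be treated as a single legacy encoding, and any `charset=iso-8859-1` declaration in the pages or server configuration still has to be updated separately. Working on a backup copy of the site would be prudent.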
v2Media

There are a bunch of charset conversion utilities that are easy to find on the web. A two-minute search turned up two on the first page of results that suit your needs.

What exactly do you need that can't be found with a simple Google search?
ASKER CERTIFIED SOLUTION
dbrunton

This solution is only available to members.
SOLUTION
This solution is only available to members.
openaccount1

ASKER

What I was looking for is a converter that can find all non-ASCII characters, regardless of whether they are in ISO-8859-1 or some other encoding, and change them to UTF-8. Also, is it possible that ISO-8859-1 and UTF-8 are the same? If not, where do they conflict? Some of our characters are not valid UTF-8 and may be written differently in UTF-8.
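On the "are they the same?" question, a short Python check may help show where the two encodings agree and where they differ. This is only an illustration using the single character "é"; the variable names are arbitrary. The two encodings produce identical bytes for plain ASCII, but above 0x7F ISO-8859-1 uses one byte per character while UTF-8 uses two or more.

```python
# Pure ASCII text is byte-for-byte identical in both encodings.
ascii_text = "hello"
ascii_same = ascii_text.encode("latin-1") == ascii_text.encode("utf-8")

# A non-ASCII character is where they diverge.
accented = "é"
as_latin1 = accented.encode("latin-1")  # one byte: 0xE9
as_utf8 = accented.encode("utf-8")      # two bytes: 0xC3 0xA9

# An ISO-8859-1 byte on its own is not valid UTF-8.
try:
    as_latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

# UTF-8 bytes read as ISO-8859-1 do not fail; they show up as two
# wrong characters, which is the familiar "Ã©" garbling on web pages.
misread = as_utf8.decode("latin-1")
```

So the conflict only arises for bytes above 0x7F: those have to be re-encoded, while the ASCII portion of each page can stay exactly as it is.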
solution found