[Webinar] Streamline your web hosting managementRegister Today

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 4618
  • Last Modified:

Oracle SQL to convert from UTF-8 to ISO-8859-1 character set

Dear Experts,

Up until recently all our comments text in a table has been encoded in ISO-8859-1 format.

e.g. this+is+a+comment

Due to a change in an application, some comments are now in the database encoded into UTF-8 format.

e.g. Flew%2Bnorth%2Bover%2BGSK%2Bfactory%2Bmid%2Bmorning

I thought this wouldn't be a problem, as our database can cope with UTF-8 encoding and so can my new module in the application (written in Java). However, there is some legacy code that I don't understand, to do with downloading excel files, that can't cope with these UTF-8 comments.

i.e. the user downloads the excel file and sees the comment as  Flew%2Bnorth%2Bover%2BGSK%2Bfactory%2Bmid%2Bmorning - it is not decoded.

Is there a way of changing all the UTF-8 comments in the database back to ISO-8859-1 format ? I found mention of a CONVERT command, but couldn't get it to work.

thanks for your help!
0
COMPSUPP
Asked:
COMPSUPP
  • 2
  • 2
1 Solution
 
slightwv (䄆 Netminder) Commented:
Using single selects/updates or at an entire database level?

There is a SQL function CONVERT that will do it in a select statement:

http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/functions032.htm#SQLRF00620
0
 
COMPSUPPAuthor Commented:
Hi,

Thanks for your comment. I have now realised the problem is probably that some of the comments have been double encoded into UTF8. This has been caused by people going forwards then backwards in the web application.

Therefore, something like this:
this+is+a+comment%2B   (which I assume is normal UTF8 encoding) becomes this: this%2Bis%2Ba%2Bcomment%2B.   (i.e. I think the pluses have been encoded when they shouldn't have been).

I tried looking at the SQL convert function, but it doesn't work for this.

I now think maybe I need some sort of DECODE function, to turn the double encoded comments back into the correct UTF8 format.

Please let me know if you think I am on the right track, and if there is a way to fix this double encoding problem!

thanks
0
 
slightwv (䄆 Netminder) Commented:
I'm afraid my multi-byte character set knowledge is VERY limited.  I can get by with basic issues.

I think this is a URL style encoding issue more than a UTF encoding (then again, maybe not).

There might be a few ways to unencode a URL string.

Here is the first one I found:
http://www.experts-exchange.com/Database/Oracle/PL_SQL/Q_23866171.html

SELECT UTL_URL.UNESCAPE('this%2Bis%2Ba+comment') FROM DUAL;

0
 
COMPSUPPAuthor Commented:
Thanks, I think this will work if I get our SQL guru here to get it to run on a whole table.
0

Featured Post

Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

  • 2
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now