vishyn
asked on
Character set encoding issues, Inserting special characters from Word into the Sybase db via hibernate
I am uploading a XML file which has some special quotes in one of the element like (quotes from MS Word, some other characters cut and paste from Word ---- like this text ABCs purchase of 11, As a result of this payment, the facilitys ownership. )
Please look at the apostrophe after ABC and facility.
create table .ABC"(
"Id" numeric(9) identity not null,
"OrgId" numeric(9) null,
"Notes" text null) Of course it has FK constraints etc.
Error in text is below also attached is the screenshot.
select @@identity]; SQL state [ZZZZZ]; error code [2402]; Error converting characters into server's character set. Some character(s) could not be converted. ; nested exception is com.sybase.jdbc3.jdbc.SybS QLExceptio n: Error converting characters into server's character set. Some character(s) could not be converted. at org.springframework.jdbc.s upport.Abs tractFallb ackSQLExce ptionTrans lator.tran slate
I even tried forcing the hibernate to use the character set conversion via the hibernate config file (by specifying UTF-8 and ISO8859_1) but it did not work. I am using jconn3.jar from sybase
jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14
<property name="dialect">org.hiberna te.dialect .SybaseDia lect</prop erty>
<property name="hibernate.show_sql"> true</prop erty>
<property name="hibernate.format_sql ">true</pr operty>
<property name="current_session_cont ext_class" >jta</prop erty>
<property name="cache.provider_class ">org.hibe rnate.cach e.NoCacheP rovider</p roperty>
<property name="hibernate.query.subs titutions" >true 1, false 0, yes Y, no N</property>
<property name="hibernate.bytecode.p rovider">c glib</prop erty>
<property name="hibernate.generate_s tatistics" >false</pr operty>
<property name="hibernate.jdbc.batch _size">10< /property>
<!-- Fix for Special Character errors in Sybase-->
<property name="hibernate.connection .useUnicod e">true</p roperty>
<property name="hibernate.connection .character Encoding"> UTF-8</pro perty>
<property name="hibernate.connection .charSet"> UTF-8</pro perty>
Sybase online manual says that in Chapter 10, When Sybbase Aaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.ase_15.0.sag1/html/sag1/sag1476.htm
I want to avoid make a change on sybase server side to make it accept unicode characters, as it might effect other applications. Is there a way we can control this through hibernate or only through sybase. Can we control this behaviour via jconn3 sybase JDBC jar.
I am at loss of ideas here to make this insert successfully in to the database. Please let me know of the solution
hibernate-error.JPG
MS-Word-apostrophe-in-the-text.txt
Please look at the apostrophe after ABC and facility.
create table .ABC"(
"Id" numeric(9) identity not null,
"OrgId" numeric(9) null,
"Notes" text null) Of course it has FK constraints etc.
Error in text is below also attached is the screenshot.
select @@identity]; SQL state [ZZZZZ]; error code [2402]; Error converting characters into server's character set. Some character(s) could not be converted. ; nested exception is com.sybase.jdbc3.jdbc.SybS
I even tried forcing the hibernate to use the character set conversion via the hibernate config file (by specifying UTF-8 and ISO8859_1) but it did not work. I am using jconn3.jar from sybase
jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14
<property name="dialect">org.hiberna
<property name="hibernate.show_sql">
<property name="hibernate.format_sql
<property name="current_session_cont
<property name="cache.provider_class
<property name="hibernate.query.subs
<property name="hibernate.bytecode.p
<property name="hibernate.generate_s
<property name="hibernate.jdbc.batch
<!-- Fix for Special Character errors in Sybase-->
<property name="hibernate.connection
<property name="hibernate.connection
<property name="hibernate.connection
Sybase online manual says that in Chapter 10, When Sybbase Aaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.ase_15.0.sag1/html/sag1/sag1476.htm
I want to avoid make a change on sybase server side to make it accept unicode characters, as it might effect other applications. Is there a way we can control this through hibernate or only through sybase. Can we control this behaviour via jconn3 sybase JDBC jar.
I am at loss of ideas here to make this insert successfully in to the database. Please let me know of the solution
hibernate-error.JPG
MS-Word-apostrophe-in-the-text.txt
ASKER
Hi Joe, Please see my server configuration below. Based on the information below. I want to be able to support frequently used quotes and double quotes etc that come from MS Word documents. I am saving the information that comes from these MS word documents into XML which is then saved to the backend. Please let me know of your thoughts based on the server information below and what I am trying to do.
Parameter Name Default Memory Used Config Value Run Value Unit Type
-------------- ------- ----------- ------------ --------- ---- ----
default character set id 4 0 3 3 id static
default language id 0 0 0 0 id dynamic
default sortorder id 50 0 50 50 id static
disable character set conversi 0 0 0 0 switch static
enable unicode conversions 0 0 0 0 switch dynamic
Parameter Name Default Memory Used Config Value Run Value Unit Type
-------------- ------- ----------- ------------ --------- ---- ----
default unicode sortorder binary 0 binary binary name static
enable surrogate processing 1 0 1 1 switch dynamic
enable unicode conversions 0 0 0 0 switch dynamic
enable unicode normalization 1 0 1 1 switch dynamic
Parameter Name Default Memory Used Config Value Run Value Unit Type
-------------- ------- ----------- ------------ --------- ---- ----
default character set id 4 0 3 3 id static
default language id 0 0 0 0 id dynamic
default sortorder id 50 0 50 50 id static
disable character set conversi 0 0 0 0 switch static
enable unicode conversions 0 0 0 0 switch dynamic
Parameter Name Default Memory Used Config Value Run Value Unit Type
-------------- ------- ----------- ------------ --------- ---- ----
default unicode sortorder binary 0 binary binary name static
enable surrogate processing 1 0 1 1 switch dynamic
enable unicode conversions 0 0 0 0 switch dynamic
enable unicode normalization 1 0 1 1 switch dynamic
ASKER
sqlloc and sqllocrecs might not be helpful in my case as this is a J2EE application which is being inserted via web logic server through hibernate as persistence layer.
Using one version of jconn3.jar Connect (TM) for JDBC(TM)/6.0(Build 25570) /P/EBF12436/JDK14/ is working fine and inserting ? as per the sybase manual. When Adaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).
In test and production, the jconn3.jar is upgraded to jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14/. As you see there is a difference in the jar files and this jar is causing the problems.
Hope this helps in understanding the problem
Using one version of jconn3.jar Connect (TM) for JDBC(TM)/6.0(Build 25570) /P/EBF12436/JDK14/ is working fine and inserting ? as per the sybase manual. When Adaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).
In test and production, the jconn3.jar is upgraded to jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14/. As you see there is a difference in the jar files and this jar is causing the problems.
Hope this helps in understanding the problem
sqlloc and sqllocres are server-side tools used to setup character sets within ASE - they're not relevant to your client or middle tier, so don't worry about J2EE and Web Logic.
I don't recall the character set codes off the top of my head, can you also run "sp_helpsort" in your ASE and post the results here?
Basically I'm just wanting to check which character set you have as your server default.
I am surprised that the .jar file is making a difference to you, but that suggests there's something wrong at the jConnect level rather than the ASE. Do you have a more recent jConnect EBF you can test, maybe that reverts to the older behaviour again? Failing that, maybe it's time for a Sybase Tech Support case against the jConnect...
I don't recall the character set codes off the top of my head, can you also run "sp_helpsort" in your ASE and post the results here?
Basically I'm just wanting to check which character set you have as your server default.
I am surprised that the .jar file is making a difference to you, but that suggests there's something wrong at the jConnect level rather than the ASE. Do you have a more recent jConnect EBF you can test, maybe that reverts to the older behaviour again? Failing that, maybe it's time for a Sybase Tech Support case against the jConnect...
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Aha, so it was a jConnect change.
And yes, your solution sounds much cleaner than messing around with switching everything to unicode, UTF8, forcing unicode normalisations, etc!
And yes, your solution sounds much cleaner than messing around with switching everything to unicode, UTF8, forcing unicode normalisations, etc!
To answer this :-
Use Sybase Server wide Setting or Session Level Setting :-
“disable character set conversi” == True”
Or set char_convert off
By using the parameter we have able to convert the special ASCII characters back to the correct values.
In the database , data is still stored as a garbled value, but by turning this conversion off, we are able view the correct value of this garbled data in the application
Use Sybase Server wide Setting or Session Level Setting :-
“disable character set conversi” == True”
Or set char_convert off
By using the parameter we have able to convert the special ASCII characters back to the correct values.
In the database , data is still stored as a garbled value, but by turning this conversion off, we are able view the correct value of this garbled data in the application
The main thing is to pick what ASE's default character set is, because you have conversion issues anything you use something other than that.
What platform are you running ASE on? You use an external tool to add character sets. On a UNIX/Linux platform it's $SYBASE/$SYBASE_ASE/bin/sq