• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 10959
  • Last Modified:

Character set encoding issues, Inserting special characters from Word into the Sybase db via hibernate

I am uploading a XML file which has some special quotes  in one of the element like (quotes from MS Word, some other characters cut and paste from Word ---- like this text ABCs purchase of 11, As a result of this payment, the facilitys ownership. )
Please look at the apostrophe after ABC and facility.

create table .ABC"(
        "Id" numeric(9) identity not null,
       "OrgId" numeric(9) null,
       "Notes" text null) Of course it has FK constraints etc.
Error in text is below also attached is the screenshot.
select @@identity]; SQL state [ZZZZZ]; error code [2402]; Error converting characters into server's character set. Some character(s) could not be converted. ; nested exception is com.sybase.jdbc3.jdbc.SybSQLException: Error converting characters into server's character set. Some character(s) could not be converted. at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate

I even tried forcing the hibernate to use the character set conversion via the hibernate config file (by specifying UTF-8 and ISO8859_1) but it did not work. I am using jconn3.jar from sybase
 jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14

<property name="dialect">org.hibernate.dialect.SybaseDialect</property>
      <property name="hibernate.show_sql">true</property>
      <property name="hibernate.format_sql">true</property>
      <property name="current_session_context_class">jta</property>
      <property name="cache.provider_class">org.hibernate.cache.NoCacheProvider</property>
      <property name="hibernate.query.substitutions">true 1, false 0, yes Y, no N</property>
      <property name="hibernate.bytecode.provider">cglib</property>
      <property name="hibernate.generate_statistics">false</property>
      <property name="hibernate.jdbc.batch_size">10</property>
        <!-- Fix for Special Character errors in Sybase-->
        <property name="hibernate.connection.useUnicode">true</property>
      <property name="hibernate.connection.characterEncoding">UTF-8</property>
      <property name="hibernate.connection.charSet">UTF-8</property>

Sybase online manual says that in Chapter 10,  When Sybbase Aaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).

 I want to avoid make a change on sybase server  side to make it accept unicode characters, as it might effect other applications. Is there a way we can control this through hibernate or only through sybase. Can we control this behaviour via jconn3 sybase JDBC jar.

I am at loss of ideas here to make this insert successfully in to the database.  Please let me know of the solution
  • 3
  • 3
1 Solution
Joe WoodhousePrincipal ConsultantCommented:
Well, what character sets are you using, and what do you want Sybase ASE to be able to support? It's pretty easy adding support for more character sets.

The main thing is to pick what ASE's default character set is, because you have conversion issues anything you use something other than that.

What platform are you running ASE on? You use an external tool to add character sets. On a UNIX/Linux platform it's $SYBASE/$SYBASE_ASE/bin/sqlloc. (Doing that from memory, may not be exactly correct, but the tool is definitely "sqlloc".) That's a GUI tool and needs an X environment, if you don't have one you'll have to use a resource file and "sqllocres" (reads inputs from the resource file; you have a sample one in $SYBASE/$SYBASE_ASE/init/sample_resource_files).
vishynAuthor Commented:
Hi Joe, Please see my server configuration below. Based on the information below. I want to be able to support frequently used quotes and double quotes etc that come from MS Word documents. I am saving the information that comes from these MS word documents into XML which is then saved to the backend. Please let me know of your thoughts based on the server information below and what I am trying to do.

Parameter Name                 Default     Memory Used Config Value Run Value   Unit                 Type      
--------------                 -------     ----------- ------------ ---------   ----                 ----      
default character set id                 4           0           3            3 id                   static    
default language id                      0           0           0            0 id                   dynamic    
default sortorder id                    50           0          50           50 id                   static    
disable character set conversi           0           0           0            0 switch               static    
enable unicode conversions               0           0           0            0 switch               dynamic    

Parameter Name                 Default     Memory Used Config Value Run Value   Unit                 Type      
--------------                 -------     ----------- ------------ ---------   ----                 ----      
default unicode sortorder           binary           0      binary       binary name                 static    
enable surrogate processing              1           0           1            1 switch               dynamic    
enable unicode conversions               0           0           0            0 switch               dynamic    
enable unicode normalization             1           0           1            1 switch               dynamic    
vishynAuthor Commented:
sqlloc and sqllocrecs might not be helpful in my case as this is a J2EE application which is being inserted via web logic server through hibernate as persistence layer.

Using one version of  jconn3.jar Connect (TM) for JDBC(TM)/6.0(Build 25570) /P/EBF12436/JDK14/ is working fine and inserting ? as per the sybase manual. When Adaptive Server encounters a conversion error while sending data to the client, it replaces the bytes of the suspect characters with ASCII question marks (?).

In test and production, the jconn3.jar is upgraded to  jConnect (TM) for JDBC(TM)/6.05(Build 25828)/P/EBF13044/JDK14/. As you see there is a difference in the jar files and this jar is causing the problems.

Hope this helps in understanding the problem
Cloud Class® Course: CompTIA Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

Joe WoodhousePrincipal ConsultantCommented:
sqlloc and sqllocres are server-side tools used to setup character sets within ASE - they're not relevant to your client or middle tier, so don't worry about J2EE and Web Logic.

I don't recall the character set codes off the top of my head, can you also run "sp_helpsort" in your ASE and post the results here?

Basically I'm just wanting to check which character set you have as your server default.

I am surprised that the .jar file is making a difference to you, but that suggests there's something wrong at the jConnect level rather than the ASE. Do you have a more recent jConnect EBF you can test, maybe that reverts to the older behaviour again? Failing that, maybe it's time for a Sybase Tech Support case against the jConnect...
vishynAuthor Commented:
Actually after spedning a considerable amount of time tweaking sybase server properties, It is not possible according to sybase documentation itself. As there are so many other databases on that server, I finally wrote a piece of java code to replace smart quotes with regular quotes. Here is the info below r'ing the jconnect jar. Thanks for your time.

 * For jConnect versions starting with 6.05, if jConnect cannot successfully convert the user data to the negotiated charset, it will send unconverted unicode characters to the server, if the server supports the unicode characters (ASE 12.5 or higher), else it will throw an exception.
Joe WoodhousePrincipal ConsultantCommented:
Aha, so it was a jConnect change.

And yes, your solution sounds much cleaner than messing around with switching everything to unicode, UTF8, forcing unicode normalisations, etc!
vishal chadhaCommented:
To answer this :-

Use Sybase Server wide Setting or Session Level Setting :-
                        “disable character set conversi” == True”
                        Or set char_convert off
            By using the parameter we have able to convert the special ASCII characters back to the correct values.

            In the database , data is still stored as a garbled value, but by turning this conversion off, we are able view the correct value of this garbled data in the application
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

  • 3
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now