apra-amcos

Why does Unicode data corrupt when inserted using the Sybase open client with MicroFocus COBOL?

When I insert Unicode data using the Sybase BCP utility, there is no problem. When I insert Unicode data from a COBOL program using embedded SQL, it becomes corrupted. The result is the same on Windows and Solaris (UNIX). I have set the "LANG" environment variable to "us_english.utf8" and issued "set char_convert off", to no avail.

The COBOL isn't the problem: I can manipulate the data, but I can't store it properly in Sybase "univarchar" columns. Using bcp with "-Y -Jutf8" works fine.
Gary Patterson, CISSP
Can you please show us the code that is failing?
apra-amcos

ASKER

The code doesn't actually fail; the data is stored as rubbish.

I've attached the program source. The column in question (col6) is a univarchar type. This program is as close as I can get to copying what I imagine BCP does.
"Unexpected output" is certainly in the "fail" category when I'm testing software.

There is no attachment.
Sorry about the attachment. It should now be OK.
ucload2.txt
ASKER CERTIFIED SOLUTION
Gary Patterson, CISSP
This solution is only available to members.
Thanks, Gary, your insights steered us in the right direction. After setting the server default character set to UTF8 (it was ISO_1) and specifying "set char_convert utf8" in the client-side application, it all worked exactly as it does with BCP. There was no need to cast or convert the text into univarchar, since UTF8 is now the server default.

Thanks again, much appreciated!
Glad you got it worked out.

Based on the documentation, I wouldn't necessarily have expected that to work.

Based on your results, Sybase apparently can convert UTF-8 to UTF-16 automatically; it just can't convert to UTF-16 automatically when the client is using a non-Unicode encoding.
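
For anyone who lands here later, the kind of corruption described above can be imitated outside Sybase. The following is only a rough sketch in Python using its built-in codecs; it does not call Sybase or the Open Client at all. The function names (store_as_utf16, read_back) and the sample string are made up for illustration, and the assumption that the server decoded the incoming UTF-8 bytes as ISO_1 before storing them as UTF-16 is a guess at the mechanism, not something confirmed in this thread.

```python
# Sketch only: imitates a character-set mismatch with Python codecs.
# It does not use Sybase; all names here are hypothetical.

def store_as_utf16(client_bytes: bytes, assumed_charset: str) -> bytes:
    """Decode the client's bytes using the charset the receiver assumes,
    then encode the result as UTF-16 (big-endian chosen arbitrarily)."""
    return client_bytes.decode(assumed_charset).encode("utf-16-be")


def read_back(stored: bytes) -> str:
    """Decode the stored UTF-16 bytes back into text."""
    return stored.decode("utf-16-be")


original = "caf\u00e9"                    # "café"
client_bytes = original.encode("utf-8")   # what a UTF-8 client would send

# Receiver wrongly assumes ISO 8859-1 (ISO_1): each UTF-8 byte becomes its own character.
wrong = read_back(store_as_utf16(client_bytes, "iso-8859-1"))

# Receiver correctly assumes UTF-8: the text survives the round trip.
right = read_back(store_as_utf16(client_bytes, "utf-8"))

print(repr(wrong))
print(repr(right))
```

In this sketch the mismatched path turns the single character "é" into two unrelated characters, while the matched path returns the original text unchanged. That is at least consistent with the fix reported above, where both ends were told the data is UTF-8.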