Theo Kouwenhoven

Translation problem EBCDIC ASCII UTF8

Hi Experts,

While creating (and transferring) an XML message from IBM i (via the IFS) to my PC, I lost a lot of special characters.
I changed the program so that the IFS file is created with CCSID(1208) and filled from my RPG program with some test characters. They are translated correctly, except for the Euro sign:

Source: €
Translation to: ¤

How do I solve that?
 
skullnobrains

you should declare the encoding before transferring, or use an editor that properly detects encodings such as jEdit, or convert to whatever encoding you want with iconv or a similar command

Hi Theo, as you probably know, the encoding of an IFS file is specified as an attribute of the file (view with WRKLNK, option 8).  Please verify the CCSID attribute in the IFS and confirm it is 1208.  But even if the file has a CCSID 1208 attribute, it is possible to put data into the file that is not properly encoded.

Use EDTF on the IBM i to open the file on the IFS and verify that it looks right.  If it has a CCSID 1208 attribute, and it looks wrong in EDTF, then the actual encoding is probably wrong, and you'll need to change the way you populate the file.

You mention an RPG program.  Is this just a test program, or is an RPG program used to create and populate the file somehow?  RPG is natively EBCDIC, and getting it to produce CCSID 1208 output to the IFS takes specific steps.

So:

1) How does the IFS file get created (specifically, what command or API, or is it received somehow)?
2) Are you writing directly using IFS write() API in RPG?  
3) Can you post a sample output file?  
4) Not sure what PC editor you are using, but suggest you try Notepad++ - it will inspect the file and determine encoding instead of assuming.  If you can't post a sample file, perhaps you can open in N++ and check the encoding it determines.
5) How, exactly, are you transferring the file?  Netserver file share, or Navigator?
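The checks in points 3 and 4 can also be done without an editor. The following is a minimal Python sketch, not part of the original thread: it decodes a byte string as strict UTF-8 and names every non-ASCII character it finds. The function name `describe_non_ascii` and the sample bytes are hypothetical.

```python
import unicodedata

def describe_non_ascii(data: bytes) -> list:
    """Decode bytes as strict UTF-8 and describe each non-ASCII character."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
        if ord(ch) > 0x7F
    ]

# Hypothetical sample: what a correctly encoded euro sign looks like.
sample = "<Test>\u20ac</Test>".encode("utf-8")
for line in describe_non_ascii(sample):
    print(line)
```

To inspect a downloaded file, pass its raw contents instead, for example the result of `open(path, "rb").read()`.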



Theo Kouwenhoven (ASKER)

Hi Gary,

The IFS actions are handled with Scott Klement's "IFSIO_H" API.
So here are the answers to your questions:

  1. How does the IFS file get created (specifically, what command or API, or is it received somehow)?
    fd = open(%trim($file_path)                                  
              : O_RDWR + O_CREAT + O_TRUNC + O_CCSID + O_TEXTDATA
              : S_IRWXU + S_IRWXG + S_IRWXO                      
              : 1208);                                          

  2. Are you writing directly using IFS write() API in RPG?
    write(fd: %addr(Xml$:*DATA): %len(%trim(XmlLine)));
        (note xml$ is declared as char(1024) CCSID(*UTF8)  Varying)

  3. Can you post a sample output file?
    Sure see attached (in next comment)
    This is the IFS content: (image not preserved)
     
  4. Not sure what PC editor you are using, but suggest you try Notepad++ - it will inspect the file and determine encoding instead of assuming.  If you can't post a sample file, perhaps you can open in N++ and check the encoding it determines.
    Notepad++ of course :-)
     
  5. How, exactly, are you transferring the file?  Netserver file share, or Navigator?
    For now with FTP (binary), but later it has to be added as Base64 to a web service call.

Attachment :-)
MyExample.rar

your input does not contain the correct UTF8 code for €
$ cat -v /tmp/MyExample.xml
<?xml version="1.0" encoding="UTF-8"?>^M
<Test>M-BM-0 M-CM-^X ~ ` ] @ # $ [ M-BM-$ ' " & * </Test>^M

as demonstrated here

$ echo '° Ø ~ ` ] @ # $ [ ¤ '"'"' " & *' | cat -v
M-BM-0 M-CM-^X ~ ` ] @ # $ [ M-BM-$ ' " & *

$ echo '° Ø ~ ` ] @ # $ [ € '"'"' " & *' | cat -v
M-BM-0 M-CM-^X ~ ` ] @ # $ [ M-bM-^BM-, ' " & *

i would assume M-bM-^BM-, is actually EBCDIC encoded €

$ od -c /tmp/MyExample.xml
0000000   <   ?   x   m   l       v   e   r   s   i   o   n   =   "   1
0000020   .   0   "       e   n   c   o   d   i   n   g   =   "   U   T
0000040   F   -   8   "   ?   >  \r  \n   <   T   e   s   t   > 302 260
0000060     303 230       ~       `       ]       @       #       $    
0000100   [     302 244       '       "       &       *       <   /   T
0000120   e   s   t   >  \r  \n
0000126
$ echo '° Ø ~ ` ] @ # $ [ € '"'"' " & *' | od -c
0000000 302 260     303 230       ~       `       ]       @       #    
0000020   $       [     342 202 254       '       "       &       *  \n
0000040
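The two dumps can be cross-checked with a short Python sketch (not part of the original thread; the helper name `utf8_octal` is hypothetical). It prints the UTF-8 bytes of the currency sign ¤ (U+00A4) and the euro sign € (U+20AC) in the same octal notation that `od -c` uses. The three bytes that `cat -v` displays as `M-bM-^BM-,` are E2 82 AC in hexadecimal, which is the UTF-8 encoding of €.

```python
def utf8_octal(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated 3-digit octal values."""
    return " ".join(f"{b:03o}" for b in ch.encode("utf-8"))

for ch in ("\u00a4", "\u20ac"):  # currency sign, euro sign
    print(f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" "), "=", utf8_octal(ch))

# U+00A4 c2 a4 = 302 244        (what the transferred file contains)
# U+20AC e2 82 ac = 342 202 254 (what a UTF-8 euro sign would be)
```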


Hi skullnobrains,

Thanks for finding my problem, but I need a solution.
The € sign comes from a software package and I can't change that.
On the IBM i I only have one Euro sign, so I can't write another Euro character to the IFS.

I also uploaded an € with CCSID(1208), and the result is a 5B character. (image not preserved)
Downloading the 5B gives me back the €.

But I don't want to do a translation for every line I read from a source.
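One possible explanation for a single EBCDIC euro byte ending up as ¤, offered here as an assumption and not confirmed anywhere in this thread: the euro-enabled EBCDIC CCSIDs (1140 to 1149) put € at the code point where their older base CCSIDs (37, 500, and so on) have ¤. If the data is converted to UTF-8 using the base CCSID, the same byte becomes ¤. The Python sketch below illustrates this with the 37/1140 pair, using Python's `cp037` and `cp1140` codecs as stand-ins. Which CCSIDs are actually in effect for the job and the data on the system discussed here is unknown.

```python
# Byte 0x9F is the euro sign in CCSID 1140 and the currency sign in CCSID 37.
# Illustrative pair only; the CCSIDs on the original system are not known.
EBCDIC_BYTE = bytes([0x9F])

as_cp037 = EBCDIC_BYTE.decode("cp037")    # non-euro base code page
as_cp1140 = EBCDIC_BYTE.decode("cp1140")  # euro-enabled variant

print(f"cp037 : U+{ord(as_cp037):04X} -> UTF-8 {as_cp037.encode('utf-8').hex(' ')}")
print(f"cp1140: U+{ord(as_cp1140):04X} -> UTF-8 {as_cp1140.encode('utf-8').hex(' ')}")

# cp037 : U+00A4 -> UTF-8 c2 a4
# cp1140: U+20AC -> UTF-8 e2 82 ac
```

If this is what is happening, the byte on the IBM i is fine and only the "from" CCSID used in the conversion differs, which would match the observation that the other test characters arrive intact.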


> the € sign comes from a software package and I can't change that.
> But I don't want to do a translation for every line I read from a source.
i'd try to modify the environment LC_* and LANG vars before running the program

but unless you are lucky enough the above works, i do not believe those 2 requirements can both be fulfilled

> but has to be added as Base64 to a webservice call
if you really cannot solve the issue at the source, maybe setting up that webservice can solve both problems at once by performing the required transformation on the fly which should be trivial to add?
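A rough Python sketch of such an on-the-fly transformation, under stated assumptions: the function name `to_base64_payload` and its `fix_currency_sign` flag are hypothetical, and replacing every ¤ with € is only safe if the data never contains a genuine currency sign. It works around the symptom and does not fix the conversion at the source.

```python
import base64

def to_base64_payload(utf8_bytes: bytes, fix_currency_sign: bool = False) -> str:
    """Base64-encode UTF-8 text, optionally mapping U+00A4 to U+20AC first."""
    text = utf8_bytes.decode("utf-8")  # fails loudly on invalid UTF-8
    if fix_currency_sign:
        # Lossy workaround: every currency sign is assumed to be a euro sign.
        text = text.replace("\u00a4", "\u20ac")
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

# Hypothetical input resembling the transferred file content.
raw = "<Test>\u00a4</Test>".encode("utf-8")
payload = to_base64_payload(raw, fix_currency_sign=True)
print(base64.b64decode(payload).decode("utf-8"))  # <Test>€</Test>
```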


Hi skullnobrains,

If the text is retrieved from the DB with a Webservice RPGLE, there seems to be no problem.
(Webservice is configured on the Apache server)


ASKER CERTIFIED SOLUTION
Gary Patterson, CISSP
This solution is only available to members.