Processing Euro from XML document sent via email

Hi all,

I have a procedure that loads inbound emails into a table called AC_INBOUND_MESSAGES. The procedure takes in the sender, receipient, message body(all VARCHAR2)  and an attachment(CLOB). My problem is that whenever there is a euro symbol in the xml attachment, the message fails to go into the AC_INBOUND_MESSAGES table. Various testing and debugging has isolated the problem line (as shown below). Does anyone know why this is happening and how i can resolve it.

Also i have tried all the xml approaches of changing the encoding, replacing the euro symbol with it encoded equivalents, etc... but none work. I'd really appreciate any feedback as this problem has been driving me crazy for a while now.

INSERT INTO AC_INBOUND_MESSAGES (
               MESSAGE_NO,
             SUBJECT,
             SENDER,
             RECIPIENT,
             BODY,
             ATTACHMENT,
             STATUS,
             CREATED_BY,
             CREATED_DATE )
 VALUES (
             l_message_no,
            p_in_subject,
            p_in_sender,
            p_in_recepient,
            p_in_body,
            XMLTYPE(p_in_attachment),   <---- Problem Line when Euro symbol in xml doc.
            'U',
            USER,
            SYSDATE );


Thanks in advance.
LVL 1
g00334120Asked:
Who is Participating?
 
NievergeltConnect With a Mentor Senior SW DevCommented:
Think about this:

The start of <?xml version="1.0" ?> in UTF16 little-endian unicode (i.e. in your file) looks like this:

BOM  <      ?       x      m      l       blank   v      e                             (BOM ist the "byte order marker" that declares that the following is UTF16 little-endian)
FFFE 3C00 3F00 7800 6D00 6C00 2000   7600 6500

If you take those byte values, strip the most significant bit (make it 7-bit), it becomes this:
7F7E 3C00 3F00 7800 6D00 6C00 2000   7600 6500

Translate the byte values to decimal:
127,126 60,0 63,0 120,0 109,0 108,0 32,0 118,0 101,0

Now compare that to what you got from DUMP.
As you can see, this is what you get.

Because your database character set is WE8MSWIN1252, this is the character set of CLOB and therefore the bytes you stuffed into the SP parameter are interpreted as characters
DEL ~ < NUL ? NUL x NUL m NUL
and so on.

I assumed that using setCharacterStream() would solve the problem of character set conversion.
Because your NLS_LANG setting say something like AMERICA_AMERICAN.WE8MSWIN1252, the API sees no reason for conversion and thus you have your problem.
For this reason I see two ways to resolve this problem:
a) Convert the attachment (the file) from Unicode to WE8MSWIN1252.
b) Change the CLOB parameter to NCLOB. This should at least work as long as you have no characters in your unicode document that cannnot be converted to WE8MSWIN1252 (not sure about details here).
0
 
NievergeltSenior SW DevCommented:
Some questions:
- What do you mean by "the message fails to go into the AC_INBOUND_MESSAGES table"? Is the record not inserted?
- What is the character set of your database?
0
 
g00334120Author Commented:
Yes, you're correct, the record is not inserted into the database.

The character set of the database is WE8MSWIN1252

Thanks.
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
NievergeltSenior SW DevCommented:
Do I understand you correctly that the insert fails without an error message?
What is the values of your NLS_NCHAR_CONV_EXCP initialisation parameter?
0
 
g00334120Author Commented:
I get the following error:

Error encountered while inserting message, error is ORA-31011: XML parsing failed

and the NLS_NCHAR_CONV_EXCP initialisation parameter is set to FALSE.

Thanks.
0
 
NievergeltSenior SW DevCommented:
OK.
Please post an example of an XML document that causes this error.
0
 
g00334120Author Commented:
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>23</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>2</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA37</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M12x100(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>232</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>3</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>BITSHSSS05</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>5mm HSS Drill Bit @ €50 each</MATERIAL_DESCRIPTION>
<UNIT>Each</UNIT>
<ORDER_QUANTITY>232</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      </SITE_ORDER_DETAILS>
      </SITE_ORDER>
</ABASYNC_BUSINESS_OBJECT>

If i remove the euro from the material description line, then the XML document goes into the table without a problem.

Thanks.
0
 
NievergeltSenior SW DevCommented:
Try to use the following text declaration:

<?xml version="1.0" encoding="windows-1252" ?>

as this one includes the Euro symbol.
0
 
g00334120Author Commented:
No, still getting the following error:


Error encountered while inserting message, error is ORA-31011: XML parsing failed
ORA-19202: Error
occurred in XML processing
LPX-00216: invalid character 0 (0x0)
Error at line 1
ORA-06512: at
"SYS.XMLTYPE", line 0
ORA-06512: at line 1


Thanks
0
 
NievergeltSenior SW DevCommented:
Strange, looks like Oracle does not know the windows-1252 encoding, I'll look into that some more.
By the way, what version of Oracle are you using?

Well, what should work, although I do not know if your happy with it:
Replace the € with either &#8364;  or &#x20ac; (the unicode code point of the Euro)
0
 
NievergeltSenior SW DevCommented:
Wait a second, now that I am looking at the Oracle message, I am wondering.

Could it be that the XML document effectively is in UTF-8 and your NLS_LANG setting on the client is not AL32UTF8?
0
 
g00334120Author Commented:
Sorry, when you mean client, do you mean the machine the database is on or a client who connects into it?
0
 
g00334120Author Commented:
Ok, here's a strange one for you, when i substituted code points instead of the eurosymbol, the same result still happens. Same error occurs. Weird.


Thanks.
0
 
g00334120Author Commented:
And its 9i. Sorry about that.
0
 
NievergeltSenior SW DevCommented:
Sorry, I was offline.

And you have no need to be sorry. I am just playing around with 10g and there are some strange and misterious things happening there even after applying the patchset to 10.1.0.4.

Whith the client, I meant the machine, where the insert is issued from.
But could you please also post the database character set.

> Ok, here's a strange one for you, when i substituted code points instead of the eurosymbol, the same result still happens. Same error occurs. Weird.
Actually no, this not strange, as it confirms that the problem is not really the Euro symbol.
I do not know how much you know about Unicode, but this sounds like UTF-16 is treated like a single-byte character set.
0
 
g00334120Author Commented:
nls_language  AMERICAN
nls_territory    AMERICA

NLS_NCHAR_CHARACTERSET  AL16UTF16

NLS_CHARACTERSET  WE8MSWIN1252

Thats the DB settings.


Thanks.
0
 
NievergeltSenior SW DevCommented:
Where are you calling the SP from?

On Ask Tom, Sean Dillon, Oracle's XML technologist mentions: "earlier versions of XML DB had some issues w/ character set support,"
I found many problems with 9i and XMLType but none that matches yours.
Use something like this to narrow down the problem:

Create a test table:
CREATE TABLE TEMP_TEST_XML_DOC (xml_text VARCHAR2(4048));

Add to your SP :

  ...
  xmlString VARCHAR2(4000);
DECLARE
  ...
  xmlString := RAWTOHEX(p_in_attachment);
  EXECUTE IMMEDIATE 'INSERT INTO TEMP_TEST_XML_DOC VALUES(:x)' USING xmlString;

then look at what you get in that table.
0
 
NievergeltSenior SW DevCommented:
Created a test table:
CREATE TABLE XMLTEST (TITLE VARCHAR2(100) NOT NULL,
                                       XML_DOC SYS.XMLTYPE)  

Created the following test procedure:

PROCEDURE INSERT_XML1 as
 nc CLOB;
begin
  nc := CHR(0) || CHR(20);
  INSERT INTO XMLTEST
  VALUES('Title1', XMLTYPE(nc));
end;

This will produce the same error that you get:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00216: invalid character 0 (0x0)
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 248
ORA-06512: at line 1
ORA-06512: at "CHRISTOPH.INSERT_XML1", line 48
ORA-06512: at line 2
0
 
NievergeltSenior SW DevCommented:
Sorry, I overlooked p_in_attachment being CLOB; you will have to use

  xmlString := RAWTOHEX(TO_CHAR(p_in_attachment))

in your SP to make it work.

Hope this helps   Christoph
0
 
g00334120Author Commented:
Hi, sorry i didn't get back to you over the weekend but i got swamped with things and didn't get near a machine.

I tried the above suggestion and was presented with the following error when i ran the procedure. I was using the line xmlString := RAWTOHEX(TO_CHAR(p_in_attachment))


Error encountered WHILE INSERTING message, error IS ORA-06502: PL/SQL: numeric or value error: hex to raw conversion error


Thanks.
0
 
NievergeltSenior SW DevCommented:
Hi

Please post
- the SP you used for testing (or at least all relevant parts)
- the portion of your code that calls the SP
- the table description of TEMP_TEST_XML_DOC

0
 
g00334120Author Commented:
relevnat portion of the sp
 l_message_no AC_INBOUND_MESSAGES.message_no%TYPE;
 xmlString VARCHAR2(4000);

BEGIN
  Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Get next value for sequence AC_INBOUND_MESSAGES_SEQ');

  dbms_output.put_Line(p_in_subject||'*'||p_in_sender||'*'||p_in_recepient||'*'||p_in_body);
 
 
  l_message_no :=  Ac_Sequences_Pkg.get_sequence_next_value_fn('AC_INBOUND_MESSAGES_SEQ');
 
  --Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Insert message no '||l_message_no||' into the ac_inbound_messages table.');
  --dbms_output.put_Line('write file');
  --AC_GENERIC_PKG.WRITE_CLOB_TO_FILE_PR('clob.xml',p_in_attachment);
  Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Insert new inbound message record.');
  dbms_output.put_Line('back to insert...');
  xmlString := RAWTOHEX(TO_CHAR(p_in_attachment));
  EXECUTE IMMEDIATE 'INSERT INTO TEMP_TEST_XML_DOC VALUES(:x)' USING xmlString;

  INSERT INTO AC_INBOUND_MESSAGES (
               MESSAGE_NO,
             SUBJECT,
             SENDER,
             RECIPIENT,
             BODY,
             ATTACHMENT,
             STATUS,
             CREATED_BY,
             CREATED_DATE )
 VALUES (
             l_message_no,
            p_in_subject,
            p_in_sender,
            p_in_recepient,
            p_in_body,
            XMLTYPE(p_in_attachment),
            'U',
            USER,
            SYSDATE );


The function is called from a stored java program.
insProcName = (CallableStatement) conn.prepareCall( "begin ac_inbound_messages_pkg.create_new_message_pr(?,?,?,?,?); end;");

The ? are populated correctly, i've checked that thoroughly.


dec of TEMP_TEST_XML_DOC
Name                                                  Null?    Type
----------------------------------------------------- -------- --------------
XML_TEXT                                                       VARCHAR2(4000)


Also I tried just inserting a euro symbol into the test table using a simple insert statement and when i selected it back, it came back as the following symbol.

¿


Thanks.
0
 
NievergeltSenior SW DevCommented:
> Also I tried just inserting a euro symbol into the test table using a simple insert statement and when i selected it back, it came back as the following symbol.

> Â¿

Yes, I think I see the problem now.
Although you did not show the Java code to set the parameters, I assume that a UTF-16 value is being passed as CLOB.

As for the xmlString := RAWTOHEX(TO_CHAR(p_in_attachment)), I am sorry, did not try it myself until now.

  SELECT DUMP(SUBSTR(TO_CHAR(p_in_attachment), 1, 100)) INTO xmlString;

works. I tested it ;-)
0
 
g00334120Author Commented:
Sorry, here's the java code to fill the statement.

System.out.println("ReceiveMail.Receive >> Create temp file "+ outputDir + "\\" + tempFileName);
                                    locName = "Open temp output file.";
                                    File tempFile  = new File(outputDir + "\\" + tempFileName);
      

                                    FileOutputStream oStream = new FileOutputStream(tempFile);
                                    int b;
                                    while ((b=is.read())!= -1)
                                    {
                                        System.out.print((char)b);
                                        oStream.write(b);
                                    }
                                    oStream.close();


                                    InputStream is1 = new BufferedInputStream(new FileInputStream(tempFile));

                                    insProcName.setString(1,subject);      
                                    insProcName.setString(2,from);      
                                    insProcName.setString(3,to);      
                                    insProcName.setString(4,body);      
                                    insProcName.setAsciiStream(5,is1,(int)is1.available());
                          
                      
                                    try
                                    {
                                        locName = "Call DB procedure to load file.";
                                        insProcName.execute();
                                        message.setFlag(Flags.Flag.DELETED, true);
                                        locName = "Delete temp file.";
                                        tempFile.delete();
                                    }

Thanks
0
 
NievergeltSenior SW DevCommented:
Yes, but how is body constructed?
Do you now get something from DUMP?
0
 
g00334120Author Commented:
Hi, the body of the message is set to a full-stop. only the attachment is read in from a file. The process is that,
1. mail is received and retrieved by the java program.
2. a temporary file is created for the attachment on the server. ( its just a copy of the XML document. Its in unicode format)
3. the temporary file is then read back by the code above and transferred into the database. (as long as there is no euro in the xml! ;)



Here is the results of the dump call:
Typ=1 Len=100 CharacterSet=WE8MSWIN1252: 127,126,60,0,63,0,120,0,109,0,108,0,32,0,118,0,101,0,114,0,115,0,105,0,111,0,110,0,61,0,34,0,49,0,46,0,48,0,34,0,32,0,63,0,62,0,13,0,10,0,60,0,65,0,66,0,65,0,83,0,89,0,78,0,67,0,95,0,66,0,85,0,83,0,73,0,78,0,69,0,83,0,83,0,95,0,79,0,66,0,74,0,69,0,67,0,84,0,32,0

and here is the corresponding file:
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K) @ €50</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
....
etc...

Thanks.
0
 
NievergeltSenior SW DevCommented:
Ok, I am little bit out of my depth with Java, so I overlooked the setAsciiStream() part.
I would suggest that you use getCharacterStream().
0
 
g00334120Author Commented:
I'll give that a try and get back to you.

Thanks.
0
 
g00334120Author Commented:
I tried replacing the setAsciiStream() with the setCharacterStream(). No joy. I'm still getting the same xml parsing error.

Thanks.
0
 
NievergeltSenior SW DevCommented:
Please post the DUMP output that you get with setCharacterStream.
0
 
g00334120Author Commented:
Here's the results of the dump with the SetCharacterStream()

Typ=1 Len=100 CharacterSet=WE8MSWIN1252: 255,254,60,0,63,0,120,0,109,0,108,0,32,0,118,0,101,0,114,0,115,0,105,0,111,0,110,0,61,0,34,0,49,0,46,0,48,0,34,0,32,0,63,0,62,0,13,0,10,0,60,0,65,0,66,0,65,0,83,0,89,0,78,0,67,0,95,0,66,0,85,0,83,0,73,0,78,0,69,0,83,0,83,0,95,0,79,0,66,0,74,0,69,0,67,0,84,0,32,0

and here is the corresponding file:
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K) @ 50</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>23</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>2</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA37</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M12x100(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
....
etc.


Thanks.
0
 
NievergeltSenior SW DevCommented:
Hi

I would very much appreciate, if you could post a short comment on what worked for you in the end.

Share and Enjoy  Christoph
0
 
g00334120Author Commented:
Hi,

It was changing the CLOB to an NCLOB that did it for me,
our database doesn't use many exotic characters, with the euro probably being the only one.

Anyways if the problem surfaces again, then I have this page as an extremely handy reference.

Thanks a million.
Emmet.
0
 
NievergeltSenior SW DevCommented:
Hi Emmet

Thank you for your reply.

Glad to have been of assistance.
Christoph
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.