Link to home
Start Free TrialLog in
Avatar of g00334120
g00334120

asked on

Processing Euro from XML document sent via email

Hi all,

I have a procedure that loads inbound emails into a table called AC_INBOUND_MESSAGES. The procedure takes in the sender, receipient, message body(all VARCHAR2)  and an attachment(CLOB). My problem is that whenever there is a euro symbol in the xml attachment, the message fails to go into the AC_INBOUND_MESSAGES table. Various testing and debugging has isolated the problem line (as shown below). Does anyone know why this is happening and how i can resolve it.

Also i have tried all the xml approaches of changing the encoding, replacing the euro symbol with it encoded equivalents, etc... but none work. I'd really appreciate any feedback as this problem has been driving me crazy for a while now.

INSERT INTO AC_INBOUND_MESSAGES (
               MESSAGE_NO,
             SUBJECT,
             SENDER,
             RECIPIENT,
             BODY,
             ATTACHMENT,
             STATUS,
             CREATED_BY,
             CREATED_DATE )
 VALUES (
             l_message_no,
            p_in_subject,
            p_in_sender,
            p_in_recepient,
            p_in_body,
            XMLTYPE(p_in_attachment),   <---- Problem Line when Euro symbol in xml doc.
            'U',
            USER,
            SYSDATE );


Thanks in advance.
Avatar of Nievergelt
Nievergelt
Flag of Switzerland image

Some questions:
- What do you mean by "the message fails to go into the AC_INBOUND_MESSAGES table"? Is the record not inserted?
- What is the character set of your database?
Avatar of g00334120
g00334120

ASKER

Yes, you're correct, the record is not inserted into the database.

The character set of the database is WE8MSWIN1252

Thanks.
Do I understand you correctly that the insert fails without an error message?
What is the values of your NLS_NCHAR_CONV_EXCP initialisation parameter?
I get the following error:

Error encountered while inserting message, error is ORA-31011: XML parsing failed

and the NLS_NCHAR_CONV_EXCP initialisation parameter is set to FALSE.

Thanks.
OK.
Please post an example of an XML document that causes this error.
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>23</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>2</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA37</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M12x100(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>232</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>3</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>BITSHSSS05</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>5mm HSS Drill Bit @ €50 each</MATERIAL_DESCRIPTION>
<UNIT>Each</UNIT>
<ORDER_QUANTITY>232</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      </SITE_ORDER_DETAILS>
      </SITE_ORDER>
</ABASYNC_BUSINESS_OBJECT>

If i remove the euro from the material description line, then the XML document goes into the table without a problem.

Thanks.
Try to use the following text declaration:

<?xml version="1.0" encoding="windows-1252" ?>

as this one includes the Euro symbol.
No, still getting the following error:


Error encountered while inserting message, error is ORA-31011: XML parsing failed
ORA-19202: Error
occurred in XML processing
LPX-00216: invalid character 0 (0x0)
Error at line 1
ORA-06512: at
"SYS.XMLTYPE", line 0
ORA-06512: at line 1


Thanks
Strange, looks like Oracle does not know the windows-1252 encoding, I'll look into that some more.
By the way, what version of Oracle are you using?

Well, what should work, although I do not know if your happy with it:
Replace the € with either &#8364;  or &#x20ac; (the unicode code point of the Euro)
Wait a second, now that I am looking at the Oracle message, I am wondering.

Could it be that the XML document effectively is in UTF-8 and your NLS_LANG setting on the client is not AL32UTF8?
Sorry, when you mean client, do you mean the machine the database is on or a client who connects into it?
Ok, here's a strange one for you, when i substituted code points instead of the eurosymbol, the same result still happens. Same error occurs. Weird.


Thanks.
And its 9i. Sorry about that.
Sorry, I was offline.

And you have no need to be sorry. I am just playing around with 10g and there are some strange and misterious things happening there even after applying the patchset to 10.1.0.4.

Whith the client, I meant the machine, where the insert is issued from.
But could you please also post the database character set.

> Ok, here's a strange one for you, when i substituted code points instead of the eurosymbol, the same result still happens. Same error occurs. Weird.
Actually no, this not strange, as it confirms that the problem is not really the Euro symbol.
I do not know how much you know about Unicode, but this sounds like UTF-16 is treated like a single-byte character set.
nls_language  AMERICAN
nls_territory    AMERICA

NLS_NCHAR_CHARACTERSET  AL16UTF16

NLS_CHARACTERSET  WE8MSWIN1252

Thats the DB settings.


Thanks.
Where are you calling the SP from?

On Ask Tom, Sean Dillon, Oracle's XML technologist mentions: "earlier versions of XML DB had some issues w/ character set support,"
I found many problems with 9i and XMLType but none that matches yours.
Use something like this to narrow down the problem:

Create a test table:
CREATE TABLE TEMP_TEST_XML_DOC (xml_text VARCHAR2(4048));

Add to your SP :

  ...
  xmlString VARCHAR2(4000);
DECLARE
  ...
  xmlString := RAWTOHEX(p_in_attachment);
  EXECUTE IMMEDIATE 'INSERT INTO TEMP_TEST_XML_DOC VALUES(:x)' USING xmlString;

then look at what you get in that table.
Created a test table:
CREATE TABLE XMLTEST (TITLE VARCHAR2(100) NOT NULL,
                                       XML_DOC SYS.XMLTYPE)  

Created the following test procedure:

PROCEDURE INSERT_XML1 as
 nc CLOB;
begin
  nc := CHR(0) || CHR(20);
  INSERT INTO XMLTEST
  VALUES('Title1', XMLTYPE(nc));
end;

This will produce the same error that you get:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00216: invalid character 0 (0x0)
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 248
ORA-06512: at line 1
ORA-06512: at "CHRISTOPH.INSERT_XML1", line 48
ORA-06512: at line 2
Sorry, I overlooked p_in_attachment being CLOB; you will have to use

  xmlString := RAWTOHEX(TO_CHAR(p_in_attachment))

in your SP to make it work.

Hope this helps   Christoph
Hi, sorry i didn't get back to you over the weekend but i got swamped with things and didn't get near a machine.

I tried the above suggestion and was presented with the following error when i ran the procedure. I was using the line xmlString := RAWTOHEX(TO_CHAR(p_in_attachment))


Error encountered WHILE INSERTING message, error IS ORA-06502: PL/SQL: numeric or value error: hex to raw conversion error


Thanks.
Hi

Please post
- the SP you used for testing (or at least all relevant parts)
- the portion of your code that calls the SP
- the table description of TEMP_TEST_XML_DOC

relevnat portion of the sp
 l_message_no AC_INBOUND_MESSAGES.message_no%TYPE;
 xmlString VARCHAR2(4000);

BEGIN
  Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Get next value for sequence AC_INBOUND_MESSAGES_SEQ');

  dbms_output.put_Line(p_in_subject||'*'||p_in_sender||'*'||p_in_recepient||'*'||p_in_body);
 
 
  l_message_no :=  Ac_Sequences_Pkg.get_sequence_next_value_fn('AC_INBOUND_MESSAGES_SEQ');
 
  --Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Insert message no '||l_message_no||' into the ac_inbound_messages table.');
  --dbms_output.put_Line('write file');
  --AC_GENERIC_PKG.WRITE_CLOB_TO_FILE_PR('clob.xml',p_in_attachment);
  Ac_System_Errors_Pkg.SET_PROG_POSITION_PR('Insert new inbound message record.');
  dbms_output.put_Line('back to insert...');
  xmlString := RAWTOHEX(TO_CHAR(p_in_attachment));
  EXECUTE IMMEDIATE 'INSERT INTO TEMP_TEST_XML_DOC VALUES(:x)' USING xmlString;

  INSERT INTO AC_INBOUND_MESSAGES (
               MESSAGE_NO,
             SUBJECT,
             SENDER,
             RECIPIENT,
             BODY,
             ATTACHMENT,
             STATUS,
             CREATED_BY,
             CREATED_DATE )
 VALUES (
             l_message_no,
            p_in_subject,
            p_in_sender,
            p_in_recepient,
            p_in_body,
            XMLTYPE(p_in_attachment),
            'U',
            USER,
            SYSDATE );


The function is called from a stored java program.
insProcName = (CallableStatement) conn.prepareCall( "begin ac_inbound_messages_pkg.create_new_message_pr(?,?,?,?,?); end;");

The ? are populated correctly, i've checked that thoroughly.


dec of TEMP_TEST_XML_DOC
Name                                                  Null?    Type
----------------------------------------------------- -------- --------------
XML_TEXT                                                       VARCHAR2(4000)


Also I tried just inserting a euro symbol into the test table using a simple insert statement and when i selected it back, it came back as the following symbol.

¿


Thanks.
> Also I tried just inserting a euro symbol into the test table using a simple insert statement and when i selected it back, it came back as the following symbol.

> Â¿

Yes, I think I see the problem now.
Although you did not show the Java code to set the parameters, I assume that a UTF-16 value is being passed as CLOB.

As for the xmlString := RAWTOHEX(TO_CHAR(p_in_attachment)), I am sorry, did not try it myself until now.

  SELECT DUMP(SUBSTR(TO_CHAR(p_in_attachment), 1, 100)) INTO xmlString;

works. I tested it ;-)
Sorry, here's the java code to fill the statement.

System.out.println("ReceiveMail.Receive >> Create temp file "+ outputDir + "\\" + tempFileName);
                                    locName = "Open temp output file.";
                                    File tempFile  = new File(outputDir + "\\" + tempFileName);
      

                                    FileOutputStream oStream = new FileOutputStream(tempFile);
                                    int b;
                                    while ((b=is.read())!= -1)
                                    {
                                        System.out.print((char)b);
                                        oStream.write(b);
                                    }
                                    oStream.close();


                                    InputStream is1 = new BufferedInputStream(new FileInputStream(tempFile));

                                    insProcName.setString(1,subject);      
                                    insProcName.setString(2,from);      
                                    insProcName.setString(3,to);      
                                    insProcName.setString(4,body);      
                                    insProcName.setAsciiStream(5,is1,(int)is1.available());
                          
                      
                                    try
                                    {
                                        locName = "Call DB procedure to load file.";
                                        insProcName.execute();
                                        message.setFlag(Flags.Flag.DELETED, true);
                                        locName = "Delete temp file.";
                                        tempFile.delete();
                                    }

Thanks
Yes, but how is body constructed?
Do you now get something from DUMP?
Hi, the body of the message is set to a full-stop. only the attachment is read in from a file. The process is that,
1. mail is received and retrieved by the java program.
2. a temporary file is created for the attachment on the server. ( its just a copy of the XML document. Its in unicode format)
3. the temporary file is then read back by the code above and transferred into the database. (as long as there is no euro in the xml! ;)



Here is the results of the dump call:
Typ=1 Len=100 CharacterSet=WE8MSWIN1252: 127,126,60,0,63,0,120,0,109,0,108,0,32,0,118,0,101,0,114,0,115,0,105,0,111,0,110,0,61,0,34,0,49,0,46,0,48,0,34,0,32,0,63,0,62,0,13,0,10,0,60,0,65,0,66,0,65,0,83,0,89,0,78,0,67,0,95,0,66,0,85,0,83,0,73,0,78,0,69,0,83,0,83,0,95,0,79,0,66,0,74,0,69,0,67,0,84,0,32,0

and here is the corresponding file:
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K) @ €50</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
....
etc...

Thanks.
Ok, I am little bit out of my depth with Java, so I overlooked the setAsciiStream() part.
I would suggest that you use getCharacterStream().
I'll give that a try and get back to you.

Thanks.
I tried replacing the setAsciiStream() with the setCharacterStream(). No joy. I'm still getting the same xml parsing error.

Thanks.
Please post the DUMP output that you get with setCharacterStream.
Here's the results of the dump with the SetCharacterStream()

Typ=1 Len=100 CharacterSet=WE8MSWIN1252: 255,254,60,0,63,0,120,0,109,0,108,0,32,0,118,0,101,0,114,0,115,0,105,0,111,0,110,0,61,0,34,0,49,0,46,0,48,0,34,0,32,0,63,0,62,0,13,0,10,0,60,0,65,0,66,0,65,0,83,0,89,0,78,0,67,0,95,0,66,0,85,0,83,0,73,0,78,0,69,0,83,0,83,0,95,0,79,0,66,0,74,0,69,0,67,0,84,0,32,0

and here is the corresponding file:
<?xml version="1.0" ?>
<ABASYNC_BUSINESS_OBJECT TYPE="SITE_ORDER">
      <SITE_ORDER>
<PROJECT_CODE>A95</PROJECT_CODE>
<SITE_ORDER_TYPE>S</SITE_ORDER_TYPE>
<SITE_ORDER_NO>A9515458S</SITE_ORDER_NO>
<SITE_ORDER_DATE>05/05/2005</SITE_ORDER_DATE>
<DELIVERY_DATE>06/05/2005</DELIVERY_DATE>
<SUPPLIER_CODE>ACEF01</SUPPLIER_CODE>
<REF>.</REF>
      <SITE_ORDER_DETAILS>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>1</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA27</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M6x65(Hilti HSA-K) @ 50</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
<ORDER_QUANTITY>23</ORDER_QUANTITY>
      </SITE_ORDER_DETAIL>
      <SITE_ORDER_DETAIL>
<SITE_ORDER_DETAIL_NO>2</SITE_ORDER_DETAIL_NO>
<COST_CODE>MJ01</COST_CODE>
<MATERIAL_CODE>ANCHEXPA37</MATERIAL_CODE>
<MATERIAL_DESCRIPTION>Exp Anch M12x100(Hilti HSA-K)</MATERIAL_DESCRIPTION>
<UNIT>100</UNIT>
....
etc.


Thanks.
ASKER CERTIFIED SOLUTION
Avatar of Nievergelt
Nievergelt
Flag of Switzerland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi

I would very much appreciate, if you could post a short comment on what worked for you in the end.

Share and Enjoy  Christoph
Hi,

It was changing the CLOB to an NCLOB that did it for me,
our database doesn't use many exotic characters, with the euro probably being the only one.

Anyways if the problem surfaces again, then I have this page as an extremely handy reference.

Thanks a million.
Emmet.
Hi Emmet

Thank you for your reply.

Glad to have been of assistance.
Christoph