Euro Sign using XML, VB, Java

I am using a VB client to pass data to a Java servlet, which processes the data (passed using MSXML3 on the client to Xerces-1_4_3 on the server) and writes it to an Oracle database. The problem I am having is with the euro symbol (€). I convert it from € to &#8364; using the Replace method in VB and then pass this XML (UTF-16) to the server, which tells me the document is malformed, with a SAX parser exception.

I presume that Xerces supports the Euro symbol so I am kinda stuck.

yoren Commented:
The Euro symbol is a valid XML character, and I verified that Xerces 1.4.4 handles it correctly. So, either it's a bug in version 1.4.3 (unlikely), or your document is not well-formed. Are you sure your document is encoded in UTF-16? If it is, are you properly declaring that with <?xml version='1.0' encoding='UTF-16'?> ?

You can probably debug this more easily on the command line. Another option is to post your document here; I may be able to spot the problem.
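A command-line check along those lines could look like the sketch below. It assumes the JAXP parser bundled with a current JDK rather than Xerces 1.4.3 itself, and the class and method names (EuroRoundTrip, rootText, sampleUtf16) are made up for illustration. It builds a small UTF-16 document containing the euro sign, parses the raw bytes, and reports whether the character survived.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for a standalone test; not code from the thread.
final class EuroRoundTrip {

    // Parses raw bytes and returns the text of the root element.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Document did not parse: " + e.getMessage(), e);
        }
    }

    // Builds a sample document that declares UTF-16 and contains U+20AC.
    static byte[] sampleUtf16() {
        String xml = "<?xml version='1.0' encoding='UTF-16'?><price>\u20AC1.35</price>";
        // Java's "UTF-16" charset writes a byte order mark when encoding,
        // which the parser uses to detect the byte order.
        return xml.getBytes(StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String text = rootText(sampleUtf16());
        System.out.println(text.equals("\u20AC1.35") ? "euro sign survived" : "euro sign was lost");
    }
}
```

If this passes but the servlet still fails, the bytes arriving at the server probably differ from what the client thinks it sent.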
The thing is that MSXML3 supports the windows-1252 encoding, which includes the Euro symbol. On principle, I think you should not worry about the euro symbol per se, as it has more to do with the presentation layer than the data layer. You can use namespaces to determine the datatype if you want to get down to partitioning the data. Do not use the symbol to figure out what kind of currency it is; that is better reserved for display purposes.

For example:
<?xml version="1.0" encoding="utf-8"?>
<prices>
  <price xmlns="uri:euro">1.35</price>
  <price xmlns="uri:us">2.00</price>
  <price xmlns="uri:uk">1.05</price>
</prices>
In fact, namespaces are pretty useful here if you are using XSLT to display the data, because you can sum prices for objects based on the namespace. That makes them an excellent choice =))
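The same per-namespace summing can be sketched on the Java side as well. This is only an illustration: the PriceTotals class, the <prices> wrapper element (a well-formed document needs a single root), and the second euro price are assumptions added for the example.

```java
import java.io.StringReader;
import java.math.BigDecimal;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: totals <price> elements that belong to one namespace.
final class PriceTotals {

    // Sample data; the <prices> root and the extra euro price are assumptions.
    static final String SAMPLE =
          "<prices>"
        + "<price xmlns=\"uri:euro\">1.35</price>"
        + "<price xmlns=\"uri:euro\">2.15</price>"
        + "<price xmlns=\"uri:us\">2.00</price>"
        + "<price xmlns=\"uri:uk\">1.05</price>"
        + "</prices>";

    static BigDecimal sum(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is off by default in JAXP and must be enabled.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList prices = doc.getElementsByTagNameNS(namespaceUri, "price");
            BigDecimal total = BigDecimal.ZERO;
            for (int i = 0; i < prices.getLength(); i++) {
                // BigDecimal avoids binary floating-point rounding on money values.
                total = total.add(new BigDecimal(prices.item(i).getTextContent().trim()));
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("Could not total prices: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println("euro total: " + sum(SAMPLE, "uri:euro"));
    }
}
```

The currency is carried by the namespace, so no currency symbol ever needs to appear in the data.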
What's the exact error you get from Xerces? Maybe the Euro symbol is not the problem.
adamgernon (Author) Commented:
It is definitely the Euro symbol that is causing the problem. When you enter ASCII text and call the server method using POST, it succeeds fine; however, if I just enter the symbol €, it collapses.
Interestingly enough, you have posted your message using hex 80 as the Euro symbol, yet you wish to use 8364 = hex 20AC, which of course is the official Unicode sign.

There are two cases:

a) The insertion/substitution of the sign with the entity is incorrect, so that the parser reports an error.

b) hex 20AC is not a valid character, since the parser supports only Unicode version 2.0 (20AC came later).

I suspect a) is true. Try substituting &#36; (which is the dollar sign) instead, just to check that you get the entity substitution correct. If that works, try using &#128;
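To see what each of these references actually turns into, a small check like the one below may help. It is a sketch using the JDK's bundled JAXP parser, and CharRefCheck is a made-up name. One thing worth knowing before trying &#128;: numeric character references in XML always name Unicode code points, so &#128; parses as the control character U+0080, not as the windows-1252 euro at byte 0x80.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: shows which code point a character reference resolves to.
final class CharRefCheck {

    // Parses the snippet and returns the first code point of the root element's text.
    static int firstCodePoint(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent().codePointAt(0);
        } catch (Exception e) {
            throw new IllegalStateException("Snippet did not parse: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        String[] references = { "&#36;", "&#8364;", "&#x20AC;", "&#128;" };
        for (String ref : references) {
            int codePoint = firstCodePoint("<p>" + ref + "</p>");
            // Print the reference next to the code point the parser produced.
            System.out.println(ref + " resolves to U+" + String.format("%04X", codePoint));
        }
    }
}
```

If the dollar reference parses but the document still fails, the substitution itself is probably fine and the problem lies in how the bytes are encoded or decoded.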
adamgernon (Author) Commented:
I have figured out that the problem is that on the client we are encoding using UTF-16, but on the Java server, where we are using the Xerces DOM implementation, I cannot manage to set the encoding to UTF-16. See my other question posted 05/28/2002. Anyway, let me know if you can solve this little conundrum.
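One possible way around this, sketched here under assumptions rather than tested against Xerces 1.4.3, is to decode the request body yourself and hand the parser characters instead of bytes. When a parser is given a character stream it does not have to guess the byte encoding. The sketch assumes the client sends UTF-16 with a byte order mark (Java's "UTF-16" decoder falls back to big-endian if there is none), and Utf16RequestParser and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: decodes a UTF-16 request body before parsing it.
final class Utf16RequestParser {

    static Document parseUtf16(InputStream body) {
        try {
            // The decoder consumes the byte order mark and picks the byte order from it.
            Reader chars = new InputStreamReader(body, StandardCharsets.UTF_16);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(chars));
        } catch (Exception e) {
            throw new IllegalStateException("Request body did not parse: " + e.getMessage(), e);
        }
    }

    // Simulates a little-endian client: a BOM followed by UTF-16LE bytes.
    static byte[] sampleBody() {
        String xml = "<?xml version='1.0' encoding='UTF-16'?><price>\u20AC5</price>";
        return ("\uFEFF" + xml).getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        Document doc = parseUtf16(new ByteArrayInputStream(sampleBody()));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

In a servlet, the stream passed to parseUtf16 would be the request's input stream; that wiring is left out here.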
adamgernon (Author) Commented:
Requesting deletion of this question; it has been here for over a year!
Question has a verified solution.
