Clif
asked on
XML Read Error: There is no Unicode byte order mark
I'm trying to read an XML file that our customer will be sending us weekly. I'm writing code to read it as below:
Dim sWorkRequested As String
Dim sPoNumber As String
Dim m_xmld As XmlDocument
Dim m_nodelist As XmlNodeList
Dim m_node As XmlNode

m_xmld = New XmlDocument
m_xmld.Load("JobList.xml") '<---Error occurs here
m_nodelist = m_xmld.SelectNodes("/JobRequest")
For Each m_node In m_nodelist
    sWorkRequested = m_node.Item("WorkRequested").InnerText
    sPoNumber = m_node.Item("PoNumber").InnerText
    Debug.Print("sWorkRequested" & sWorkRequested)
    Debug.Print("sPoNumber" & sPoNumber)
Next
However, I'm getting an error:
"There is no Unicode byte order mark"
From researching the issue, it seems that the first line of the XML,
<?xml version="1.0" encoding="utf-16"?>
should be
<?xml version="1.0"?>
However, I can't ask the customer to change it; I need to take the encoding tag into account.
So, how do I do this within the code sample above?
TIA
Here's a sample of the XML file:
<?xml version="1.0" encoding="utf-16"?>
<JobRequest>
<WorkRequested>Structural Analysis</WorkRequested>
<PoNumber>POZ000000076219</PoNumber>
<PoAmount>500</PoAmount>
</JobRequest>
Is this document really UTF-16 encoded? If so, then it needs byte order marks. Otherwise, you need to ask that the source application change the encoding type. Generally speaking, UTF-8 is probably what you're after.
ASKER
If, just for grins & giggles, I take out the (encoding="utf-16") tag, the code reads the file correctly.
Exactly. If you specify UTF-16 in your document, then byte order marks are required. Your document does not provide them, hence the error.
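To check whether a given file actually starts with a byte order mark, the first few bytes can be inspected directly. The following is only a minimal sketch: the module and function names are made up for illustration, and the file name is the one used in the question.

```vbnet
Imports System
Imports System.IO

Module BomCheck
    ' Hypothetical helper: describes the byte order mark at the start of a file, if any.
    Function DescribeBom(path As String) As String
        Dim head(2) As Byte ' three bytes, indexes 0..2
        Dim count As Integer
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            count = fs.Read(head, 0, head.Length)
        End Using

        If count >= 3 AndAlso head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF Then
            Return "UTF-8 BOM"
        ElseIf count >= 2 AndAlso head(0) = &HFF AndAlso head(1) = &HFE Then
            ' FF FE also begins a UTF-32 little-endian BOM; that case is not distinguished here.
            Return "UTF-16 little-endian BOM"
        ElseIf count >= 2 AndAlso head(0) = &HFE AndAlso head(1) = &HFF Then
            Return "UTF-16 big-endian BOM"
        End If
        Return "no BOM"
    End Function

    Sub Main()
        Console.WriteLine(DescribeBom("JobList.xml"))
    End Sub
End Module
```

If this reports no BOM while the declaration says utf-16, the file is most likely not UTF-16 at all, which matches the error being raised.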
ASKER
I understand the error; as I said in the OP, I have done some research.
The question is: how do I resolve the error without asking the customer to change their file (the customer is always right)?
In my research, I've seen some suggested solutions, but they were either in C# or I did not understand what was being suggested.
Can you ask the customer whether they intended to mark the XML file as utf-16 when the file they are transmitting is not in that format? Maybe they are not aware that the file is being sent with the wrong encoding declaration. If they say they are aware of it and that is how they want to do it, then you can write code to open the file in text mode, change utf-16 to utf-8, save the file back to the file system, and then process your XML file as normal.
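The text-mode rewrite described above could look roughly like the sketch below. It assumes the bytes are really UTF-8 (or plain ASCII) and that the declaration appears exactly as in the sample file. The output file name is hypothetical, and writing to a separate file instead of overwriting the original is a choice made here for safety.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml

Module FixDeclaration
    Sub Main()
        Dim sourcePath As String = "JobList.xml"    ' file received from the customer
        Dim fixedPath As String = "JobList.utf8.xml" ' hypothetical name for the corrected copy

        ' Read the file as text. ReadAllText honors a BOM if one is present
        ' and otherwise decodes the bytes as UTF-8.
        Dim content As String = File.ReadAllText(sourcePath)

        ' Make the declaration match the real encoding.
        content = content.Replace("encoding=""utf-16""", "encoding=""utf-8""")

        ' Save as UTF-8 without a BOM, then load the corrected file as usual.
        File.WriteAllText(fixedPath, content, New UTF8Encoding(False))

        Dim doc As New XmlDocument()
        doc.Load(fixedPath)
    End Sub
End Module
```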
If it works without it, then strip it out:
e.g.
Using buffer As New System.IO.StringWriter()
    Using reader As New System.IO.StreamReader("input.xml")
        Dim xmlDeclaration As String = reader.ReadLine()
        xmlDeclaration = xmlDeclaration.Replace("encoding=""utf-16""", String.Empty)
        buffer.WriteLine(xmlDeclaration)
        While Not reader.EndOfStream buffer.WriteLine(reader.ReadLine())
    End Using
    Dim moddedXml As String = buffer.ToString()
    Dim xdoc As New System.Xml.XmlDocument()
    xdoc.LoadXml(moddedXml)
End Using
From a design perspective, you are coding around bad data rather than fixing the bad data. If the user decides to properly encode the file at some point without telling you, then your new logic breaks.
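One way to reduce that risk, offered here only as a sketch and not as the approach any poster in this thread settled on, is to hand XmlDocument.Load a StreamReader instead of a file name. The reader decodes the bytes itself, honoring a BOM if one is present and falling back to UTF-8 otherwise, and the XML parser does not switch encodings based on the declaration when it is given already decoded text. This assumes a file without a BOM really is UTF-8 or ASCII. The module and function names are illustrative.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module JobListReader
    ' Hypothetical helper: loads the XML through a StreamReader so the text is
    ' already decoded before the parser reaches the encoding declaration.
    Function LoadJobList(path As String) As XmlDocument
        Dim doc As New XmlDocument()
        ' With a BOM, the reader uses the encoding the BOM indicates.
        ' Without one, it decodes as UTF-8 (assumption: that is the real encoding).
        Using reader As New StreamReader(path, detectEncodingFromByteOrderMarks:=True)
            doc.Load(reader)
        End Using
        Return doc
    End Function

    Sub Main()
        Dim doc As XmlDocument = LoadJobList("JobList.xml")
        For Each node As XmlNode In doc.SelectNodes("/JobRequest")
            Console.WriteLine("WorkRequested: " & node.Item("WorkRequested").InnerText)
            Console.WriteLine("PoNumber: " & node.Item("PoNumber").InnerText)
        Next
    End Sub
End Module
```

Because the declaration is left untouched, the same code should keep working if the customer later starts sending genuine UTF-16 with a BOM.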
ASKER
FernandoSoto,
Unfortunately I cannot ask the customer to change their coding. Apparently it works for their other vendors.
kaufmed,
Your code has an error (While not ended)
If it should be UTF-8, why not do this in your code:
xmlDeclaration = xmlDeclaration.Replace("encoding=""utf-16""", "encoding=""utf-8""")
Would that not solve the issue of it breaking should the customer suddenly decide to code it correctly?
I await your "fixed" code.
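For reference, a version of that snippet with the While loop properly closed and the utf-8 replacement applied might look like the sketch below. It keeps the original's assumptions: the file is named input.xml, it is not empty, and the XML declaration sits alone on the first line. The module wrapper is added only to make it self-contained.

```vbnet
Imports System.IO
Imports System.Xml

Module StripDeclaration
    Sub Main()
        Dim xdoc As New XmlDocument()
        Using buffer As New StringWriter()
            Using reader As New StreamReader("input.xml")
                ' Rewrite only the first line, where the declaration is assumed to be.
                Dim xmlDeclaration As String = reader.ReadLine()
                xmlDeclaration = xmlDeclaration.Replace("encoding=""utf-16""", "encoding=""utf-8""")
                buffer.WriteLine(xmlDeclaration)

                ' Copy the rest of the file unchanged.
                While Not reader.EndOfStream
                    buffer.WriteLine(reader.ReadLine())
                End While
            End Using

            ' Parse the modified text from memory.
            xdoc.LoadXml(buffer.ToString())
        End Using
    End Sub
End Module
```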
Hi Clif;
My post was not asking you to tell the customer that they are doing it wrong, only to verify that the utf-16 declaration does not match what they are sending out. If they say this is what they want, then all is well and good: you have documentation of it, which is called CYA. Otherwise, just code around the badly formed XML document as I stated in my last post. Just remember, as @kaufmed stated, that if they correct the issue in the future, it will break your code.
ASKER CERTIFIED SOLUTION
ASKER
It works. I'll keep your concerns in mind.
Thanks.