Parse XML with multiple oddly declared namespaces using VB.NET

Hello Experts!

I'm attempting to parse an XML string using VB.NET 2010. The XML contains multiple namespaces which are declared in a way I haven't seen before. Rather than using a string literal for the URI, the prefix and the URI are the same. I need to be able to capture all of the values in the XML but I'm getting an error which reads:

'xyz' is an unexpected token. The expected token is '"' or '''. Line 7, position 26.

Here is a sample of the XML:
<COMPANYXYZ>
  <SYSTEM101>
    <HEADER>
      <ID>ABCD-101-ZYXW-99</ID>
    </HEADER>
    <BODY>
      <PROCESS xmlns:xyz=xyz>
        <xyz:JOBNUMBER>A523</xyz:JOBNUMBER>
        <xyz:BATCH>
          <xyz:STATUS>
            <xyz:CODE>0</xyz:CODE>
          </xyz:STATUS>
          <xyz:CARDINFO>
            <subSys:RESPONSE xmlns:subSys=subSys>
              <responseDetail NAME="STATUS">FINISHED</responseDetail>
              <responseDetail NAME="RESPONSE_MSG">The job finished.</responseDetail>
            </subSys:RESPONSE>
          </xyz:CARDINFO>
        </xyz:BATCH>
      </PROCESS>
    </BODY>
  </SYSTEM101>
</COMPANYXYZ>


Here is a sample of the VB code:
        Dim theXml As String = GetXmlString()

        Dim xmlDoc As XmlDocument = New XmlDocument()
        xmlDoc.LoadXml(theXml)

        Dim idNums As XmlNodeList
        idNums = xmlDoc.GetElementsByTagName("ID")
        Console.WriteLine("ID: " + idNums(0).InnerXml)

        Dim respDetails As XmlNodeList
        respDetails = xmlDoc.GetElementsByTagName("responseDetail")
        Dim i As Integer
        For i = 0 To respDetails.Count - 1
            Console.WriteLine(respDetails(i).InnerXml)
        Next


Details:
  • I didn't create the XML and have no control over how it is being sent to me.
  • The XML in the sample above correctly reflects the error I'm seeing - specifically the way the namespaces are declared - however I've edited the tags for security reasons. (The names have been changed to protect the innocent, so to speak.)
  • I don't have a lot of experience working with namespaces in XML in general in VB.
  • I've looked into the XmlNamespaceManager class but I haven't been able to get it to work in this scenario.
  • When I remove all references to namespaces from the XML that I'm being sent, the code I'm using works, however I'd rather not edit what I'm being sent.
  • The error is appearing as soon as the xmlDoc.LoadXml() method is called.
  • The VB code I'm using may not be the most efficient or best way of pulling out the data - it's just for demonstration and the code is failing before it reaches that point anyway. Please feel free to suggest a better method.

My question is, what do I need to do in order to parse this XML correctly?
Asked by Jeff Edmunds (Application Developer/SQL DBA)

zc2 commented:
The XML is not well-formed, so a standard parser cannot load it.
The namespace declaration has no quotes around the URI. It should be:
<PROCESS xmlns:xyz="xyz">


Since you can't control how the XML is created, I'd suggest fixing its well-formedness before feeding it to the parser. Load the XML into a string variable, then do a replacement such as:
xml_str = xml_str.Replace("xmlns:xyz=xyz", "xmlns:xyz='xyz'")

I am not sure about the exact syntax of the replace() method in the CLR, but you should get the idea.
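
If the prefixes are not known in advance, the same idea can be sketched with a regular expression instead of one Replace call per declaration. This is only a rough sketch: QuoteXmlnsValues is a hypothetical helper name, and it assumes the unquoted values contain no whitespace, quotes, or ">" characters, as in the sample above.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports System.Xml

Module QuoteNamespaceDeclarations
    ' Hypothetical helper: wraps unquoted xmlns:prefix=value declarations
    ' in double quotes so that XmlDocument.LoadXml can parse the result.
    Function QuoteXmlnsValues(ByVal rawXml As String) As String
        ' Group 1: the xmlns:prefix part.
        ' (?![""']) skips declarations that are already quoted.
        ' Group 2: the unquoted value, ending before whitespace, ">" or "/>".
        Dim pattern As String = "(xmlns:[\w.-]+)=(?![""'])([^\s>""']+?)(?=\s|/?>)"
        Return Regex.Replace(rawXml, pattern, "$1=""$2""")
    End Function

    Sub Main()
        ' Shortened stand-in for the XML from the question.
        Dim theXml As String = "<PROCESS xmlns:xyz=xyz><xyz:JOBNUMBER>A523</xyz:JOBNUMBER></PROCESS>"

        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(QuoteXmlnsValues(theXml))
        Console.WriteLine(xmlDoc.DocumentElement.OuterXml)
    End Sub
End Module
```

Declarations that are already quoted are left alone. The pattern does not touch an unprefixed xmlns=value declaration, and it would also rewrite matching text that happens to appear inside element content, so it is worth checking against real messages before relying on it.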

Gren Mus (EAI specialist) commented:
You first need to make the XML well-formed programmatically by putting quotes around both xmlns values. The resulting XML, which then becomes the input to your code, looks like this:

<COMPANYXYZ>
  <SYSTEM101>
    <HEADER>
      <ID>ABCD-101-ZYXW-99</ID>
    </HEADER>
    <BODY>
      <PROCESS xmlns:xyz="xyz">
        <xyz:JOBNUMBER>A523</xyz:JOBNUMBER>
        <xyz:BATCH>
          <xyz:STATUS>
            <xyz:CODE>0</xyz:CODE>
          </xyz:STATUS>
          <xyz:CARDINFO>
            <subSys:RESPONSE xmlns:subSys="subSys">
              <responseDetail NAME="STATUS">FINISHED</responseDetail>
              <responseDetail NAME="RESPONSE_MSG">The job finished.</responseDetail>
            </subSys:RESPONSE>
          </xyz:CARDINFO>
        </xyz:BATCH>
      </PROCESS>
    </BODY>
  </SYSTEM101>
</COMPANYXYZ>
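
Once the XML is in this well-formed shape, the namespaced values can be read with XPath and the XmlNamespaceManager class mentioned in the question. The following is a minimal sketch, not a definitive implementation: the module name and the shortened inline XML are made up for illustration, and the namespace URIs "xyz" and "subSys" are taken from the corrected declarations above.

```vbnet
Imports System
Imports System.Xml

Module ReadNamespacedValues
    Sub Main()
        ' Shortened version of the corrected sample, inlined for illustration.
        Dim theXml As String = _
            "<BODY><PROCESS xmlns:xyz=""xyz"">" & _
            "<xyz:JOBNUMBER>A523</xyz:JOBNUMBER>" & _
            "<xyz:CARDINFO><subSys:RESPONSE xmlns:subSys=""subSys"">" & _
            "<responseDetail NAME=""STATUS"">FINISHED</responseDetail>" & _
            "</subSys:RESPONSE></xyz:CARDINFO></PROCESS></BODY>"

        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(theXml)

        ' Map each prefix used in the XPath queries to its namespace URI.
        Dim nsMgr As New XmlNamespaceManager(xmlDoc.NameTable)
        nsMgr.AddNamespace("xyz", "xyz")
        nsMgr.AddNamespace("subSys", "subSys")

        ' Prefixed elements need the namespace manager to resolve.
        Dim jobNumber As XmlNode = xmlDoc.SelectSingleNode("//xyz:JOBNUMBER", nsMgr)
        Console.WriteLine("Job number: " & jobNumber.InnerText)

        ' responseDetail has no prefix and no default namespace applies,
        ' so it is addressed without a prefix in the XPath.
        For Each detail As XmlNode In xmlDoc.SelectNodes("//subSys:RESPONSE/responseDetail", nsMgr)
            Console.WriteLine(detail.Attributes("NAME").Value & ": " & detail.InnerText)
        Next
    End Sub
End Module
```

The prefixes registered with AddNamespace only have to match the prefixes used in the XPath strings. What must match the document is the URI, which here is the quoted value from each xmlns declaration.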
Jeff Edmunds (Application Developer/SQL DBA, author) commented:

zc2 wrote: "I am not sure about the exact syntax of the replace() method in the CLR, but you should get the idea."

This worked:
            theXML = theXML.Replace("<PROCESS xmlns:xyz=xyz>", "<PROCESS xmlns:xyz=""xyz"">")
            theXML = theXML.Replace("<subSys:RESPONSE xmlns:subSys=subSys>", "<subSys:RESPONSE xmlns:subSys=""subSys"">")


Thanks for your help!
zc2 commented:
You are welcome!