
XML/XPath Question

Besikci asked:
Hi experts,

I have a read only XML file (converted from HTML) as follows:

<root>
  	<div class="league">
		  <h1>Premiership</h1>
	</div>
	<table class="matches">
		<tr class="match">
			<td>Man. Utd-Chelsea</td>
		</tr>
		<tr class="match">
			<td>Arsenal-Liverpool</td>
		</tr>
		<tr class="match">
			<td>Tottenham-Man.City</td>
		</tr>
	</table>
	<div class="league">
		  <h1>World Cup 2010</h1>
	</div>
	<table class="matches">
		<tr class="match">
			<td>England-USA</td>
		</tr>
	</table>
	<div class="league">
		  <h1>Swedish Superettan</h1>
	</div>
	<table class="matches">
		<tr class="match">
			<td>IFK Norrkoping-Orgryte IS</td>
		</tr>
	</table>
...
</root>


I'm trying to add leagues and matches together to a datagrid in VB.Net. I managed to add leagues only or matches only to the datagrid (by iterating through the div or table nodes). What I want is as follows:

Premiership           Man. Utd-Chelsea
Premiership           Arsenal-Liverpool
Premiership           Tottenham-Man.City
World Cup 2010        England-USA
Swedish Superettan    IFK Norrkoping-Orgryte IS

I can get the matches if I iterate using /root/table[@class='matches'], or the leagues using /root/div[@class='league'].

How do I get them together?

Many thanks in advance.

Commented:
I don't think you will get reliable results with your current data. As posted, there is nothing tying a particular match to a particular league, aside from the order in which they appear in the document. There really should be some node surrounding each respective pair of leagues and matches.

For example, if a wrapper <div></div> were added around each league/matches pair, you could use the following XPath 2.0 query to get the results shown in the attached screenshot:
for $match in /root/div/table[@class="matches"]/tr[@class="match"]/td return concat($match/../../../div[@class="league"]/h1/text(), " - ", $match/text())


untitled.JPG
Commented:
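Actually, I don't think that approach (the XPath) is going to work for you in code: the for ... return expression is XPath 2.0, and the System.Xml classes in .NET only implement XPath 1.0. I still think you should consider delimiting the sets of leagues/matches accordingly, though.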

Here is an example in code of how you could access the values of the nodes and pair them for use in your DataGrid. I am not specifically using a DataGrid, but this should give you an idea of accessing the nodes.
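Imports System.Xml

Module Module1
    Sub Main()
        Dim xdoc As New XmlDocument()

        xdoc.Load("test.xml")

        ' Each wrapper <div> under <root> holds one league heading and its matches table.
        For Each node As XmlNode In xdoc.SelectNodes("/root/div")
            Dim leagueNode As XmlNode = node.SelectSingleNode("div[@class='league']/h1")
            Dim matchNodes As XmlNodeList = node.SelectNodes("table[@class='matches']/tr[@class='match']/td")

            ' Skip any <div> that is not a league/matches wrapper.
            If leagueNode Is Nothing Then Continue For

            ' Pair the league name with each of its matches.
            For Each mNode As XmlNode In matchNodes
                Console.WriteLine("{0} - {1}", leagueNode.InnerText, mNode.InnerText)
            Next
        Next

        Console.ReadKey()
    End Sub
End Module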
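
If the file really cannot be changed, one possible alternative is to lean on document order, which is the only thing tying a match to a league in the XML as posted. The sketch below is not tested against the real file; it assumes every matches table comes somewhere after its own league div, as in the sample. From each match cell it climbs to the enclosing table and takes the nearest preceding sibling div with class "league", using only XPath 1.0 axes. The module name, the column names and the "(unknown)" fallback are made up for illustration; test.xml is the file name used in the example above.

```vb
Imports System.Data
Imports System.Xml

Module LeagueMatchPairs
    Sub Main()
        Dim xdoc As New XmlDocument()
        xdoc.Load("test.xml")

        ' Two-column table that a grid could later be bound to.
        Dim table As New DataTable()
        table.Columns.Add("League")
        table.Columns.Add("Match")

        ' Select every match cell in the flat, unmodified structure.
        Dim matchCells As XmlNodeList = xdoc.SelectNodes("/root/table[@class='matches']/tr[@class='match']/td")

        For Each cell As XmlNode In matchCells
            ' preceding-sibling is a reverse axis, so [1] picks the league
            ' div closest to (immediately before) the enclosing table.
            Dim league As XmlNode = cell.SelectSingleNode("ancestor::table[1]/preceding-sibling::div[@class='league'][1]/h1")

            ' Fallback for a table that has no league div before it.
            Dim leagueName As String = If(league Is Nothing, "(unknown)", league.InnerText.Trim())

            table.Rows.Add(leagueName, cell.InnerText.Trim())
        Next

        ' Print the pairs; in a form, the table would be bound to the grid instead.
        For Each row As DataRow In table.Rows
            Console.WriteLine("{0} - {1}", row("League"), row("Match"))
        Next
    End Sub
End Module
```

In a Windows Forms project the DataTable could then be assigned to the grid's DataSource property instead of being printed. This still depends on the order of the nodes, so a league div without a following table, or a table without a preceding league div, would pair up incorrectly or fall back to "(unknown)".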


Author

Commented:
No complete solution but it was useful enough