• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1897
  • Last Modified:

HTMLAgilitypack -> selectSingleNode producing the same text within a loop

Hi all,

I have got the below script almost working but each time it loops through and produces the text withing td class 'home' it produces the exact same text each time....

For Each div As Object In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']")' select all the divs within the code that contain *

		lblHTMLOutput.Text +=  div.("//td[@class='home']").innerText & "<br>"
        Next

Open in new window



Here is the HTML its loooking thorough

<div id="matches">

    <div style="display:block" class="matches" id="matcheshd">
            <a id="lnkCount" href="javascript:;">2 Matches</a>
            <a id="lnkHidden" href="javascript:;">Show Hidden (<span id="hRows">0</span>)</a>
            <a id="btnAudio" href="javascript:;"></a>
            <a id="btnTextSize" href="javascript:;"></a>
    </div>
            <div class="league-header">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td>
                            <img src='/content/images/flags/46.png' alt=''/>
                            <span>International - Club Friendly</span>
                        </td>
                    </tr>
                </table>
            </div>
            <div id="1084278" style="display:block" class="matches">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td class="Ma_A flag">
                            <a href="javascript:;" class="x">x</a> 
                            <span id="ln-1084278" style="position:relative;">INT CF
                                <span class="lfname">Club Friendly</span>
                            </span>
                            <input type="hidden" value="1" id="sts-1084278" />
                        </td>
                        <td class="time"> 
                                    <span id="ptime-1084278" class="timezone" style="position: relative;">
                                        32
                                    </span>
                                    <span class="kickoff">08:30</span>
                                <span class="date datetimezone">01/31/2014 08:30 GMT</span>
                                <span class="gmtdatetime" style="display:none">01/31/2014 08:30</span>
                        </td>
                        <td class="home">
                             <span id="hg-1084278">
                            </span>
                            <span id="hn-1084278">SWQ Thunder</span>
                            <span class="card" id="hc-1084278">0</span>
                        </td>
                        <td class="score">
                        
                                <a href="javascript:;" class="scorelink score" matchid="1084278" leaguetitle="Club Friendly" date="1/31/2014 8:30:00 AM">
                                <span id="hs-1084278">0</span> - 
                                <span id="as-1084278">2</span><br />
                                </a>
                        </td>
                        <td class="away">
                            <span id="an-1084278">Brisbane Wolves</span>
                            <span id="ag-1084278">
                            </span>
                            <span class="card" id="ac-1084278">0</span>
                        </td>
                        <td id="live-1084278" class="liveicon"></td>
                        <td class="icons">
                            <a href="/Video/1084278" class="video" title="highlights" style="visibility:hidden"></a>
                            <a id="mm-1084278" class="gccmymatch " href="javascript:;"></a>
                            <a id="setting-1084278" class="sicon" href="javascript:;" title="Pick your own sound"></a>
                        </td>
                    </tr>
                </table>
            </div>
            <div class="league-header">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td>
                            <img src='/content/images/flags/1.png' alt=''/>
                            <span>Australia - League A</span>
                        </td>
                    </tr>
                </table>
            </div>
            <div id="1018822" style="display:block" class="matches">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td class="Ma_A flag">
                            <a href="javascript:;" class="x">x</a> 
                            <span id="ln-1018822" style="position:relative;">AUS D1
                                <span class="lfname">League A</span>
                            </span>
                            <input type="hidden" value="1" id="sts-1018822" />
                        </td>
                        <td class="time"> 
                                    <span id="ptime-1018822" class="timezone" style="position: relative;">
                                        32
                                    </span>
                                    <span class="kickoff">08:30</span>
                                <span class="date datetimezone">01/31/2014 08:30 GMT</span>
                                <span class="gmtdatetime" style="display:none">01/31/2014 08:30</span>
                        </td>
                        <td class="home">
                             <span id="hg-1018822">
                            </span>
                            <span id="hn-1018822">Melbourne Heart FC</span>
                            <span class="card" id="hc-1018822">0</span>
                        </td>
                        <td class="score">
                        
                                <a href="javascript:;" class="scorelink score" matchid="1018822" leaguetitle="Australia League A" date="1/31/2014 8:30:00 AM">
                                <span id="hs-1018822">0</span> - 
                                <span id="as-1018822">0</span><br />
                                </a>
                        </td>
                        <td class="away">
                            <span id="an-1018822">Sydney FC</span>
                            <span id="ag-1018822">
                            </span>
                            <span class="card" id="ac-1018822">0</span>
                        </td>
                        <td id="live-1018822" class="liveicon"></td>
                        <td class="icons">
                            <a href="/Video/1018822" class="video" title="highlights" style="visibility:hidden"></a>
                            <a id="mm-1018822" class="gccmymatch " href="javascript:;"></a>
                            <a id="setting-1018822" class="sicon" href="javascript:;" title="Pick your own sound"></a>
                        </td>
                    </tr>
                </table>
            </div>

Open in new window



It loops through 3 times and each time displays

SWQ Thunder 0
SWQ Thunder 0
SWQ Thunder 0

when it should show

SWQ Thunder 0
Melbourne Heart FC 0
0
runnerjp2005
Asked:
runnerjp2005
  • 2
1 Solution
 
Ioannis ParaskevopoulosCommented:
Your code seems to be failing to compile.

In my example i have this:

For Each div as Object in htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']//td[@class='home']")
	console.writeline(div.InnerText.Trim)
Next

Open in new window


This only gets the required results.

Giannis
0
 
runnerjp2005Author Commented:
What if i want to display several things within the div -

 <td class="time">
 <td class="home">
 <td class="away">
<td class="score">
0
 
Carl TawnSystems and Integration DeveloperCommented:
Why not just make your initial XPath more explicit:
For Each div As Object In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']//td[@class='home']")
    lblHTMLOutput.Text &= div.innerText & "<br>"
Next

Open in new window

0
 
Ioannis ParaskevopoulosCommented:
May i suggest that you use some LINQ:

	Dim htmlDoc = new HtmlDocument()
	htmlDoc.Load(sr)
	Dim attributes = New List(Of String)
	attributes.Add("time")
	attributes.Add("home")
	attributes.Add("away")
	attributes.Add("score")
	
	For Each div  in htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']")
	   	Dim result = div.Descendants _
			.Where(Function(x) attributes.Contains(x.Attributes.Where (Function(a) a.Name = "class").Select(Function(a) a.Value).SingleOrDefault)) _
			.Select (Function(x) New With {.ClassName = x.Attributes.Where (Function(a) a.Name = "class").Select(Function(a) a.Value).SingleOrDefault,.Text = x.InnerText})
	   	For Each element as Object in result
			Console.WriteLine(element)
	   	Next
	Next

Open in new window


In the example  have a list of strings named attributes. In there a load all the classes i am interested to display.
Then using LINQ i get the results i need.


Giannis
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now