Solved

HTMLAgilitypack -> selectSingleNode producing the same text within a loop

Posted on 2014-01-31
Last Modified: 2014-01-31
Hi all,

I have got the script below almost working, but each time it loops through and outputs the text within the td with class 'home', it produces exactly the same text.

For Each div As Object In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']") ' select all the divs within the code that contain *
    lblHTMLOutput.Text += div.("//td[@class='home']").innerText & "<br>"
Next


Here is the HTML it is looking through:

<div id="matches">

    <div style="display:block" class="matches" id="matcheshd">
            <a id="lnkCount" href="javascript:;">2 Matches</a>
            <a id="lnkHidden" href="javascript:;">Show Hidden (<span id="hRows">0</span>)</a>
            <a id="btnAudio" href="javascript:;"></a>
            <a id="btnTextSize" href="javascript:;"></a>
    </div>
            <div class="league-header">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td>
                            <img src='/content/images/flags/46.png' alt=''/>
                            <span>International - Club Friendly</span>
                        </td>
                    </tr>
                </table>
            </div>
            <div id="1084278" style="display:block" class="matches">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td class="Ma_A flag">
                            <a href="javascript:;" class="x">x</a> 
                            <span id="ln-1084278" style="position:relative;">INT CF
                                <span class="lfname">Club Friendly</span>
                            </span>
                            <input type="hidden" value="1" id="sts-1084278" />
                        </td>
                        <td class="time"> 
                                    <span id="ptime-1084278" class="timezone" style="position: relative;">
                                        32
                                    </span>
                                    <span class="kickoff">08:30</span>
                                <span class="date datetimezone">01/31/2014 08:30 GMT</span>
                                <span class="gmtdatetime" style="display:none">01/31/2014 08:30</span>
                        </td>
                        <td class="home">
                             <span id="hg-1084278">
                            </span>
                            <span id="hn-1084278">SWQ Thunder</span>
                            <span class="card" id="hc-1084278">0</span>
                        </td>
                        <td class="score">
                        
                                <a href="javascript:;" class="scorelink score" matchid="1084278" leaguetitle="Club Friendly" date="1/31/2014 8:30:00 AM">
                                <span id="hs-1084278">0</span> - 
                                <span id="as-1084278">2</span><br />
                                </a>
                        </td>
                        <td class="away">
                            <span id="an-1084278">Brisbane Wolves</span>
                            <span id="ag-1084278">
                            </span>
                            <span class="card" id="ac-1084278">0</span>
                        </td>
                        <td id="live-1084278" class="liveicon"></td>
                        <td class="icons">
                            <a href="/Video/1084278" class="video" title="highlights" style="visibility:hidden"></a>
                            <a id="mm-1084278" class="gccmymatch " href="javascript:;"></a>
                            <a id="setting-1084278" class="sicon" href="javascript:;" title="Pick your own sound"></a>
                        </td>
                    </tr>
                </table>
            </div>
            <div class="league-header">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td>
                            <img src='/content/images/flags/1.png' alt=''/>
                            <span>Australia - League A</span>
                        </td>
                    </tr>
                </table>
            </div>
            <div id="1018822" style="display:block" class="matches">
                <table cellpadding="0" cellspacing="0">
                    <tr>
                        <td class="Ma_A flag">
                            <a href="javascript:;" class="x">x</a> 
                            <span id="ln-1018822" style="position:relative;">AUS D1
                                <span class="lfname">League A</span>
                            </span>
                            <input type="hidden" value="1" id="sts-1018822" />
                        </td>
                        <td class="time"> 
                                    <span id="ptime-1018822" class="timezone" style="position: relative;">
                                        32
                                    </span>
                                    <span class="kickoff">08:30</span>
                                <span class="date datetimezone">01/31/2014 08:30 GMT</span>
                                <span class="gmtdatetime" style="display:none">01/31/2014 08:30</span>
                        </td>
                        <td class="home">
                             <span id="hg-1018822">
                            </span>
                            <span id="hn-1018822">Melbourne Heart FC</span>
                            <span class="card" id="hc-1018822">0</span>
                        </td>
                        <td class="score">
                        
                                <a href="javascript:;" class="scorelink score" matchid="1018822" leaguetitle="Australia League A" date="1/31/2014 8:30:00 AM">
                                <span id="hs-1018822">0</span> - 
                                <span id="as-1018822">0</span><br />
                                </a>
                        </td>
                        <td class="away">
                            <span id="an-1018822">Sydney FC</span>
                            <span id="ag-1018822">
                            </span>
                            <span class="card" id="ac-1018822">0</span>
                        </td>
                        <td id="live-1018822" class="liveicon"></td>
                        <td class="icons">
                            <a href="/Video/1018822" class="video" title="highlights" style="visibility:hidden"></a>
                            <a id="mm-1018822" class="gccmymatch " href="javascript:;"></a>
                            <a id="setting-1018822" class="sicon" href="javascript:;" title="Pick your own sound"></a>
                        </td>
                    </tr>
                </table>
            </div>


It loops through 3 times and each time displays

SWQ Thunder 0
SWQ Thunder 0
SWQ Thunder 0

when it should show

SWQ Thunder 0
Melbourne Heart FC 0
0

Question by: runnerjp2005

4 Comments

Expert Comment

by:Ioannis Paraskevopoulos
ID: 39823459
Your code seems to be failing to compile.

In my example I have this:

For Each td As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']//td[@class='home']")
	' Each node is a td with class 'home' nested inside a div with class 'matches'
	Console.WriteLine(td.InnerText.Trim())
Next

This returns only the required results.

Giannis

Author Comment

by:runnerjp2005
ID: 39823473
What if I want to display several things within the div?

<td class="time">
<td class="home">
<td class="away">
<td class="score">

Expert Comment

by:Carl Tawn
ID: 39823586
Why not just make your initial XPath more explicit:
For Each div As Object In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']//td[@class='home']")
    lblHTMLOutput.Text &= div.innerText & "<br>"
Next
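
For anyone who wants to keep the original per-div loop: in HtmlAgilityPack, an XPath that starts with "//" is evaluated from the document root even when it is called on a child node, which is why every pass returned the first 'home' cell on the page. Prefixing the expression with "." scopes it to the current node. Below is a minimal sketch of that fix, not the asker's actual code. The HTML string is a hypothetical, cut-down sample of the page, and the module name and StringBuilder (standing in for lblHTMLOutput) are assumptions for illustration.

```vbnet
Imports System
Imports System.Text
Imports HtmlAgilityPack

Module RelativeXPathSketch
    Sub Main()
        ' Hypothetical, cut-down sample of the page from the question
        Dim html As String =
            "<div class='matches' id='matcheshd'><a>2 Matches</a></div>" &
            "<div class='matches'><table><tr><td class='home'><span>SWQ Thunder</span></td></tr></table></div>" &
            "<div class='matches'><table><tr><td class='home'><span>Melbourne Heart FC</span></td></tr></table></div>"

        Dim htmlDoc As New HtmlDocument()
        htmlDoc.LoadHtml(html)

        ' Stand-in for lblHTMLOutput.Text in the question
        Dim output As New StringBuilder()

        For Each div As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']")
            ' The leading "." makes the search relative to the current div
            Dim home As HtmlNode = div.SelectSingleNode(".//td[@class='home']")

            ' The header div has no 'home' cell, so SelectSingleNode returns Nothing for it
            If home IsNot Nothing Then
                output.Append(home.InnerText.Trim()).Append("<br>")
            End If
        Next

        Console.WriteLine(output.ToString())
    End Sub
End Module
```

The Nothing check matters here because the first div with class 'matches' on the page is the header bar (id "matcheshd"), which contains no td at all.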


Accepted Solution

by: Ioannis Paraskevopoulos (earned 500 total points)
ID: 39823588
May I suggest that you use some LINQ:

	' Assumes Imports HtmlAgilityPack, System.Linq and System.Collections.Generic;
	' sr is taken to be a TextReader (for example a StreamReader) over the page source.
	Dim htmlDoc = New HtmlDocument()
	htmlDoc.Load(sr)
	' Class names of the cells to display
	Dim attributes = New List(Of String) From {"time", "home", "away", "score"}

	For Each div In htmlDoc.DocumentNode.SelectNodes("//div[@class='matches']")
		' Descendants() is enumerated relative to the current div, so each pass sees only its own cells
		Dim result = div.Descendants() _
			.Where(Function(x) attributes.Contains(x.GetAttributeValue("class", ""))) _
			.Select(Function(x) New With {.ClassName = x.GetAttributeValue("class", ""), .Text = x.InnerText})
		For Each element In result
			Console.WriteLine(element)
		Next
	Next

In the example I have a list of strings named attributes, into which I load all the classes I am interested in displaying.
Then, using LINQ, I get the results I need.


Giannis
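
As an alternative to the LINQ version, the same per-match fields can probably be pulled with relative XPath alone. This is a rough sketch under stated assumptions, not a tested drop-in: the file name matches.html, the module name, and the CellText helper are all hypothetical, and the whitespace collapsing is only there because the 'time' cell in the posted HTML spans several lines.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports HtmlAgilityPack

Module MatchRowsSketch
    ' Hypothetical helper: text of the first td with the given class inside node, or "" if absent
    Function CellText(node As HtmlNode, className As String) As String
        ' The leading "." keeps the lookup inside the node passed in
        Dim cell As HtmlNode = node.SelectSingleNode(".//td[@class='" & className & "']")
        If cell Is Nothing Then Return ""
        ' Collapse runs of whitespace and newlines into single spaces
        Return Regex.Replace(cell.InnerText, "\s+", " ").Trim()
    End Function

    Sub Main()
        Dim htmlDoc As New HtmlDocument()
        htmlDoc.Load("matches.html") ' hypothetical local copy of the page

        ' The extra predicate keeps only divs that contain a 'home' cell, which skips the header div
        Dim rows = htmlDoc.DocumentNode.SelectNodes("//div[@class='matches'][.//td[@class='home']]")
        If rows Is Nothing Then Return ' SelectNodes returns Nothing when nothing matches

        For Each div As HtmlNode In rows
            Console.WriteLine(String.Join(" | ",
                CellText(div, "time"), CellText(div, "home"),
                CellText(div, "score"), CellText(div, "away")))
        Next
    End Sub
End Module
```

Compared with the LINQ approach, this keeps the fields in a fixed order per match, at the cost of one XPath lookup per field.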