Solved

Help with XPath select

Posted on 2013-01-18
15
666 Views
Last Modified: 2013-03-06
Hello EE,

In my VB.NET code, I do this to retrieve nodes:

Dim nodesTC As HtmlNodeCollection = docTC.DocumentNode.SelectNodes("//td[@class='myproduct']")


Now, let's say there are two <td> elements with class 'myproduct'.

Both <td> elements contain the same child nodes, because the markup is repeated for every product. So far, so good.

If I do:

For Each nodeTC As HtmlNode In nodesTC
    MessageBox.Show(nodeTC.SelectSingleNode("//span[@itemprop='price']").InnerText.Trim())
Next



Because the same nodes appear twice in the HtmlDocument (the node names are repeated for each of the two products), I always get the same price twice; I'm pretty sure my code hits the same node both times. I need a way to remove or skip the node each time I move to the next product, or to build a List(Of Integer) and go to item.Index + 1, if you see what I mean.

Can you help me? I hope my question is clear.


Thanks
Question by:PhilippeRenaud
15 Comments
 
LVL 18

Expert Comment

by:zc2
ID: 38794842
"//span[@itemprop='price']" always selects the same node, because it always searches from the document's root. Remove the '//' to make the search start from the "nodeTC" element:
"span[@itemprop='price']"
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38794848
Can you provide a sample of the XML (doesn't need to be production data--just the structure is relevant)?
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38794851
@zc2
> because it always searches from the document's root.
That is not accurate. "//" searches relative to the node it is executed against. This may or may not be the document root.
0
 
LVL 1

Author Comment

by:PhilippeRenaud
ID: 38794864
OK, let me indent the HTML properly; it's all messed up. BRB.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38794884
If it's well-formed (as in XML well-formed), then you can select File->New, and then select "XML File". Paste the HTML into the new file, then press Ctrl+E, D to format the document.
0
 
LVL 18

Expert Comment

by:zc2
ID: 38794914
> That is not accurate. "//" searches relative to node it is executed against. This may or may not be the document root.
I'm sorry, but this is not true. "//" always searches from the root.
".//" searches from the current node.
0
 
LVL 1

Author Comment

by:PhilippeRenaud
ID: 38794939
Have a look. Let's say I want the value of offerCount.

There are two values: 3416 and 33.
Attachment: XMLFile1.xml
0
 
LVL 18

Accepted Solution

by:
zc2 earned 500 total points
ID: 38794966
The code below enumerates every <td class='eventTickets'> element and, for each one, gets the value of its <span itemprop="offerCount">:
Dim nodesTC As HtmlNodeCollection = docTC.DocumentNode.SelectNodes("//td[@class='eventTickets']")

For Each nodeTC As HtmlNode In nodesTC
    ' No leading "//": the path is evaluated relative to nodeTC.
    MessageBox.Show(nodeTC.SelectSingleNode("span[@itemprop='offerCount']").InnerText.Trim())
Next
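
Note that the relative path "span[...]" only matches a <span> that is a direct child of the <td>. If the <span> is nested deeper, SelectSingleNode returns Nothing and the call to InnerText throws. A slightly more defensive sketch of the same loop follows, assuming Html Agility Pack is referenced. The inline markup is hypothetical (the attached file is not reproduced here), the module name is made up, and Console.WriteLine stands in for MessageBox.Show so the sample can run as a console program:

```vbnet
Imports System
Imports HtmlAgilityPack

Module OfferCountDemo
    Sub Main()
        Dim docTC As New HtmlDocument()
        ' Hypothetical markup standing in for the attached file.
        docTC.LoadHtml(
            "<table><tr>" &
            "<td class='eventTickets'><div><span itemprop='offerCount'>3416</span></div></td>" &
            "<td class='eventTickets'><span itemprop='offerCount'>33</span></td>" &
            "</tr></table>")

        Dim nodesTC As HtmlNodeCollection =
            docTC.DocumentNode.SelectNodes("//td[@class='eventTickets']")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If nodesTC Is Nothing Then Return

        For Each nodeTC As HtmlNode In nodesTC
            ' ".//" keeps the search inside the current <td>, at any depth.
            Dim countNode As HtmlNode =
                nodeTC.SelectSingleNode(".//span[@itemprop='offerCount']")
            If countNode IsNot Nothing Then
                Console.WriteLine(countNode.InnerText.Trim())
            End If
        Next
    End Sub
End Module
```

Using ".//" rather than a bare element name is a design choice: it tolerates wrapper elements between the <td> and the <span>, at the cost of also matching spans nested inside any inner tables.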


0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38795037
@zc2
> I'm sorry, but this is not true. "//" always searches from the root.
If you run "//" against a child node, then it certainly does function the way I mentioned. See for yourself:

Xml
<root>
  <child1>
    <data>This is what I'm after</data>
  </child1>
  <data>This is not what I'm after</data>
</root>



Code
Imports System.Xml

Module Module1

    Sub Main()
        Dim xdoc As New XmlDocument

        xdoc.Load("xmlfile1.xml")

        For Each node As XmlNode In xdoc.SelectNodes("//child1")
            Dim targetChild As XmlNode = node.SelectSingleNode("//data") 

            ' By your logic, both "This is what I'm after" AND 
            ' "This is not what I'm after" should display. Only the first displays.
            Console.WriteLine(targetChild.InnerText)
        Next
    End Sub

End Module



You seem to be going by the W3C spec. So to fit that, you can think of the currently selected node as the document root.

Note: When I say "current" node, I am not talking about the node so far as XPath is concerned; rather, I am talking about the HtmlNode/XmlNode in VB that you have selected. I can see that wasn't quite clear in my previous comment. As the original code was written, the "//" would not search from the actual document's root. So far as the spec is concerned, yes, "//" searches from the document root. When it comes to using it in .NET, however, each time you select something into an XmlNode object, that node object effectively becomes its own document.
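
The sample above may not be able to tell the two readings apart: SelectSingleNode returns the first match in document order, and the nested <data> comes first in that file whether the search starts at <child1> or at the root. One way to check is to move the root-level <data> ahead of <child1>. A minimal sketch, assuming System.Xml and a reordered copy of the same sample (the module name and the inline XML are made up for illustration):

```vbnet
Imports System
Imports System.Xml

Module XPathScopeDemo
    Sub Main()
        Dim xdoc As New XmlDocument()
        ' Same elements as the sample, but the root-level <data>
        ' now comes first in document order.
        xdoc.LoadXml(
            "<root>" &
            "<data>This is not what I'm after</data>" &
            "<child1><data>This is what I'm after</data></child1>" &
            "</root>")

        Dim child As XmlNode = xdoc.SelectSingleNode("//child1")

        ' Absolute path, called on the child node.
        Console.WriteLine(child.SelectSingleNode("//data").InnerText)

        ' Relative path: descendants of the child node only.
        Console.WriteLine(child.SelectSingleNode(".//data").InnerText)

        ' Relative path: direct children of the child node only.
        Console.WriteLine(child.SelectSingleNode("data").InnerText)
    End Sub
End Module
```

To my knowledge, System.Xml evaluates the absolute "//data" from the document root even when it is called on the child node, so the first line should print "This is not what I'm after" and the other two "This is what I'm after". If so, ".//" or a bare relative path is the safe way to stay inside the selected node, which matches the accepted solution.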
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38795043
@PhilippeRenaud

Are you using Html Agility Pack?
0
 
LVL 1

Author Comment

by:PhilippeRenaud
ID: 38795067
Yes, kaufmed, I use the pack.
0
 
LVL 18

Expert Comment

by:zc2
ID: 38795074
> You seem to be going by the W3C spec. So to fit that, you can think of the currently selected node as the document root.
OK, you're probably right. I don't work with .NET, and when I work with MSXML from VBScript, I always execute .setProperty "SelectionLanguage", "XPath" before I do any selection.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38795100
OK, are you able to correlate zc2's example to fit your scenario? It should be along the lines of what you are trying to achieve.
0
 
LVL 1

Author Closing Comment

by:PhilippeRenaud
ID: 38795422
thanks
0
 
LVL 1

Author Comment

by:PhilippeRenaud
ID: 38958930
0
