Philippe Renaud
asked on
Help with XPath select
Hello EE,
In my VB.NET code, I do this to retrieve nodes:
Dim nodesTC As HtmlNodeCollection = docTC.DocumentNode.SelectNodes("//td[@class='myproduct']")
Now let's say there are 2 <td> elements with class 'myProduct'.
Both <td> elements contain the same child nodes, because those nodes are repeated for every product. All right so far...
If I do:
For Each nodeTC As HtmlNode In nodesTC
    MessageBox.Show(nodeTC.SelectSingleNode("//span[@itemprop='price']").InnerText.Trim())
Next
then, because the same node names appear twice in the HtmlDocument (once per product), I always get the same price duplicated. I'm pretty sure my code lands on the same node twice. I need a way to remove or skip a node each time I move to the next product, or to create a List(Of Integer) and go to item.index + 1, you know?
Can you help me? I hope my question is clear.
Thanks
Can you provide a sample of the XML (doesn't need to be production data--just the structure is relevant)?
@zc2
> because it always searches from the document's root.
That is not accurate. "//" searches relative to the node it is executed against. This may or may not be the document root.
ASKER
OK, let me indent the HTML correctly; it's all messed up. brb
If it's well-formed (as in XML-well-formed), then you can select File->New, and then select "XML File". Paste the HTML into the new file, then do a Ctrl+E+D combination to format the document.
> That is not accurate. "//" searches relative to node it is executed against. This may or may not be the document root.
I'm sorry, but this is not true. "//" always searches from the root.
".//" searches from the current node.
@zc2
> I'm sorry, but this is not true. "//" always searches from the root.
If you run "//" against a child node, then it certainly does function the way I mentioned. See for yourself:
Xml
<root>
    <child1>
        <data>This is what I'm after</data>
    </child1>
    <data>This is not what I'm after</data>
</root>
Code
Imports System.Xml

Module Module1
    Sub Main()
        Dim xdoc As New XmlDocument
        xdoc.Load("xmlfile1.xml")
        For Each node As XmlNode In xdoc.SelectNodes("//child1")
            Dim targetChild As XmlNode = node.SelectSingleNode("//data")
            ' By your logic, both "This is what I'm after" AND
            ' "This is not what I'm after" should display. Only the first displays.
            Console.WriteLine(targetChild.InnerText)
        Next
    End Sub
End Module
You seem to be going by the W3C spec. So to fit that, you can think of the currently selected node as the document root.
**Note: When I say "current" node, I am not talking about the node so far as XPath is concerned; rather I am talking about the HtmlNode/XmlNode in VB that you have selected. I can see that wasn't quite clear in my previous comment. As the original code was written, the "//" would not search from the actual document's root. So far as the spec is concerned, yes "//" searches from the document root. When it comes to using it in .NET, however, each time you select something into an XmlNode object, that node object effectively becomes its own document.
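One way to check the scoping question directly is to count matches with SelectNodes rather than SelectSingleNode, since SelectSingleNode only ever returns the first match in document order and so cannot tell the two readings apart. The following is a minimal sketch under that assumption; it inlines the sample XML instead of reading xmlfile1.xml, and the ScopeCheck module and CountFromChild1 function are hypothetical names used only for illustration.

```vbnet
Imports System
Imports System.Xml

Module ScopeCheck
    ' Same structure as the sample XML above, inlined instead of read from xmlfile1.xml.
    Private Const SampleXml As String =
        "<root>" &
        "<child1><data>This is what I'm after</data></child1>" &
        "<data>This is not what I'm after</data>" &
        "</root>"

    ' Hypothetical helper: evaluates xpath with <child1> as the context node
    ' and returns how many nodes matched.
    Function CountFromChild1(xpath As String) As Integer
        Dim xdoc As New XmlDocument()
        xdoc.LoadXml(SampleXml)
        Dim child As XmlNode = xdoc.SelectSingleNode("/root/child1")
        Return child.SelectNodes(xpath).Count
    End Function

    Sub Main()
        ' Compare the absolute and the relative form from the same context node.
        Console.WriteLine("//data  matched {0} node(s)", CountFromChild1("//data"))
        Console.WriteLine(".//data matched {0} node(s)", CountFromChild1(".//data"))
    End Sub
End Module
```

Going by XPath 1.0, where a leading "//" starts at the root of the document that contains the context node, the first count should cover both data elements and the second only the one under child1. If the two counts differ when you run it, the leading dot is what scopes the search.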
@PhilippeRenaud
Are you using Html Agility Pack?
ASKER
Yes kaufmed, I use the pack.
> You seem to be going by the W3C spec. So to fit that, you can think of the currently selected node as the document root.
OK, you're probably right. I don't work with .NET, and when I work with MSXML from VBScript, I always execute .setProperty "SelectionLanguage", "XPath" before I do any selection.
OK, are you able to correlate zc2's example to fit your scenario? It should be along the lines of what you are trying to achieve.
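One way zc2's ".//" suggestion could map onto the loop from the question is sketched below. It assumes Html Agility Pack is referenced; the sample markup, the PriceDemo module, and the GetPrices function are made up for illustration and are not from the thread.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module PriceDemo
    ' Hypothetical markup: two product cells, each with its own price span.
    Private Const SampleHtml As String =
        "<table><tr>" &
        "<td class='myProduct'><span itemprop='price'> 10.00 </span></td>" &
        "<td class='myProduct'><span itemprop='price'> 25.50 </span></td>" &
        "</tr></table>"

    ' Returns one trimmed price string per product cell found in the HTML.
    Function GetPrices(html As String) As List(Of String)
        Dim prices As New List(Of String)
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim cells As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//td[@class='myProduct']")
        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If cells Is Nothing Then Return prices

        For Each cell As HtmlNode In cells
            ' The leading dot makes the search start at this cell
            ' instead of at the top of the document.
            Dim price As HtmlNode = cell.SelectSingleNode(".//span[@itemprop='price']")
            If price IsNot Nothing Then prices.Add(price.InnerText.Trim())
        Next
        Return prices
    End Function

    Sub Main()
        For Each p As String In GetPrices(SampleHtml)
            Console.WriteLine(p)
        Next
    End Sub
End Module
```

Without the leading dot, SelectSingleNode would be expected to return the first matching span in the whole document on every pass, which matches the duplicated price described in the question. The Nothing checks are there because both SelectNodes and SelectSingleNode return Nothing when there is no match.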
ASKER
thanks
ASKER
Can you guys help me with this one?
https://www.experts-exchange.com/questions/28054658/Help-with-DataGridView-data.html
"span[@itemprop='price']"