C#: HtmlAgilityPack getting elements using XPath

trevor1940
I'm using HtmlAgilityPack to traverse some HTML (VidsTest.html).

In the code below I cannot get to the second video block, the one with <div class='post_title'>Title 2 The Wombles.
Even though the id "user_post_2712102" is picked up correctly, Title and the other variables stay at the values from the first block (Title 1 Doctor Who).

    HtmlAgilityPack.HtmlDocument HTMLdoc = new HtmlAgilityPack.HtmlDocument();
    HTMLdoc.Load(@"E:\path\to\VidsTest.html");
    var user_postDiv = HTMLdoc.DocumentNode.SelectNodes("//div[contains(@class,'user_post_')]");
    string id = "";
    string Title = "";
    string VidSrc = "";
    string Poster = "";
    string postDate = "";
    foreach (var divNodes in user_postDiv)
    {
        id = divNodes.GetAttributeValue("item_id", "");
        if (id != "")
        {
            Console.WriteLine("id = user_post_" + id);

            //    postDate = divNodes.SelectSingleNode("//span[@class='local - time']").InnerHtml;
            // can't find postDate
            Title = divNodes.SelectSingleNode("//div[@class='post_title']").InnerText;

            VidSrc = divNodes.SelectSingleNode("//video/source").Attributes["src"].Value;
            Poster = divNodes.SelectSingleNode("//video").Attributes["poster"].Value;
            // Pass the values as format arguments so {0}, {1}, {2} are actually substituted
            Console.WriteLine("Title end div in divs {0} , {1} , {2}", Title, Poster, VidSrc);
        }
    }
    Console.WriteLine("I'm Done");
    Console.ReadLine();

Commented:
It's probably because when you start the XPath with //, you're searching the entire document, so you just find the same (first) node over and over again. You're not actually traversing the DOM or searching inside just that element.

Use .// instead.
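
To make the difference concrete, here is a minimal sketch that runs the same lookup with an absolute ("//") and a relative (".//") XPath. The markup in SampleHtml is a made-up stand-in for VidsTest.html (the real file is not shown in the question), and the class name RelativeXPathDemo and the method Titles are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Assumed markup: two post blocks, each with its own title.
    // This is NOT the asker's real VidsTest.html.
    const string SampleHtml = @"
<div class='user_post_1' item_id='1'><div class='post_title'>Title 1 Doctor Who</div></div>
<div class='user_post_2' item_id='2'><div class='post_title'>Title 2 The Wombles</div></div>";

    // Runs the given title XPath once per post block and collects the results.
    public static List<string> Titles(string titleXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        var titles = new List<string>();
        foreach (var post in doc.DocumentNode.SelectNodes("//div[contains(@class,'user_post_')]"))
        {
            // The XPath is evaluated with 'post' as the context node.
            // A leading "//" ignores that context and starts at the document root.
            titles.Add(post.SelectSingleNode(titleXPath).InnerText);
        }
        return titles;
    }

    public static void Main()
    {
        // Absolute: each iteration resolves to the first title in the document.
        Console.WriteLine(string.Join(" | ", Titles("//div[@class='post_title']")));

        // Relative: each iteration resolves to the title inside the current post.
        Console.WriteLine(string.Join(" | ", Titles(".//div[@class='post_title']")));
    }
}
```

With the absolute path both entries should come back as the first title, which matches the symptom described in the question; with the relative path each post yields its own title.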

Author

Commented:
Hi,
I didn't know that "//" searches the whole document while ".//" searches from the current node.

FYI:
I had to guard each lookup with an if block, otherwise I got unhandled exceptions.

 
    if (divNodes.SelectSingleNode(".//div[@class='post_title']") != null)
    {
        Title = divNodes.SelectSingleNode(".//div[@class='post_title']").InnerText;
    }
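
The exceptions come from SelectSingleNode returning null when nothing matches, so reading .InnerText or .Attributes on the result throws. One possible way to avoid repeating the if block (and the duplicate query) for every field is a small helper built on the null-conditional operator. The names SafeXPath, InnerTextOrDefault and AttributeOrDefault below are hypothetical, not part of HtmlAgilityPack, and the sample markup is assumed.

```csharp
using System;
using HtmlAgilityPack;

public static class SafeXPath
{
    // Hypothetical helper: inner text of the first match, or 'fallback' if nothing matches.
    public static string InnerTextOrDefault(HtmlNode scope, string xpath, string fallback = "")
    {
        return scope.SelectSingleNode(xpath)?.InnerText ?? fallback;
    }

    // Hypothetical helper: attribute value of the first match, or 'fallback' if the
    // node or the attribute is missing.
    public static string AttributeOrDefault(HtmlNode scope, string xpath, string attribute, string fallback = "")
    {
        return scope.SelectSingleNode(xpath)?.GetAttributeValue(attribute, fallback) ?? fallback;
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Assumed markup: a post with a video poster but no post_title div.
        doc.LoadHtml("<div class='user_post_1'><video poster='p.jpg'></video></div>");
        HtmlNode post = doc.DocumentNode.SelectSingleNode("//div");

        // No title in this block, so the fallback is used instead of throwing.
        Console.WriteLine(InnerTextOrDefault(post, ".//div[@class='post_title']", "(no title)"));

        // The video element exists, so its poster attribute is returned.
        Console.WriteLine(AttributeOrDefault(post, ".//video", "poster"));
    }
}
```

In the loop from the question this would read, for example, Title = SafeXPath.InnerTextOrDefault(divNodes, ".//div[@class='post_title']"); each query then runs once rather than twice.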
