trevor1940


C#: HtmlAgilityPack getting elements using Xpath

I'm using HtmlAgilityPack to traverse some HTML (VidsTest.html).

In the code below I cannot get to the second video block, the one with <div class='post_title'>Title 2 The Wombles.
Even though the id "user_post_2712102" is being set, Title and the other variables remain at the values from Title 1 Doctor Who.

    HtmlAgilityPack.HtmlDocument HTMLdoc = new HtmlAgilityPack.HtmlDocument();
    HTMLdoc.Load(@"E:\path\to\VidsTest.html");
    var user_postDiv = HTMLdoc.DocumentNode.SelectNodes("//div[contains(@class,'user_post_')]");
    string id = "";
    string Title = "";
    string VidSrc = "";
    string Poster = "";
    string postDate = "";
    foreach (var divNodes in user_postDiv)
    {
        id = divNodes.GetAttributeValue("item_id", "").ToString();
        if (id != "")
        {
            Console.WriteLine("id = user_post_" + id);

            //    postDate = divNodes.SelectSingleNode("//span[@class='local - time']").InnerHtml;
            // can't find postDate
            Title = divNodes.SelectSingleNode("//div[@class='post_title']").InnerText;

            VidSrc = divNodes.SelectSingleNode("//video/source").Attributes["src"].Value;
            Poster = divNodes.SelectSingleNode("//video").Attributes["poster"].Value;
            Console.WriteLine("Title end div in divs {0} , {1} , {2} " + Title + Poster + VidSrc);
        }
    }
    Console.WriteLine("I'm Done");
    Console.ReadLine();


ASKER CERTIFIED SOLUTION
gr8gonzo
trevor1940

ASKER

Hi,
I didn't know that "//" searches the whole document while ".//" searches from the current node.

FYI,
I had to guard each lookup with an if block, otherwise I got unhandled exceptions whenever SelectSingleNode returned null.

 
    if (divNodes.SelectSingleNode(".//div[@class='post_title']") != null)
    {
        Title = divNodes.SelectSingleNode(".//div[@class='post_title']").InnerText;
    }
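
For anyone landing here later, the whole loop can be sketched with relative paths and null checks in one place. This is only an illustration, not the accepted answer: VidsTest.html is not shown in the thread, so the markup in SampleHtml is made up, and the names RelativeXPathDemo and ReadPosts are hypothetical. The null-conditional operator (?.) stands in for the if block above, since SelectSingleNode returns null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Assumed markup standing in for VidsTest.html (not shown in the thread).
    public const string SampleHtml = @"
<div class='user_post_1' item_id='1'>
  <div class='post_title'>Title 1 Doctor Who</div>
  <video poster='p1.jpg'><source src='v1.mp4' /></video>
</div>
<div class='user_post_2712102' item_id='2712102'>
  <div class='post_title'>Title 2 The Wombles</div>
  <video poster='p2.jpg'><source src='v2.mp4' /></video>
</div>";

    public static List<(string Id, string Title, string Poster, string Src)> ReadPosts(string html)
    {
        var results = new List<(string Id, string Title, string Poster, string Src)>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//" is fine here: we do want every matching div in the document.
        var posts = doc.DocumentNode.SelectNodes("//div[contains(@class,'user_post_')]");
        if (posts == null) return results; // SelectNodes returns null when nothing matches

        foreach (var post in posts)
        {
            string id = post.GetAttributeValue("item_id", "");
            if (id == "") continue;

            // ".//" keeps each search inside the current post node.
            // "?." avoids a NullReferenceException when a node is missing.
            string title = post.SelectSingleNode(".//div[@class='post_title']")?.InnerText.Trim() ?? "";
            string poster = post.SelectSingleNode(".//video")?.GetAttributeValue("poster", "") ?? "";
            string src = post.SelectSingleNode(".//video/source")?.GetAttributeValue("src", "") ?? "";

            results.Add((id, title, poster, src));
        }
        return results;
    }

    public static void Main()
    {
        foreach (var p in ReadPosts(SampleHtml))
        {
            Console.WriteLine("user_post_{0}: {1}, {2}, {3}", p.Id, p.Title, p.Poster, p.Src);
        }
    }
}
```

Using GetAttributeValue with a default instead of Attributes["src"].Value also avoids an exception when the attribute itself is absent.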
