• Status: Solved

How can I read an HTML page using HtmlAgilityPack in C#?

I am trying to read an HTML page (a .htm file) on my local drive using HtmlAgilityPack in C#.

Here is what I did.

1. Using Visual Studio 2012, I installed HtmlAgilityPack from NuGet through the Package Manager Console.
2. This added the HtmlAgilityPack DLL to my project.
3. My code is below. I ran it in debug mode, and the problem appeared when it reached this line:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
4. I got an error saying:
No Source Available
There is no source code available for the current location.

I am confused. What source code is it looking for, and why is it looking for it when the DLL is already referenced by the project?

So, here are my questions on this issue:

1. What does this error mean? If it is looking for the source code, how can I get it?
2. How can I get the source code that matches the installed version of HtmlAgilityPack?
3. How can I make it available to my application?
4. How can I read the HTML tables?

Or is there a different approach that I can use to read the tables on the HTML page?
          try
            {
                // Folder that holds the .htm file (UNC path as given in the post)
                DirectoryInfo theFolder = new DirectoryInfo("\\\\MYPC\\Users\\Desktop");
                System.IO.FileInfo[] files = theFolder.GetFiles();
                if (files.Length == 0)
                {
                    throw new FileNotFoundException("No files found in " + theFolder.FullName);
                }
                // FullName already contains the folder and the path separator,
                // so the path is not built by string concatenation
                string FileName = files[0].FullName;
                // Load the html document
                HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
                doc.Load(FileName);
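                // Get all tables in the document. HtmlAgilityPack stores element names
                // in lowercase, and SelectNodes returns null when nothing matches.
                HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
                if (tables != null)
                {
                    // C# indexers use square brackets, not parentheses
                    HtmlNodeCollection rows = tables[0].SelectNodes(".//tr");
                    for (int i = 0; rows != null && i < rows.Count; ++i)
                    {
                        HtmlNodeCollection cols = rows[i].SelectNodes(".//td");
                        for (int j = 0; cols != null && j < cols.Count; ++j)
                        {
                            // Get the value of the column and print it
                            string value = cols[j].InnerText;
                            Console.WriteLine(value);
                        }
                    }
                }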
            }
            catch (Exception)
            {
                // "throw;" rethrows without resetting the original stack trace
                throw;
            }
Asked by: pothireddysunil
1 Solution
 
käµfm³d 👽Commented:
What does this error mean? If it is looking for the source code, how can I get it?
An exception occurred within the HtmlAgilityPack (HAP) library. Because you are running your code in debug mode in Visual Studio, VS tries to show you the actual source code that raised the error, but it cannot do that without the .pdb files that were generated when HAP was built. Since VS cannot find those files, it cannot display the source code, so it shows the error you have seen. There is nothing wrong with your code, per se; rather, something has happened within HAP. I don't know whether HAP distributes its .pdb files with the binaries. (It may.)

How can I get the source code that matches the installed version of HtmlAgilityPack?
Most likely you will have to download the source code and build the library yourself. Then you will have the .pdb files that VS uses during debugging. The source code is available on CodePlex.

----------------------

You should be able to continue past the "no source code available" window by clicking "Cancel". Then you can see what the exception actually is. You may not be passing a parameter correctly or similar.
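
If the debugger window gets in the way, another way to see what the exception actually is would be to catch it at the call site and print its type and message. The following is only a sketch under assumptions: ExceptionReport.Describe is a hypothetical helper name, and the File.OpenRead call on a non-existent temp file merely stands in for the doc.Load(...) call from the question.

```csharp
using System;
using System.IO;
using System.Text;

static class ExceptionReport
{
    // Hypothetical helper: one line per exception, walking the InnerException chain.
    public static string Describe(Exception ex)
    {
        var sb = new StringBuilder();
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            sb.AppendLine(e.GetType().FullName + ": " + e.Message);
        }
        return sb.ToString();
    }
}

static class Program
{
    static void Main()
    {
        // A path that should not exist, used only to provoke an exception.
        string missing = Path.Combine(Path.GetTempPath(),
                                      Guid.NewGuid().ToString("N") + ".htm");
        try
        {
            // Stand-in for doc.Load(fileName) in the original code.
            using (File.OpenRead(missing)) { }
        }
        catch (Exception ex)
        {
            // Shows the exception type and message without needing library source.
            Console.WriteLine(ExceptionReport.Describe(ex));
        }
    }
}
```

Printing the full chain matters because the outermost exception is sometimes a wrapper, and the useful detail (for example, a path or permission problem) sits in an inner exception.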
 
pothireddysunil (Author) Commented:
Thanks, Kaufmed. I tried, but it does not give me an option to cancel and continue debugging.
 
pothireddysunil (Author) Commented:
It asks me to select the source file, and once I select it, it gives me this alert:

Source file:
 C:......\HtmlDocument.cs
 Module: c:\users........\Debug\HtmlAgilityPack.dll
 Process:,,.exe

 The source file is different from when the module was built. Would you like to use it anyway?

It gives this alert for a couple more classes and finally shows the error below.

Source file information;

 Locating source for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'. Checksum: MD5 {21 f3 9f 31 c1 6a 76 67 a7 c1 d8 6f 9b b2 66 7d}
 The file 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs' does not exist.
 Looking in script documents for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'...
 Looking in the projects for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'.
 The file was not found in a project.
 Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\crt\src\'...
 Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\crt\src\vccorlib\'...
 Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\src\mfc\'...
 Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\src\atl\'...
 Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\include'...
 Looking in directory 'C:\Users\sunil\Desktop\HtmlAgilityPack\Release\1_4_0\'...
 The debug source files settings for the active solution indicate that the debugger will not ask the user to find the file: d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs.
 The debugger could not locate the source file 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'.

 If I select No, it asks me to select the source file again.

 I installed version 1.4.6.0
 
pothireddysunil (Author) Commented:
Hi all, it's resolved. It was an access issue with the input file path. Thanks.
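
Since the cause turned out to be access to the input file path, a check along these lines before calling doc.Load may make that kind of failure easier to spot. This is a minimal sketch using only System.IO; InputFileCheck.Explain is a hypothetical helper name, the messages are illustrative, and "page.htm" is a placeholder default.

```csharp
using System;
using System.IO;

static class InputFileCheck
{
    // Hypothetical helper: returns null when the file can be opened for reading,
    // otherwise a short description of what went wrong.
    public static string Explain(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
            return "Path is empty.";
        if (!File.Exists(path))
            return "File not found (or no permission to see it): " + path;
        try
        {
            // Actually opening the file is the most direct access test.
            using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)) { }
            return null;
        }
        catch (UnauthorizedAccessException ex)
        {
            return "Access denied: " + ex.Message;
        }
        catch (IOException ex)
        {
            return "I/O error: " + ex.Message;
        }
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // Pass the .htm path as the first argument; the default is a placeholder.
        string path = args.Length > 0 ? args[0] : "page.htm";
        string problem = InputFileCheck.Explain(path);
        Console.WriteLine(problem ?? "OK to load: " + Path.GetFullPath(path));
    }
}
```

File.Exists returns false rather than throwing when the path is invalid or the caller lacks permission, which is why the sketch still opens the file afterwards to confirm it is readable.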
