pothireddysunil
asked on
How can I read an HTML page using HtmlAgilityPack in C#?
I am trying to read an HTML page (a .htm file) on my local drive using HtmlAgilityPack in C#.
Here is what I did:
1. Using Visual Studio 2012, I installed HtmlAgilityPack from NuGet through the Package Manager Console.
2. That added the HtmlAgilityPack DLL to my project.
3. I ran my code (shown below) in debug mode. When it reached this line:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
4. I got this error:
No Source Available
There is no source code available for the current location.
I am confused. What source code is it looking for, and why does it need the source when the DLL is already referenced by the project?
My questions on this issue:
1. What does this error mean? If it is looking for the source code, how can I get it?
2. How can I get the source code that matches the HtmlAgilityPack version that was installed?
3. How can I make it available to my application?
4. How can I read the HTML tables?
Or is there a different approach I can use to read the tables on the HTML page?
// Requires: using System; using System.IO; using HtmlAgilityPack;
try
{
    DirectoryInfo theFolder = new DirectoryInfo("\\\\MYPC\\Users\\Desktop");
    System.IO.FileInfo[] file = theFolder.GetFiles();
    int len = file.Length;
    string fileName = "";
    if (file.Length > 0)
    {
        int intLength;
        // Take the name of the first file in the folder
        fileName = file[0].Name;
        intLength = fileName.IndexOf("_");
    }
    // Path.Combine inserts the separator that plain string concatenation left out
    string FileName = Path.Combine("\\\\MYPC\\Users\\Desktop", fileName);
    // Load the html document
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.Load(FileName);
    // Get all tables in the document (HtmlAgilityPack stores element names in lower case)
    HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
    // C# indexes collections with [ ], not ( )
    HtmlNodeCollection rows = tables[0].SelectNodes(".//tr");
    for (int i = 0; i < rows.Count; ++i)
    {
        HtmlNodeCollection cols = rows[i].SelectNodes(".//td");
        for (int j = 0; j < cols.Count; ++j)
        {
            // Get the value of the column and print it
            string value = cols[j].InnerText;
            Console.WriteLine(value);
        }
    }
}
catch (Exception)
{
    // Rethrow without resetting the stack trace
    throw;
}
ASKER CERTIFIED SOLUTION
ASKER
It asks me to select the source code, and once I select it, it gives me this alert:
Source file:
C:......\HtmlDocument.cs
Module: c:\users........\Debug\HtmlAgilityPack.dll
Process:,,.exe
The source file is different from when the module was built. Would you like to use it anyway?
It gives this alert for a couple more classes and finally shows the error below.
Source file information:
Locating source for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'. Checksum: MD5 {21 f3 9f 31 c1 6a 76 67 a7 c1 d8 6f 9b b2 66 7d}
The file 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs' does not exist.
Looking in script documents for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'...
Looking in the projects for 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'.
The file was not found in a project.
Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\crt\src\'...
Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\crt\src\vccorlib\'...
Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\src\mfc\'...
Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\src\atl\'...
Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\atlmfc\include'...
Looking in directory 'C:\Users\sunil\Desktop\HtmlAgilityPack\Release\1_4_0\'...
The debug source files settings for the active solution indicate that the debugger will not ask the user to find the file: d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs.
The debugger could not locate the source file 'd:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.PathMethods.cs'.
If I select No, it asks me to select the source code again.
I installed version 1.4.6.0.
ASKER
Hi all, it's resolved. It was an access issue with the input file path. Thanks.
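For anyone hitting the same thing, here is a minimal sketch of one way to catch a bad input path before the library is ever called, so the failure surfaces in your own code rather than inside the DLL. It uses only System.IO. The helper name ResolveHtmlPath, the class name, and the temporary demo folder are hypothetical, not part of the original code or of HtmlAgilityPack.

```csharp
using System;
using System.IO;

public static class HtmlPathHelper
{
    // Hypothetical helper: joins folder and file name with a proper separator
    // and fails early if the file is missing or not accessible.
    public static string ResolveHtmlPath(string folder, string fileName)
    {
        string fullPath = Path.Combine(folder, fileName);

        // File.Exists also returns false when the caller lacks permission to read the file.
        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException("HTML file not found or not accessible.", fullPath);
        }

        return fullPath;
    }

    public static void Main()
    {
        // Demo only: a temporary folder stands in for the real share.
        string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(folder);
        File.WriteAllText(Path.Combine(folder, "report.htm"), "<table><tr><td>1</td></tr></table>");

        // Succeeds: the file exists, so the combined path is returned.
        Console.WriteLine(ResolveHtmlPath(folder, "report.htm"));

        // Fails early with a clear message instead of an error inside the library.
        try
        {
            ResolveHtmlPath(folder, "missing.htm");
        }
        catch (FileNotFoundException ex)
        {
            Console.WriteLine("Not found: " + ex.FileName);
        }

        Directory.Delete(folder, true);
    }
}
```

The returned path could then be passed to doc.Load(...) as in the original code. Whether this matches the exact fix applied in this thread is not known; it only illustrates checking the path up front.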