Mike Eghtebas
asked on
c# using HtmlAgilityPack or xpath ... html data
I have test1.html in a folder and want to capture its embedded data, shown in the image below. The code I have now is not producing what I need. Please see below for the code and the result it produces:
HtmlDocument hdoc = new HtmlDocument();
HtmlNode abcTable;
HtmlNodeCollection tableCells;
hdoc.Load("test1.html");
abcTable = hdoc.DocumentNode.SelectSingleNode("//table[@summary='ABC']");
tableCells = abcTable.SelectNodes("//td");
foreach (HtmlNode cell in tableCells)
{
Response.Write(cell.InnerText);
}
The result using the above code:
Field1 Value1 Field6 Value6 Field2 Value2 Field7 Value7 Field3 Value3 Field8 Value8 Field4 Value4 Field9 Value9 Field5 Value5 Field10 Value10
Field11 Field12 Field13 Field14 Field15 Field16 Field17 Field18 Field19
Valu11 Valu12 Valu13 Valu16 Valu17
Field1 Value1 Field6 Value6 Field2 Value2 Field7 Value7 Field3 Value3 Field8 Value8 Field4 Value4 Field9 Value9 Field5 Value5 Field10 Value10
Field1 Value1 Field6 Value6 Field2 Value2 Field7 Value7 Field3 Value3 Field8 Value8 Field4 Value4 Field9 Value9 Field5 Value5 Field10 Value10
Field11 Field12 Field13 Field14 Field15 Field16 Field17 Field18 Field19
Valu11 Valu12 Valu13 Valu16 Valu17
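A likely reason the output includes cells from every table: an XPath expression that starts with `//` is evaluated from the document root, even when `SelectNodes` is called on `abcTable`. Prefixing it with a dot (`.//td`) keeps the search inside the selected table. The sketch below assumes the HtmlAgilityPack package is referenced. The inline HTML, the class name `RelativeXPathSketch`, and the helper `CellsOf` are made up for illustration, and the `summary` value `TopData` comes from the HTML file posted later in this thread rather than the `ABC` used in the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class RelativeXPathSketch
{
    // Hypothetical helper: returns the trimmed text of the cells that
    // belong to the table with the given summary attribute.
    public static List<string> CellsOf(string html, string summary)
    {
        var result = new List<string>();
        var hdoc = new HtmlDocument();
        hdoc.LoadHtml(html);

        HtmlNode table = hdoc.DocumentNode.SelectSingleNode(
            "//table[@summary='" + summary + "']");
        if (table == null) return result; // no table with that summary

        // ".//td" is relative to `table`; "//td" would restart at the document root.
        HtmlNodeCollection cells = table.SelectNodes(".//td");
        if (cells == null) return result; // SelectNodes returns null when nothing matches

        foreach (HtmlNode cell in cells)
            result.Add(cell.InnerText.Trim());
        return result;
    }

    static void Main()
    {
        // Made-up two-table sample, not the real test1.html.
        string html =
            "<table summary='TopData'><tr><td>Field1</td><td>Value1</td></tr></table>" +
            "<table summary='Details'><tr><td>Field11</td><td>Field12</td></tr></table>";

        foreach (string text in CellsOf(html, "TopData"))
            Console.WriteLine(text);
    }
}
```

With the sample HTML this should list only the two cells of the first table. The null checks matter because both `SelectSingleNode` and `SelectNodes` return null when nothing matches.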
ASKER
Thanks for the post; I will try it shortly. In case it is needed, here is the HTML file:
<table summary="TopData">
<tr> <td> Field1 </td> <td>Value1</td> <td> Field6 </td> <td>Value6</td></tr>
<tr> <td > Field2 </td> <td>Value2</td> <td> Field7 </td> <td>Value7</td></tr></tr>
<tr> <td > Field3 </td> <td>Value3</td> <td> Field8 </td> <td>Value8</td></tr></tr>
<tr> <td > Field4 </td> <td>Value4</td> <td> Field9 </td> <td>Value9</td></tr></tr>
<tr> <td > Field5 </td> <td>Value5</td> <td> Field10 </td> <td>Value10</td></tr></tr>
</tr>
</table>
<table summary="Details">
<tr> <td> Field11 </td><td> Field12 </td><td> Field13 </td><td> Field14 </td><td> Field15 </td><td> Field16 </td><td> Field17 </td> <td> Field18 </td><td> Field19 </td></tr>
<tr> <td> Value11 </td><td> Value12 </td><td> Value13 </td><td> </td><td> </td><td> Field16 </td><td> Field17 </td> <td> </td><td> </td></tr>
<tr> <td>Value11 </td><td>Value12 </td><td> Value13</td><td>Value14</td><td>Value15</td><td> Field16 </td><td> Value17</td> <td> </td><td> </td></tr>
</table>
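For a layout like the TopData table above, where each row alternates field and value cells, one possible approach is to walk the rows and read the cells two at a time. This is only a sketch under assumptions, not the accepted solution: it assumes HtmlAgilityPack is referenced, it uses a shortened, well-formed copy of the table, and the names `TopDataReader` and `ReadPairs` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class TopDataReader
{
    // Hypothetical helper: collects (field, value) pairs from rows whose
    // cells alternate field, value, field, value.
    public static List<KeyValuePair<string, string>> ReadPairs(string html, string summary)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes(
            "//table[@summary='" + summary + "']//tr");
        if (rows == null) return pairs; // table or rows not found

        foreach (HtmlNode row in rows)
        {
            // "./td" selects only the cells that are direct children of this row.
            HtmlNodeCollection rowCells = row.SelectNodes("./td");
            if (rowCells == null) continue;

            // Step by two; a trailing unpaired cell is skipped.
            for (int i = 0; i + 1 < rowCells.Count; i += 2)
            {
                pairs.Add(new KeyValuePair<string, string>(
                    rowCells[i].InnerText.Trim(),
                    rowCells[i + 1].InnerText.Trim()));
            }
        }
        return pairs;
    }

    static void Main()
    {
        // Shortened sample in the shape of the TopData table.
        string html =
            "<table summary='TopData'>" +
            "<tr><td> Field1 </td><td>Value1</td><td> Field6 </td><td>Value6</td></tr>" +
            "<tr><td> Field2 </td><td>Value2</td><td> Field7 </td><td>Value7</td></tr>" +
            "</table>";

        foreach (KeyValuePair<string, string> pair in ReadPairs(html, "TopData"))
            Console.WriteLine("{0} | {1} |", pair.Key, pair.Value);
    }
}
```

The Details table is laid out differently (a header row followed by value rows), so it would need its own loop that matches each value cell to the header cell at the same index.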
ASKER
I am getting the following errors because I changed Console to Response:
// Console.Write("{0} | {1} |", rowCells[i].InnerText, rowCells[i + 1].InnerText);
Response.Write("{0} | {1} |", rowCells[i].InnerText, rowCells[i + 1].InnerText)
(1) (2) (3)
(1): cannot convert from 'string' to 'char[]'
(2): cannot convert from 'string' to 'int'
(3): cannot convert from 'string' to 'int'
Someday I will get the hang of it all.
Thank you for the help.
Mike
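The errors above are consistent with `Response.Write` having no overload that takes a format string plus arguments the way `Console.Write` does; the only three-argument overload is `Write(char[], int, int)`, which matches the reported conversions. One common workaround is to build the text with `string.Format` first and pass the single resulting string. The sketch below is a console stand-in so it can run outside a web page; the names `PairFormatter` and `FormatPair` are hypothetical, and the `Trim()` calls are an addition not present in the original line.

```csharp
using System;

static class PairFormatter
{
    // Builds the formatted text up front so it can be handed to any
    // Write method that accepts a single string.
    public static string FormatPair(string field, string value)
    {
        return string.Format("{0} | {1} |", field.Trim(), value.Trim());
    }
}

class FormatDemo
{
    static void Main()
    {
        string line = PairFormatter.FormatPair(" Field1 ", "Value1");

        // In the ASP.NET page the equivalent call would be: Response.Write(line);
        Console.WriteLine(line); // prints "Field1 | Value1 |"
    }
}
```

In the page itself, the original line would then become `Response.Write(string.Format("{0} | {1} |", rowCells[i].InnerText, rowCells[i + 1].InnerText));`.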
ASKER
Fantastic.