esak2000
asked on
C#: extract HTML from a site with a console program
I'm using a C# program to download the following page into a string:
http://signssafety.com/signsafety/ProductDescription.aspx?productID=7
and I'm using the following code:
// Requires: using System.IO; using System.Net; using System.Text;
string urlItem = "http://signssafety.com/signsafety/ProductDescription.aspx?productID=7";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlItem);
request.UserAgent = "Foo";
// ContentType describes a request body, so it has no effect on this GET request.
request.ContentType = "text/html; charset=UTF-8";
Encoding wind1252 = Encoding.GetEncoding(1252);
request.UseDefaultCredentials = true;
string responseString;
// Dispose the response and reader once the body has been read.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader myStreamReader = new StreamReader(response.GetResponseStream(), wind1252))
{
    responseString = myStreamReader.ReadToEnd();
}
// Save the downloaded markup so it can be inspected.
using (StreamWriter swwrite = new StreamWriter(@"Items.html"))
{
    swwrite.Write(responseString);
}
When I view the downloaded Items.html file, I see that the page actually downloaded was:
"http://www.signssafety.com/signsafety"
and not the page in the link above.
I want to keep using the C# console program and don't want to use the WebBrowser object. Does anyone know what I'm doing wrong, or how a C# console program can download the actual page?
ASKER
My comment was about which solution was better for what I wanted.
For your needs, you could do something like:
HAP (Html Agility Pack) provides both LINQ and XPath mechanisms for extracting data from HTML. Either would locate data within the HTML source more reliably than plain string searching.
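As a rough illustration of the two HAP approaches mentioned above, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced. The sample markup, the productName id, and the HapSketch, TextById and LinkTargets names are all hypothetical, since the real markup of the product page is not shown in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class HapSketch
{
    // Hypothetical markup standing in for the downloaded page.
    public const string SampleHtml =
        "<html><body>" +
        "<span id='productName'>Stop Sign</span>" +
        "<a href='/signsafety/ProductDescription.aspx?productID=7'>Details</a>" +
        "<a href='/signsafety/'>Home</a>" +
        "</body></html>";

    // XPath: trimmed text of the first element with the given id, or null if none matches.
    public static string TextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id='" + id + "']");
        return node == null ? null : node.InnerText.Trim();
    }

    // LINQ: href values of every <a> element that has a non-empty href.
    public static List<string> LinkTargets(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.Descendants("a")
                  .Select(a => a.GetAttributeValue("href", ""))
                  .Where(href => href.Length > 0)
                  .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(TextById(SampleHtml, "productName"));
        foreach (string href in LinkTargets(SampleHtml))
        {
            Console.WriteLine(href);
        }
    }
}
```

In the console program from the question, the downloaded responseString would be passed in place of SampleHtml, with the XPath expression adjusted to whatever elements the real page uses.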