Solved

c# extract html from site with console program

Posted on 2013-11-07
Last Modified: 2013-11-12
I'm using a c# program to extract to a string the following page:

http://signssafety.com/signsafety/ProductDescription.aspx?productID=7

and I'm using the following code:

// Required namespaces
using System.IO;
using System.Net;
using System.Text;

string urlItem = "http://signssafety.com/signsafety/ProductDescription.aspx?productID=7";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlItem);
request.UserAgent = "Foo";
request.ContentType = "text/html; charset=UTF-8";
Encoding wind1252 = Encoding.GetEncoding(1252);
request.UseDefaultCredentials = true;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader myStreamReader = new StreamReader(response.GetResponseStream(), wind1252);
string responseString = myStreamReader.ReadToEnd();
request.Abort();
StreamWriter swwrite = new StreamWriter(@"Items.html");
swwrite.Write(responseString);
swwrite.Close();



When I view the downloaded Items.html file, I see that the page that was actually downloaded was:

"http://www.signssafety.com/signsafety"

and not the page in the link above.

I want to keep using a C# console program and don't want to use the WebBrowser object. Does anyone know what I'm doing wrong, or what can be done from a C# console program to download the actual page?
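One way to narrow this down is to check whether the server answers the product URL with a redirect, which would explain ending up on the home page. The following is only a diagnostic sketch, not a confirmed fix: the class name RedirectCheck and the helper IsRedirect are made up for illustration, and the thread does not establish that a redirect is the actual cause.

```csharp
using System;
using System.Net;

public static class RedirectCheck
{
    // Hypothetical helper: true for any 3xx status code.
    public static bool IsRedirect(HttpStatusCode code)
    {
        int value = (int)code;
        return value >= 300 && value < 400;
    }

    public static void Main()
    {
        string urlItem = "http://signssafety.com/signsafety/ProductDescription.aspx?productID=7";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlItem);
        request.UserAgent = "Foo";

        // Return the 3xx response itself instead of silently following it.
        request.AllowAutoRedirect = false;

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("Status:   {0} ({1})", (int)response.StatusCode, response.StatusCode);
            Console.WriteLine("Redirect: {0}", IsRedirect(response.StatusCode));
            // The Location header, if present, names the page the server wants to send us to.
            Console.WriteLine("Location: {0}", response.Headers["Location"] ?? "(none)");
        }
    }
}
```

If a redirect does show up, assigning a CookieContainer to request.CookieContainer and repeating the request with the same container is one thing to try, on the assumption (not verified here) that the site expects a session cookie.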
Question by:esak2000

Expert Comment

by:käµfm³d 👽
I would suggest using the HTML Agility Pack (also available through NuGet) if you are going to be parsing HTML. It is tolerant of malformed markup, so it copes with HTML of widely varying quality.

For your needs, you could do something like:

HtmlAgilityPack.HtmlWeb client = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = client.Load("http://signssafety.com/signsafety/ProductDescription.aspx?productID=7");

doc.Save("Items.html");



HAP provides both LINQ and XPath mechanisms for extracting data from HTML. Both of these would be more reliable in terms of locating data within the HTML source than would straight string searching.
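As a rough illustration of the two approaches, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The sample markup, the class name HapExample, and the element id 'title' are invented for the example and are not taken from the real product page.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class HapExample
{
    // Made-up markup for illustration only.
    const string SampleHtml =
        "<html><body><h1 id='title'>Stop Sign</h1>" +
        "<a href='/a.aspx'>First</a><a href='/b.aspx'>Second</a></body></html>";

    // XPath: locate one element by its id attribute.
    public static string TitleViaXPath(HtmlDocument doc)
    {
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//h1[@id='title']");
        return node == null ? null : node.InnerText.Trim();
    }

    // LINQ: collect the href value of every anchor that has one.
    public static string[] LinksViaLinq(HtmlDocument doc)
    {
        return doc.DocumentNode.Descendants("a")
                  .Select(a => a.GetAttributeValue("href", ""))
                  .Where(href => href.Length > 0)
                  .ToArray();
    }

    public static void Main()
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        Console.WriteLine(TitleViaXPath(doc));
        Console.WriteLine(string.Join(", ", LinksViaLinq(doc)));
    }
}
```

For a page loaded with HtmlWeb.Load, the same two helpers could be applied to the returned document; the XPath expression would need to match the page's real structure.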
 

Accepted Solution

by:esak2000
Thanks for the tip. In the end I used the Internet Explorer object to download the HTML files to my local computer and used a StreamReader to read the files.
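The reading half of that approach might look roughly like the sketch below. The class name SavedPageReader and the helper ReadHtml are hypothetical, the file name Items.html is borrowed from the question, and the Internet Explorer download step is not shown because the thread gives no details about it.

```csharp
using System;
using System.IO;
using System.Text;

public static class SavedPageReader
{
    // Reads a whole saved HTML file into a string using the given encoding.
    public static string ReadHtml(string path, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumes Items.html already exists in the working directory.
        // UTF-8 is an assumption; pass the encoding the file was actually saved with.
        string html = ReadHtml("Items.html", Encoding.UTF8);
        Console.WriteLine("Read {0} characters", html.Length);
    }
}
```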
 

Author Closing Comment

by:esak2000
My comment was the better solution for what I wanted.