Parsing HTML from the Internet with ASP.NET

Posted on 2009-07-07
Last Modified: 2012-05-07
I need to load and parse an HTML file from a website.
I found a .NET project called "HTML Agility Pack" that lets me parse HTML files easily, but not while they're still online. In other words, I can't specify a URI or URL as a file location (the same goes for the FileInfo constructor in the C# System.IO namespace).

So instead, I'm guessing I have to download the file first, but I need to handle the download with server-side code.

To put things into perspective, I am building a web service that must generate an XML file from an HTML site that's full of dropdowns (the web service will be used internally by the university that has asked me to do this for them). I can't have direct access to the database from which the HTML page is getting its data, hence the cumbersome workaround.

What's the best way to do this? Thank you.
Question by:uhm179

Accepted Solution

lharrispv earned 150 total points
ID: 24794970
HTTP Web Request. It uses an HTTP request to return the page for you, and I think you might actually be able to get it to return in XML format already.
LVL 21

Expert Comment

ID: 24795037
Well, if you can't have access to the database, get in touch with the DB admin at the campus and have him create an interface/service that will allow you to get the data before you go through what they are asking. It will be tedious...
LVL 21

Expert Comment

ID: 24795057
By the way, I worked at a university where I wasn't allowed to touch the DB, so I would go the route of getting the DB admin to create APIs for me to connect to and retrieve the data, as I just suggested. That's more than reasonable...

LVL 33

Assisted Solution

by:Todd Gerbert
Todd Gerbert earned 150 total points
ID: 24795125
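// Put the URL of the page to download in the empty string below.
System.Net.WebRequest webRequest = System.Net.WebRequest.Create("");
string theHtml;
using (System.Net.WebResponse webResponse = webRequest.GetResponse())
using (System.IO.StreamReader reader = new System.IO.StreamReader(webResponse.GetResponseStream()))
{
    theHtml = reader.ReadToEnd(); // the whole response body as one string
}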
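
Once the page is in a string, it no longer has to exist as a file: HTML Agility Pack can parse markup straight from memory with HtmlDocument.LoadHtml. The following is only a rough sketch of how the dropdowns might then be turned into XML. The DropdownExtractor class, the BuildDropdownXml helper and the dropdowns/dropdown/option element names are made up for illustration and are not from the thread; the project is assumed to reference the HtmlAgilityPack assembly.

```csharp
using System;
using System.Xml.Linq;
using HtmlAgilityPack; // third-party library, assumed to be referenced by the project

public static class DropdownExtractor
{
    // Hypothetical helper: maps every <select> in the HTML to a <dropdown> element.
    public static XElement BuildDropdownXml(string html)
    {
        // Some older releases flag <option> as an empty element, which hides its text.
        // Removing the flag does nothing if it is not registered.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file path

        var root = new XElement("dropdowns");

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var selects = doc.DocumentNode.SelectNodes("//select");
        if (selects == null) return root;

        foreach (HtmlNode select in selects)
        {
            var dropdown = new XElement("dropdown",
                new XAttribute("name", select.GetAttributeValue("name", "")));

            var options = select.SelectNodes(".//option");
            if (options != null)
            {
                foreach (HtmlNode option in options)
                {
                    dropdown.Add(new XElement("option",
                        new XAttribute("value", option.GetAttributeValue("value", "")),
                        HtmlEntity.DeEntitize(option.InnerText).Trim()));
                }
            }
            root.Add(dropdown);
        }
        return root;
    }
}
```

Calling DropdownExtractor.BuildDropdownXml(theHtml).ToString() on the downloaded string would give XML text that the web service could return. Error handling for failed requests and for pages that change their layout is left out here.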

Expert Comment

ID: 24795156
Hmm looks like the same thing I said..
LVL 33

Expert Comment

by:Todd Gerbert
ID: 24795177

I just hadn't seen your post yet.

Expert Comment

ID: 24795214
Sorry.. it's just that for the last 3 or 4 posts I have made to this group, I have been the first poster and then someone came along later, said the same thing in a different way and wound up getting the points....makes it hard to get your month's quota :-\

Author Closing Comment

ID: 31600765
I'm gonna split the points between iharrispv and tgerbert.

iharrispv, I would've assigned all the points to you but the page you linked to contained a bunch of code that I'd have to pick apart first to find exactly what I was looking for. tgerbert provided the bit of code that gave me an idea of what kind of code I had to look out for. Had you combined the link with a quick code example (due to the nature of the link), then it would have been perfect.
I don't know the nature of your other posts, but in this specific example, I can easily imagine someone else giving full points to tgerbert simply because many may prefer code over links (to code).

hehe, yah I'd prefer direct access to the database, but my boss doesn't like the idea. I guess he has his reasons.

Thank you for your help guys.


Question has a verified solution.
