[2 days left] What’s wrong with your cloud strategy? Learn why multicloud solutions matter with Nimble Storage.Register Now


Parsing a HTML from the Internet with ASP.NET

Posted on 2009-07-07
Medium Priority
Last Modified: 2012-05-07
I need to load and parse an HTML file from a website.
I found a .NET project called "HTML Agility Pack" that lets me parse HTML easily, but not while they're still online. In other words, I can't specify an URI or URL as a file location (Same goes for the FileInfo constructor in C# Syste.IO namespace).

So instead, I'm guessing I have to download the file first, but I need to handle the download with server side code.

To put things into perspective, I am building a web service that must generate an XML file from a HTML site thats full of dropdowns (the webservice will be used internally by the university that have asked me to do this for them). I can't have direct access to the database from which the HTML page is getting its data, therefore the cumbersome workaround.

What the best way to do this? Thank you.
Question by:uhm179
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1

Accepted Solution

lharrispv earned 600 total points
ID: 24794970
HTTP Web Request.  Uses an HTTP Request to return the page for you and I think you might actually be able to get it to retrun in XML format already.

LVL 21

Expert Comment

ID: 24795037
well if you can't have access to the database, get in touch with the DB admin at the campus and have him create an interface/service that will allow you to get the data before you go through what they are asking...It will be tedious...
LVL 21

Expert Comment

ID: 24795057
by the way, I worked at an university where I wasn't allowed to touch the db, so I would go the route of gettin the DB admin to create API's for me to connect to retrieve the data as I just suggested.  That's more than reasonable...

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

LVL 33

Assisted Solution

by:Todd Gerbert
Todd Gerbert earned 600 total points
ID: 24795125
System.Net.WebRequest webRequest = System.Net.WebRequest.Create("http://www.server.com/page.htm");
System.Net.WebResponse webResponse = webRequest.GetResponse();
System.IO.StreamReader reader = new System.IO.StreamReader(webResponse.GetResponseStream());
string theHtml = reader.ReadToEnd();

Expert Comment

ID: 24795156
Hmm looks like the same thing I said..
LVL 33

Expert Comment

by:Todd Gerbert
ID: 24795177

I just hadn't seen your post yet.

Expert Comment

ID: 24795214
Sorry.. its just the last 3 or 4 posts I have made to this group I have been the first poster and then someone came along later said the same thing in a different way and wound up getting the points....makes it hard to get your months quota :-\

Author Closing Comment

ID: 31600765
I'm gonna split the points between iharrispv and tgerbert.

iharrispv, I would've assigned all the points to you but the page you linked to contained a bunch of code that I'd have to pick apart first to find exactly what I was looking for. tgerbert provided the bit of code that gave me an idea of what kind of code I had to look out for. Had you combined the link with a quick code example (due to the nature of the link), then it would have been perfect.
I don't know the nature of your other posts, but in this specific example, I can easily imagine someone else giving full points to tgerbert simply because many may prefer code over links (to code).

hehe, yah I'd prefer direct access to the database, but my chef doesn't like the idea. I guess he has his reasons.

Thank you for your help guys.

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Building a website can seem like a daunting task to the uninitiated but it really only requires knowledge of two basic languages: HTML and CSS.
Originally, this post was published on Monitis Blog, you can check it here . In business circles, we sometimes hear that today is the “age of the customer.” And so it is. Thanks to the enormous advances over the past few years in consumer techno…
In this tutorial viewers will learn how to embed an audio file in a webpage using HTML5. Ensure your DOCTYPE declaration is set to HTML5: : The declaration should display (CODE) HTML5 is supported by the most recent versions of all major browsers…
Learn how to create flexible layouts using relative units in CSS.  New relative units added in CSS3 include vw(viewports width), vh(viewports height), vmin(minimum of viewports height and width), and vmax (maximum of viewports height and width).

649 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question