Solved

HTML Parser in VB .NET

Posted on 2004-09-26
4
904 Views
Last Modified: 2012-05-05
I have to write a program in VB .NET that would look at a html and text files, parse out specific information and then deposit it in a database. This would not be a problem if it was something specific like an e-mail address that you could look at the @ sign for example. Not all the pages look exactly the same, not all have the same format, and the data that I am looking for is just numbers. An example would be to find a person's salary on the page.

As a human, I would look around on the page, look for references of "salary", then reference it that way.

Any ideas ? Information ?

Where to start?

Regex maybe?
0
Comment
Question by:waterzap
  • 2
4 Comments
 
LVL 70

Accepted Solution

by:
Éric Moreau earned 500 total points
ID: 12156043
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 12157036
If it is simple, you might also be able to get the HTML text, and use simple Regular Expressions to parse.  The HTML Document class is a fairly hefty chunk of real estate that is like squirrel hunting with an elephant rifle.

Bob
0
 
LVL 18

Expert Comment

by:armoghan
ID: 12160371
If you need to find MSHTML.. Its not 2005
Add Reference -> .NET -> Microsoft MSHTML -> Select
0
 
LVL 18

Expert Comment

by:armoghan
ID: 12160380
opps ... sorry wrote in the wrong window
0

Featured Post

Master Your Team's Linux and Cloud Stack!

The average business loses $13.5M per year to ineffective training (per 1,000 employees). Keep ahead of the competition and combine in-person quality with online cost and flexibility by training with Linux Academy.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Recently while returning home from work my wife (another .NET developer) was murmuring something. On further poking she said that she has been assigned a task where she has to serialize and deserialize objects and she is afraid of serialization. Wha…
In my previous article (http://www.experts-exchange.com/Programming/Languages/.NET/.NET_Framework_3.x/A_4362-Serialization-in-NET-1.html) we saw the basics of serialization and how types/objects can be serialized to Binary format. In this blog we wi…
A short tutorial showing how to set up an email signature in Outlook on the Web (previously known as OWA). For free email signatures designs, visit https://www.mail-signatures.com/articles/signature-templates/?sts=6651 If you want to manage em…
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question