Solved

HTML Parser in VB .NET

Posted on 2004-09-26
Last Modified: 2012-05-05
I have to write a program in VB .NET that looks at HTML and text files, parses out specific information, and deposits it in a database. This would not be a problem if the target were something distinctive like an e-mail address, where you could look for the @ sign. But not all the pages look the same or share a format, and the data I am looking for is just numbers. An example would be finding a person's salary on the page.

As a human, I would scan the page for references to "salary" and then read the number next to them.

Any ideas? Information?

Where to start?

Regex maybe?
Question by:waterzap
4 Comments
 
LVL 69

Accepted Solution

by:Éric Moreau (earned 500 total points)
ID: 12156043
 
LVL 96

Expert Comment

by:Bob Learned
ID: 12157036
If it is simple, you might also be able to get the HTML text and parse it with simple regular expressions. The HTML Document class is a fairly hefty chunk of real estate; using it for this is like squirrel hunting with an elephant rifle.

Bob
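
To make the regular-expression suggestion concrete, here is a minimal sketch, not a definitive implementation. It strips tags crudely, then looks for the first number within a short distance after a keyword such as "salary". The names SalaryExtractor and FindAmountNear, the 40-character window, and the sample page are all assumptions made for illustration. It also assumes a newer compiler and framework than this thread's date (nullable Decimal? and System.Net.WebUtility), and it will miss pages where the number comes before the label or sits far from it.

```vbnet
Imports System
Imports System.Globalization
Imports System.Net
Imports System.Text.RegularExpressions

Module SalaryExtractor

    ' Hypothetical helper: returns the first number found shortly after
    ' the keyword, or Nothing when no such number exists.
    Function FindAmountNear(html As String, keyword As String) As Decimal?
        ' Crude tag stripping: anything shaped like <...> becomes a space.
        Dim text As String = Regex.Replace(html, "<[^>]+>", " ")
        ' Decode entities such as &#36; so their digits are not mistaken for data.
        text = WebUtility.HtmlDecode(text)
        ' Collapse runs of whitespace so the distance window is meaningful.
        text = Regex.Replace(text, "\s+", " ")

        ' Keyword, up to 40 non-digit characters, then a number
        ' such as 52000, 52,000 or 52,000.50 (the window size is an assumption).
        Dim pattern As String = Regex.Escape(keyword) & "\D{0,40}(\d[\d,]*(?:\.\d+)?)"
        Dim m As Match = Regex.Match(text, pattern, RegexOptions.IgnoreCase)
        If Not m.Success Then Return Nothing

        ' Drop thousands separators before parsing.
        Dim digits As String = m.Groups(1).Value.Replace(",", "")
        Return Decimal.Parse(digits, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        ' Made-up sample markup, not taken from any real page.
        Dim page As String = "<tr><td><b>Salary:</b></td><td>$52,000</td></tr>"
        Dim amount As Decimal? = FindAmountNear(page, "salary")
        If amount.HasValue Then
            Console.WriteLine(amount.Value)
        Else
            Console.WriteLine("No amount found")
        End If
    End Sub

End Module
```

Because the pages do not share a format, a sketch like this is best treated as a first pass: log the pages where it returns Nothing and check those by hand, or add further keywords and patterns as new layouts turn up.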
 
LVL 18

Expert Comment

by:armoghan
ID: 12160371
If you need to find MSHTML... it's not 2005.
Add Reference -> .NET -> Microsoft MSHTML -> Select
 
LVL 18

Expert Comment

by:armoghan
ID: 12160380
Oops... sorry, wrote in the wrong window.
