• Status: Solved

Parsing HTML page information into database (copy/paste)

Hello everyone,

I need to parse the information from the web page below and insert it into a database. I think regular expressions are the way to go. All I want is to select the text I need on the page, copy and paste it into a textbox in my web application, and then click Import to load all of that information.

I might give 500 if someone can help me with this. Thanks.


2 Solutions
Well, there are several ways to do it.

Regular expressions could be one way to go, although it could prove messy to sort out so much information with regular expressions alone.  I would probably parse the information either using an XML parser or by using the DOM itself to extract the chunks of information you're interested in.  After that, regular expressions would be good to extract the rest.

What are you using to build your web application?

Also, an XML parser won't work unless the page is XHTML-compliant (which this page doesn't appear to be), so you'd have to use the DOM.
thiennhien (Author) commented:
I am using an ASP.NET web application to parse this. I just want the user to select the data and copy/paste it into my web application, which would then parse the information. Another option is to view the page source and copy/paste that into my web app. Could somebody give me some code to start with? Thanks.
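
To illustrate the DOM-style approach on markup that is not XHTML-compliant, here is a minimal sketch using Python's standard `html.parser`, which tolerates unclosed and uppercase tags. It is not ASP.NET code, and it assumes the data of interest sits in `<td>` cells, which is a guess because the actual page is not shown in this thread. The names `CellTextExtractor` and `extract_cells` are made up for the example.

```python
from html.parser import HTMLParser


class CellTextExtractor(HTMLParser):
    """Collects the text of every <td> cell, even when cells are left unclosed."""

    def __init__(self):
        super().__init__()
        self.cells = []        # finished cell texts, in document order
        self._in_cell = False  # True while we are inside a <td>
        self._buffer = []      # text fragments of the current cell

    def _flush(self):
        # Join the fragments and collapse runs of whitespace into single spaces.
        self.cells.append(" ".join("".join(self._buffer).split()))
        self._buffer = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <TD> arrives here as "td".
        if tag == "td":
            if self._in_cell:
                self._flush()  # previous <td> was never closed
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._flush()

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)


def extract_cells(html):
    """Return the text of each <td> in the given HTML string."""
    parser = CellTextExtractor()
    parser.feed(html)
    parser.close()
    if parser._in_cell:  # input ended inside an unclosed cell
        parser._flush()
    return parser.cells


sample = ("<TABLE><TR><TD>Name<TD><B>Alice</B> Smith</TR>"
          "<TR><TD> Age </TD><TD>30<BR></TD></TR></TABLE>")
print(extract_cells(sample))
```

Once the cell texts are isolated like this, regular expressions only have to deal with short, clean strings rather than the whole page.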

This would be easy if you can identify a fixed sequence of characters before and after the text you want to read.

For example: <dsfa> dafs MY TEXT askdjkjhfasd

If you want "MY TEXT", scan the file contents for "<dsfa> dafs" and "askdjkjhfasd" and take the string between them.
This works most of the time.

Of course, this is not a great approach, but it will serve your purpose if you want a quick, if dirty, solution.
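
The marker-scanning idea above can be sketched in a few lines. This is an illustrative Python version, not ASP.NET code; the same logic should carry over to .NET using `String.IndexOf` and `String.Substring`. The function names `extract_between` and `extract_all_between` are hypothetical helpers invented for this example.

```python
def extract_between(text, before, after):
    """Return the text between the first `before` marker and the next
    `after` marker, or None if either marker is missing."""
    begin = text.find(before)
    if begin == -1:
        return None
    begin += len(before)            # skip past the leading marker
    end = text.find(after, begin)   # search only after the leading marker
    if end == -1:
        return None
    return text[begin:end].strip()


def extract_all_between(text, before, after):
    """Return every value enclosed by a `before`/`after` marker pair."""
    results = []
    pos = 0
    while True:
        begin = text.find(before, pos)
        if begin == -1:
            break
        begin += len(before)
        end = text.find(after, begin)
        if end == -1:
            break
        results.append(text[begin:end].strip())
        pos = end + len(after)      # continue scanning after this match
    return results


line = "<dsfa> dafs MY TEXT askdjkjhfasd"
print(extract_between(line, "<dsfa> dafs", "askdjkjhfasd"))
```

Returning `None` when a marker is missing makes it easy to detect pasted text that does not match the expected layout, which is the main way this quick approach fails.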
