Parsing HTML page information into database (copy/paste)

Posted on 2005-04-13
Last Modified: 2008-01-09
Hello everyone,

I need to parse the information on the web page below and insert it into the database.  I think regular expressions are the way to go.  All I want is to select the text I need, copy and paste it into a textbox in my web application, and then click Import to import all of that information.

Might give 500 if someone can help me with this. Thanks.

Question by:thiennhien
    LVL 5

    Accepted Solution

    Well, there are several ways to do it.

    Regular expressions could be one way to go, although it could prove messy to sort out so much information with regular expressions alone.  I would probably parse the information either using an XML parser or by using the DOM itself to extract the chunks of information you're interested in.  After that, regular expressions would be good to extract the rest.
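    As a rough sketch of that two-step idea (parser first, regular expressions second), here is a minimal example. It is written in Python purely for illustration, since the thread has no code yet; the sample markup, the `<td>` layout, and the names `CellTextCollector` and `extract_prices` are all made up, because the actual page isn't shown here.

```python
import re
from html.parser import HTMLParser


class CellTextCollector(HTMLParser):
    """Collects the text content of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")  # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current cell.
        if self._in_cell:
            self.cells[-1] += data


def extract_prices(html):
    """Step 1: pull out the cell text. Step 2: regex the price out of each cell."""
    collector = CellTextCollector()
    collector.feed(html)
    collector.close()
    prices = []
    for text in collector.cells:
        match = re.search(r"\$(\d+(?:\.\d{2})?)", text)
        if match:
            prices.append(float(match.group(1)))
    return prices


# Hypothetical markup standing in for the real page.
SAMPLE_HTML = (
    "<table>"
    "<tr><td>Widget</td><td>Price: $12.50</td></tr>"
    "<tr><td>Gadget</td><td>Price: $3</td></tr>"
    "</table>"
)

print(extract_prices(SAMPLE_HTML))
```

    The parser narrows the input down to small chunks, so the regular expression only has to deal with one cell's text at a time instead of the whole page.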

    What are you using to build your web application?

    LVL 5

    Expert Comment

    Also, an XML parser would only work if the page were XHTML-compliant (which this page doesn't appear to be), so you'd have to use the DOM.
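    To illustrate the difference, here is a small sketch (Python again, for illustration only) that feeds the same loosely written markup to a strict XML parser and to a forgiving HTML parser. The sample strings and the helper names `parses_as_xml` and `text_chunks` are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects every non-blank run of text, ignoring the tags around it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.chunks.append(stripped)


def parses_as_xml(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def text_chunks(markup):
    """Extract text with the tolerant HTML parser, well-formed or not."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


# Hypothetical fragments: the first uses an unclosed <br>, the second is XHTML-style.
LOOSE_HTML = "<p>Name: Alice<br>Phone: 555-0100</p>"
STRICT_XHTML = "<p>Name: Alice<br/>Phone: 555-0100</p>"

print(parses_as_xml(LOOSE_HTML))    # the unclosed <br> makes this fail as XML
print(parses_as_xml(STRICT_XHTML))
print(text_chunks(LOOSE_HTML))
```

    A single unclosed tag is enough to make the XML parser reject the whole page, which is why a parser built for real-world HTML is the safer choice here.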

    Author Comment

    I am using an ASP.NET web application to parse this.  I just want the user to select the data and copy/paste it into my web application, which would then parse the information.  Another way is to view the page source and copy/paste that into my web app.  Could somebody give me some code to start out with? Thanks.

    LVL 12

    Assisted Solution

    It would be easy if you can identify a fixed sequence of characters before and after the text you want to read.

    For example: <dsfa> dafs MY TEXT askdjkjhfasd

    If you want "MY TEXT", scan the file contents for "<dsfa> dafs" and "askdjkjhfasd" and take the string in between.
    This works most of the time.

    Of course, this is not a great way to do it, but it serves your purpose if you want a quick and dirty approach.
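    A minimal sketch of that marker-based scan, in Python for illustration (the helper name `text_between` is made up; in ASP.NET the same steps could be done with the string search methods .NET provides):

```python
def text_between(source, before, after):
    """Return the text found between the first `before` marker and the
    next `after` marker, or None if either marker is missing."""
    start = source.find(before)
    if start == -1:
        return None
    start += len(before)               # skip past the leading marker
    end = source.find(after, start)    # search only after the leading marker
    if end == -1:
        return None
    return source[start:end].strip()   # drop the surrounding whitespace


# The example string from the post above.
PAGE = "<dsfa> dafs MY TEXT askdjkjhfasd"

print(text_between(PAGE, "<dsfa> dafs", "askdjkjhfasd"))  # prints "MY TEXT"
```

    It breaks as soon as the page layout changes or a marker appears more than once, so it is worth checking for a None result before inserting anything into the database.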
