MSHTML parsing WITHOUT WebBrowser or InternetExplorer

I have a VB.NET service that downloads a bit of HTML from a site using HttpWebRequest and HttpWebResponse. Through a series of steps I end up with a string variable that holds the HTML for my target page, which contains, among other things, a table of data I want to programmatically add to my database.

I have done this before in VB6. I don't want to learn XML parsing techniques for this; I want to reuse the design from VB6.

I don't see a way to add a WebBrowser or InternetExplorer control to a Windows service.

What I want is a way to pass the HTML I have in the string to the mshtml.HTMLDocument interface, so I can then use that interface to extract the table data programmatically.

Any pointers?

tmesias Asked:

Bob Learned Commented:
Have you started down the mshtml.HtmlDocument road yet?  Do you have VB6 code that works?

Bob

Erick37 Commented:
You can feed the string directly to an IHTMLDocument2 object like this:

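        Dim oDoc As New mshtml.HTMLDocument
        'explicit cast so this also compiles with Option Strict On
        Dim iDoc As mshtml.IHTMLDocument2 = DirectCast(oDoc, mshtml.IHTMLDocument2)

        'write to the IHTMLDocument2
        iDoc.write("<html><body>Hello</body></html>")
        iDoc.close()

        'get it back to an HTMLDocument (same underlying object)
        oDoc = DirectCast(iDoc, mshtml.HTMLDocument)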

        'now you can use the HTMLDocument to parse and extract
        Debug.WriteLine(oDoc.body.innerHTML)
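
From there, pulling out table data is mostly a matter of walking the element collections. Here is a minimal sketch, assuming a project reference to the Microsoft.mshtml interop assembly. The sample HTML, the module name, and the variable names are made up for illustration, and your real page will likely need some filtering to pick the right table:

        Imports System.Diagnostics
        Imports mshtml

        Module TableExtractSketch
            Sub Main()
                'hypothetical sample markup; substitute the downloaded HTML string
                Dim html As String = "<html><body><table>" &
                    "<tr><td>a</td><td>b</td></tr>" &
                    "<tr><td>c</td><td>d</td></tr>" &
                    "</table></body></html>"

                Dim oDoc As New HTMLDocument()
                Dim iDoc As IHTMLDocument2 = DirectCast(oDoc, IHTMLDocument2)
                iDoc.write(html)
                iDoc.close()

                'IHTMLDocument3 exposes getElementsByTagName
                Dim doc3 As IHTMLDocument3 = DirectCast(oDoc, IHTMLDocument3)

                'walk every table, then each row, then each cell
                For Each tbl As IHTMLTable In doc3.getElementsByTagName("table")
                    For Each row As IHTMLTableRow In tbl.rows
                        For Each cell As IHTMLElement In row.cells
                            Debug.WriteLine(cell.innerText)
                        Next
                    Next
                Next
            End Sub
        End Module

In a real service you would replace the Debug.WriteLine call with whatever builds your database insert.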

tmesias (Author) Commented:
Erick, thanks for the response. This approach works. My final code looks a little different but does the same thing:

<bunch of code that uses HTTPwebrequest / httpwebresponse to get HTML in a string called strHTML>

        Dim docObject = New mshtml.HTMLDocumentClass

        Dim doc2 As mshtml.IHTMLDocument2
        doc2 = docObject

        doc2.write(strHTML)
        doc2.close()

<bunch of code that extracts the table from the docObject so I can read it into DB>


And Bob, yes, I can now reuse my VB6 parsing logic in my new .NET service. The VB6 code already worked as far as the MSHTML parsing goes; I just wanted to eliminate the WebBrowser control, since it wouldn't scale the way I wanted it to.