How to copy all the text displayed in AxWebBrowser?

Posted on 2006-03-24
Last Modified: 2008-04-05

  What I want to know is, after I have called web.Navigate2(txtURL.Text):

1. How do I know when the page has finished loading?
2. How do I copy all the text displayed in the AxWebBrowser control and save it to a database? (All done automatically by the program.) It would be better if the AxWebBrowser were declared as a variable for this task instead of being drawn on the VB form.

  I tried to use the code below, but I can't find those classes and functions.
(For question 2.)
Dim objRange As IHTMLTxtRange
Set objRange = ActiveDocument.body.createTextRange

  Thanks in advance for your help.
Question by: yongyih

    Expert Comment

    Hello there,

    'First, make sure you declare these; brWeb is the name of my AxWebBrowser component
    Private WithEvents doc As SHDocVw.DWebBrowserEvents_Event
    Dim b As Object = brWeb.Application
    doc = DirectCast(b, SHDocVw.WebBrowser_V1)

    'This will navigate to the page
    brWeb.Navigate2(txtURL.Text)

    'This will pause your code right here in the form until the page that you passed to Navigate2 is done loading
    Do While brWeb.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
        Application.DoEvents() 'let the browser keep processing messages while we wait
    Loop

    'My suggestion here is to save the HTML to a local text file; you could also just keep it in a string instead

    Dim oDoc As mshtml.HTMLDocument = brWeb.Document
    Dim MyHtml As String = oDoc.body.innerHTML

    'Now from here you have to remove tags and whatnot to clean up the HTML for reading. This can get pretty complex if
    'you have an extremely dynamic page.

    'Search for the opening and closing tags you want

    Dim startIdx As Integer = MyHtml.IndexOf("<H4>")
    Dim endIdx As Integer = MyHtml.IndexOf("</H4>")

    'Then just put them together. Substring takes a start index and a length, not an end index

    Dim MyValue As String = MyHtml.Substring(startIdx, endIdx - startIdx)

    So this is the basic idea of how to do all the stuff that you need. Searching for tags gets kind of crazy. If you are having a really tough time, find some code that converts HTML to XML, since it's a lot easier to deal with. Let me know if you need any help.
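
A minimal sketch of the tag search described above, written as a standalone helper so the index arithmetic is easy to check. `TextBetween` is a hypothetical name, not something from the post. It assumes you only want the first match, and it skips past the opening tag so the result contains just the text between the two tags.

```vbnet
Imports System

Module TagSearchSketch
    ' Hypothetical helper: returns the text between the first openTag and the
    ' next closeTag, or Nothing when either tag is missing.
    Function TextBetween(ByVal html As String, ByVal openTag As String, ByVal closeTag As String) As String
        Dim startIdx As Integer = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase)
        If startIdx < 0 Then Return Nothing
        startIdx += openTag.Length ' skip past the opening tag itself

        Dim endIdx As Integer = html.IndexOf(closeTag, startIdx, StringComparison.OrdinalIgnoreCase)
        If endIdx < 0 Then Return Nothing

        ' Substring(start, length): the second argument is a length, not an end index.
        Return html.Substring(startIdx, endIdx - startIdx)
    End Function

    Sub Main()
        Console.WriteLine(TextBetween("<P>intro</P><H4>Title</H4>", "<H4>", "</H4>"))
    End Sub
End Module
```

Returning Nothing for a missing tag avoids the exception that `Substring` would throw if `IndexOf` came back as -1.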

    Author Comment

    Hi juravich

      Thanks for your help.  I think you have answered my first question.

      As for my second question, it is very difficult to clean up all the tags in the HTML for certain web sites.  I think the safest and easiest way is to copy all the text displayed in the browser and save it to a file.  You can try it on any web site: press Ctrl+A, Ctrl+C and then paste into a text file.
    That will be what I want from the HTML file.  I want this to be done automatically by the program.

      I hope it is possible to do that in ^_^  Any idea about this code, which I got from a web site?

    Dim objRange As IHTMLTxtRange
    Set objRange = ActiveDocument.body.createTextRange



    Accepted Solution

    Hey there,

    Sorry about that, I misread your question.

    The solution to your problem is actually rather simple.

    Simply change this

    Dim oDoc As mshtml.HTMLDocument = brWeb.Document
    Dim MyHtml = oDoc.body.innerHTML

    to this

    Dim oDoc As mshtml.HTMLDocument = brWeb.Document
    Dim objStreamWriter As StreamWriter
    objStreamWriter = New StreamWriter("C:\Test.txt")


    This code will take the text of the web page and save it to a text file located at "C:\Test.txt". Let me know if you need any more help.
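
For reference, a sketch of the full sequence this answer describes: wait for the page, read its visible text, and write it to a file. It assumes the text is read through `body.innerText`, the mshtml property that returns an element's text without markup; that call is not visible in the post as it appears here. `SavePageText` and its parameters are hypothetical names, and the project is assumed to reference the AxSHDocVw/SHDocVw interop assemblies and Microsoft.mshtml.

```vbnet
Imports System.IO
Imports System.Windows.Forms

Module SaveTextSketch
    ' Hypothetical helper. browser is the AxWebBrowser on the form
    ' (brWeb in the answer above); Navigate2 must already have been called.
    Sub SavePageText(ByVal browser As AxSHDocVw.AxWebBrowser, ByVal filePath As String)
        ' Keep the UI responsive while the page finishes loading.
        Do While browser.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
            Application.DoEvents()
        Loop

        Dim oDoc As mshtml.HTMLDocument = DirectCast(browser.Document, mshtml.HTMLDocument)

        ' innerText is the text of <body> with the tags stripped,
        ' roughly what Ctrl+A / Ctrl+C would copy.
        Dim pageText As String = oDoc.body.innerText

        ' Using closes the file even if the write throws.
        Using writer As New StreamWriter(filePath)
            writer.Write(pageText)
        End Using
    End Sub
End Module
```

It could be called as `SavePageText(brWeb, "C:\Test.txt")` right after the `Navigate2` call. To store the text in a database instead, the `pageText` string can be handed to whatever data-access code the application already uses.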

    Expert Comment


    Make sure you have this

    Imports System.IO

    at the top of your code.

    Author Comment

    Hi juravich,

      Thank You.  That is what I want.  One more thing I need to do is add the reference.  "Add reference->.Net

      Currently the AxWebBrowser is displayed on my form.  Is it possible to declare it as a variable in a function, then call Navigate2 and get the body text?

      Thanks again for your help.  
