yongyih

How to copy all the text displayed in AxWebBrowser?

Hi,

After I call web.Navigate2(txtURL.Text), I would like to know:

1. How can I tell when the page has finished loading?
2. How can I copy all the text displayed in the AxWebBrowser control and save it to a database? (All done automatically by the program.) Ideally the AxWebBrowser would be declared as a variable for this task instead of being drawn on the VB form.

For question 2, I tried the code below, but VB.NET can't find those classes and functions.
Dim objRange as IHTMLTxtRange
set objRange = ActveDocument.body.createTextRange
objRange.select

Thanks in advance for your help.
juravich

Hello there,

'First, make sure you declare these. brWeb is the name of my AxWebBrowser component.
Private WithEvents doc As SHDocVw.DWebBrowserEvents_Event
Dim b As Object = brWeb.Application
doc = DirectCast(b, SHDocVw.WebBrowser_V1)

'This will navigate to the page
brWeb.Navigate2("http://www.google.com")

'This will pause your code right here in the form until the page passed to Navigate2 is done loading
Do While brWeb.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
    Application.DoEvents()
Loop
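
As an alternative to polling in a loop, the control also raises a DocumentComplete event. The following is only a minimal sketch, assuming brWeb is an AxSHDocVw.AxWebBrowser placed on the form by the designer (so it is declared WithEvents) and that Microsoft.mshtml is referenced; the handler name is arbitrary.

```vbnet
' Goes inside the form class that hosts brWeb.
Private Sub brWeb_DocumentComplete(ByVal sender As Object, _
        ByVal e As AxSHDocVw.DWebBrowserEvents2_DocumentCompleteEvent) _
        Handles brWeb.DocumentComplete

    ' The event can fire more than once for pages with frames,
    ' so only continue once the whole control reports "complete".
    If brWeb.ReadyState = SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE Then
        Dim oDoc As mshtml.IHTMLDocument2 = _
            DirectCast(brWeb.Document, mshtml.IHTMLDocument2)
        Dim html As String = oDoc.body.innerHTML
        ' ... process html here ...
    End If
End Sub
```

With this approach the code that follows Navigate2 does not block; the processing moves into the handler instead.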


'My suggestion here is to save the HTML to a local text file... you could also just keep it in a string instead

Dim oDoc As mshtml.HTMLDocument = DirectCast(brWeb.Document, mshtml.HTMLDocument)
Dim MyHtml As String = oDoc.body.innerHTML


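'Now from here you have to remove tags and whatnot to clean up the HTML for reading. This can get pretty complex if
'you have an extremely dynamic page.

'Search for the opening and closing tags you want
'(Stop is a reserved word in VB, so the variables are named startPos and endPos)

Dim startPos As Integer = MyHtml.IndexOf("<H4>")
Dim endPos As Integer = MyHtml.IndexOf("</H4>")

'Then just pull out what lies between them
'(the second argument of Substring is a length, not an end position)

Dim MyValue As String = MyHtml.Substring(startPos, endPos - startPos)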


So this is the basic idea of how to do everything you need. Searching for tags gets kind of crazy; if you are having a really tough time, find some code that converts HTML to XML, since that is a lot easier to deal with. Let me know if you need any help.
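
The tag search described above can be wrapped in a small helper. This is a sketch only: TextBetween and the sample string are made up for illustration, and the match is case-sensitive, so it assumes the tags appear exactly as written in the HTML. Note that String.Substring takes a start index and a length, so the length has to be computed from the two positions.

```vbnet
Imports System

Module TagExtractDemo
    ' Returns the text between the first openTag and the next closeTag,
    ' or Nothing if either tag is not found.
    Function TextBetween(ByVal html As String, ByVal openTag As String, _
                         ByVal closeTag As String) As String
        Dim startPos As Integer = html.IndexOf(openTag)
        If startPos < 0 Then Return Nothing

        ' Skip past the opening tag itself.
        startPos += openTag.Length

        Dim endPos As Integer = html.IndexOf(closeTag, startPos)
        If endPos < 0 Then Return Nothing

        ' Second argument is the number of characters to take.
        Return html.Substring(startPos, endPos - startPos)
    End Function

    Sub Main()
        Dim sample As String = "<BODY><H4>Hello world</H4><P>More</P></BODY>"
        Console.WriteLine(TextBetween(sample, "<H4>", "</H4>"))
    End Sub
End Module
```

Checking for a negative index first avoids the exception Substring would throw when a tag is missing from the page.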
yongyih

ASKER

Hi juravich,

Thanks for your help. I think you have answered my first question.

For my second question: it is very difficult to clean up all the tags in the HTML for certain web sites. I think the safest and easiest way is to copy all the text displayed in the browser and save it to a file. You can try this on any web site: press Ctrl+A, Ctrl+C, and then paste into a text file. That is what I want from the HTML file, and I want the program to do it automatically.

I hope it is possible to do that in VB.NET. ^_^ Any ideas about this code, which I got from a web site?

Dim objRange as IHTMLTxtRange
set objRange = ActveDocument.body.createTextRange
objRange.select

Thanks.



ASKER CERTIFIED SOLUTION
juravich

P.S.

Make sure you have this

Imports System.IO

at the top of your code.
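
A minimal sketch of one way to grab the rendered text (roughly what Ctrl+A, Ctrl+C would copy) and write it to a file. It is not necessarily the code from the accepted solution. It assumes the page has finished loading and that the project references Microsoft.mshtml; the module name, SavePageText, and the path parameter are hypothetical.

```vbnet
Imports System.IO

Module PageTextSaver
    ' web:  an AxWebBrowser whose page has finished loading
    ' path: file to write the visible text to (caller's choice)
    Public Sub SavePageText(ByVal web As AxSHDocVw.AxWebBrowser, ByVal path As String)
        Dim doc As mshtml.IHTMLDocument2 = _
            DirectCast(web.Document, mshtml.IHTMLDocument2)

        ' innerText returns the body's text with the tags stripped.
        Dim pageText As String = doc.body.innerText

        ' A text range over the body exposes similar text via its .text property:
        ' Dim range As mshtml.IHTMLTxtRange = _
        '     DirectCast(doc.body, mshtml.IHTMLBodyElement).createTextRange()
        ' pageText = range.text

        Dim writer As New StreamWriter(path)
        Try
            writer.Write(pageText)
        Finally
            writer.Close()
        End Try
    End Sub
End Module
```

The same string could be passed to a database insert instead of a StreamWriter.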
yongyih

ASKER

Hi juravich,

Thank you. That is what I wanted. One more thing I needed to do was add the reference: "Add Reference -> .NET -> Microsoft.mshtml".

Currently the AxWebBrowser is displayed on my form. Is it possible to declare it as a variable in a function, then call Navigate2 and get the body text?

  Thanks again for your help.