yongyih
asked on
How to copy all the text displayed in AxWebBrowser?
Hi
After I call web.Navigate2(txtURL.Text), I want to know:
1. How can I tell when the page has finished loading?
2. How can I copy all the text displayed in the AxWebBrowser control and save it to a database? (All done automatically by the program.) Ideally the AxWebBrowser would be declared as a variable for this task instead of being drawn on the VB form.
For question 2, I tried the code below, but my VB.NET can't find these classes and functions.
Dim objRange as IHTMLTxtRange
Set objRange = ActiveDocument.body.createTextRange
objRange.select
Thanks in advance for your help.
ASKER
Hi juravich
Thanks for your help. I think you have answered my first question.
For my second question: it is very difficult to clean up all the tags in the HTML for certain web sites. I think the safest and easiest way is to copy all the text displayed in the browser and save it to a file. You can try it on any web site: press Ctrl+A, Ctrl+C and then paste into a text file.
That will be what I want from the HTML file. I want this to be done automatically by the program.
I hope it is possible to do that in VB.NET. ^_^ Any ideas about this code that I got from a web site?
Dim objRange as IHTMLTxtRange
Set objRange = ActiveDocument.body.createTextRange
objRange.select
Thanks.
ASKER CERTIFIED SOLUTION
P.S. Make sure you have this at the top of your code:
Imports System.IO
ASKER
Hi juravich,
Thank you. That is what I wanted. One more thing I needed to do was add the reference: "Add Reference -> .NET -> Microsoft.mshtml".
Currently the AxWebBrowser is displayed on my form. Is it possible to declare it as a variable in a function, then call Navigate2 and get the body text?
Thanks again for your help.
'First, Make sure you declare these, brWeb is the name of my AxWebBrowser component
Private WithEvents doc As SHDocVw.DWebBrowserEvents_Event
Dim b As Object = brWeb.Application
doc = DirectCast(b, SHDocVw.WebBrowser_V1)
'This will navigate to the page
brWeb.Navigate2("http://www.google.com")
'This pauses your code right here in the form until the page passed to Navigate2 is done loading
Do While brWeb.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
    Application.DoEvents()
Loop
'Here the html is kept in a string...my suggestion is to also save it to a local text file
Dim oDoc As mshtml.HTMLDocument = DirectCast(brWeb.Document, mshtml.HTMLDocument)
Dim MyHtml As String = oDoc.body.innerHTML
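'From here you have to remove tags and whatnot to clean up the html for reading. This can get pretty complex if
'you have an extremely dynamic page.
'Search for the opening and closing tags you want (IndexOf returns -1 when a tag is not found)
Dim startPos As Integer = MyHtml.IndexOf("<H4>")
Dim endPos As Integer = MyHtml.IndexOf("</H4>", startPos)
'Then just put them together. Substring takes a start index and a length, not an end index
Dim MyValue As String = MyHtml.Substring(startPos, endPos - startPos + "</H4>".Length)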
So this is the basic idea of how to do everything you need. Searching for tags gets kind of crazy; if you are having a really tough time, find some code that converts HTML to XML, since that is a lot easier to deal with. Let me know if you need any help.
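Since the goal stated in the thread is the text as displayed rather than the markup, one option that avoids stripping tags by hand is to read the body's innerText property instead of innerHTML. What follows is only a minimal sketch under some assumptions: the names GetBodyText and SaveBodyText are hypothetical helpers made up for illustration, the project references Microsoft.mshtml and the AxWebBrowser interop assemblies, and the control is already hosted on a form as in the code above. It is not necessarily what the accepted solution did.

```vb
Imports System.IO
Imports System.Windows.Forms

Module PageTextSketch

    ' Hypothetical helper: returns the rendered text of the loaded page,
    ' roughly what Ctrl+A, Ctrl+C would copy, with no HTML tags.
    Public Function GetBodyText(ByVal browser As AxSHDocVw.AxWebBrowser) As String
        ' Pump messages until the control reports that the page has loaded.
        Do While browser.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
            Application.DoEvents()
        Loop

        Dim doc As mshtml.IHTMLDocument2 = DirectCast(browser.Document, mshtml.IHTMLDocument2)
        Return doc.body.innerText
    End Function

    ' Hypothetical helper: writes the page text to a file chosen by the caller.
    Public Sub SaveBodyText(ByVal browser As AxSHDocVw.AxWebBrowser, ByVal filePath As String)
        Dim writer As New StreamWriter(filePath)
        Try
            writer.Write(GetBodyText(browser))
        Finally
            writer.Close()
        End Try
    End Sub

End Module
```

Assuming brWeb is the control on the form, a call would look like brWeb.Navigate2("http://www.google.com") followed by SaveBodyText(brWeb, "C:\temp\page.txt"), where the file path is just a placeholder. The same string could be passed to a database insert instead of a file.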