thenone
asked on
IE Object
How would you use the Internet Explorer object to take an HTML page that is held in a string, save the output as plain text in another string, and wrap this in a function?
ASKER
The thing is, I don't need to fetch the page; I already have the page in a string.
Then I miss the point: you already have the page in a string and now need to copy it to another string. Why do you need IE for that?
ASKER
I need IE to strip the HTML and put the result in another string as output. Or any other good way of doing this.
Not sure if you can load a string into the IE object directly. I saw Azrasound do that once in a VB thread, but I'm in a bit of a hurry now, so I'll check back later if no one else has commented.
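One possible way to load a string into an HTML document without starting a full browser is the "htmlfile" COM object. The sketch below is an assumption about how that could look, not code from this thread: strStrip_HTML_Doc is a hypothetical name, and it has not been checked against every kind of page (for example, whether embedded scripts run while the document is being written).

```vb
' Sketch only: strStrip_HTML_Doc is a hypothetical helper, not part of the thread.
' It parses the string with a late-bound "htmlfile" document instead of a
' full InternetExplorer.Application instance.
Public Function strStrip_HTML_Doc(ByVal strText As String) As String
    Dim objDoc As Object

    On Error GoTo Err_strStrip_HTML_Doc

    Set objDoc = CreateObject("htmlfile")
    objDoc.Open
    objDoc.Write strText          ' feed the HTML held in the string
    objDoc.Close
    strStrip_HTML_Doc = objDoc.body.innerText

Exit_strStrip_HTML_Doc:
    Set objDoc = Nothing
    Exit Function

Err_strStrip_HTML_Doc:
    ' On any failure (for example, no body was created) return an empty string.
    strStrip_HTML_Doc = ""
    Resume Exit_strStrip_HTML_Doc
End Function
```

Because no browser window or separate process is created, this may avoid some of the start-up cost of CreateObject("InternetExplorer.Application"), but that is worth measuring rather than assuming.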
ASKER
ok thanks
ASKER
I do have this function. I tried modifying it because it causes my program not to run smoothly: the hourglass on my cursor keeps coming on.
Public Function strStrip_HTML_Tags(ByVal strText As String) As String
Dim objInternetExplorer_Application As Object
Dim strReturn As String
On Error GoTo Err_strStrip_HTML_Tags
Set objInternetExplorer_Application = CreateObject("InternetExplorer.Application")
If Not (objInternetExplorer_Application Is Nothing) Then
objInternetExplorer_Application.Navigate "about:blank"
objInternetExplorer_Application.Application.Document.Open
objInternetExplorer_Application.Application.Document.Write (strText)
objInternetExplorer_Application.Application.Document.Close
DoEvents
strReturn = objInternetExplorer_Application.Document.body.InnerText
objInternetExplorer_Application.Quit
End If
Exit_strStrip_HTML_Tags:
On Error Resume Next
Set objInternetExplorer_Application = Nothing
strStrip_HTML_Tags = strReturn
Exit Function
Err_strStrip_HTML_Tags:
On Error Resume Next
strReturn = ""
Resume Exit_strStrip_HTML_Tags
End Function
ASKER
Maybe a modification of the above would work.
I think I was the original author of "strStrip_HTML_Tags()" [it certainly looks like something I would have written in a PAQ]. If I recall correctly, that question asked how to retrieve the contents of a referenced URL, unlike your case, where the page contents already exist in a string variable.
I would suggest looking at Regular Expressions, as the use of the "InternetExplorer.Application" object is overkill here.
BFN,
fp.
For example:
"how to remove html from excel spreadsheet"
[ https://www.experts-exchange.com/questions/21359240/how-to-remove-html-from-excel-spreadsheet.html ]
Using this function:
Function regExpReplace(strSource, strSearchPattern As String, strReplacePattern As String, Optional IgnoreCase As Boolean = True)
Dim regEx As Object
Set regEx = CreateObject("VBScript.RegExp")
regEx.Pattern = strSearchPattern
regEx.IgnoreCase = IgnoreCase
regEx.Global = True
regExpReplace = regEx.Replace(strSource, strReplacePattern)
End Function
Usage would be:
Dim strHTML As String
strHTML = "<a href=http://NigelLee.info>Click to view my page</a>"
MsgBox regExpReplace(strHTML, "<[^>]*>", "")
BFN,
fp.
ASKER
Fanpages, great that you are here. I believe you were the one who helped me with this in the past. I want to use IE because it removes every kind of script from the HTML. I've experimented with a lot of approaches that I thought would work, but they don't. I just want to make really sure that all tags are gone! Otherwise it will really mess up my program. I've started over again and again, and I'm really frustrated.
ASKER
For example, if I have JavaScript, Perl, etc. inside the page, it looks like Internet Explorer still returns everything as regular text.
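The concern about scripts can be partly addressed within the regular-expression approach by removing whole script and style elements (including their contents) before removing the remaining tags. The following is a minimal sketch under that assumption; strStrip_HTML_RegExp is a hypothetical name, the patterns are illustrative rather than exhaustive, and HTML entities such as &amp;amp; are not decoded.

```vb
' Sketch only: strStrip_HTML_RegExp is a hypothetical helper, not from the thread.
' It removes <script>/<style> elements with their contents, then comments,
' then any remaining tags.
Public Function strStrip_HTML_RegExp(ByVal strText As String) As String
    Dim regEx As Object

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    regEx.IgnoreCase = True

    ' 1. Script and style blocks; [\s\S] also matches line breaks.
    regEx.Pattern = "<(script|style)\b[^>]*>[\s\S]*?</\1\s*>"
    strText = regEx.Replace(strText, "")

    ' 2. HTML comments.
    regEx.Pattern = "<!--[\s\S]*?-->"
    strText = regEx.Replace(strText, "")

    ' 3. Any remaining tags.
    regEx.Pattern = "<[^>]*>"
    strText = regEx.Replace(strText, "")

    strStrip_HTML_RegExp = strText
    Set regEx = Nothing
End Function

' Usage:
'   Debug.Print strStrip_HTML_RegExp("<p>Hi<script>alert(1)</script> there</p>")
'   prints: Hi there
```

Malformed markup (for example, an unclosed script tag) can still slip through a pattern-based approach, which is one reason a real HTML parser such as IE may remain attractive here.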
Hi,
If you make your application available for download (or e-mail to me & I'll post on my web site for everyone else to review), then we can see if the hourglass issue is only evident in your environment or not.
BFN,
fp.
ASKER
No offense, but I would rather not do that. I think my problem is the looping in what you wrote. I believe I remember that what you wrote earlier used a global variable outside the function, which made it work faster. Your comment, I believe, was "I'm no expert but I'm getting there". So maybe we could modify it so there is no looping at all: just put the string straight into IE and return the output text as a string, with the global creation of IE outside the function so it isn't loaded over and over again. I looked at this function for about a month, but I'm really stubborn, and I finally came to the realization that this is what will best suit my needs. I've tried stripping with regular functions and it didn't strip everything, because there are zillions of different ways of writing code. Any help with this suggestion would be greatly appreciated. By the way, fanpages, it seems like the more questions I ask, the more you will be better than an expert in here.
ASKER
bruintje, I hope you didn't take offense. I did look at your suggestion and tried it. When I tried calling it, the compiler returned an error saying the reference to IE was invalid for this call: strStrip_HTML_Tags(strHTML, myIE)
ASKER
Fanpages or bruintje, how would I write it without the loop and set up the global properly?
OK, I'm not sure what your setup is, but you call a function to strip the HTML. I'm not sure where the hourglass comes from, except for the creation and deletion of the IE object, which can instead be set up once in a global variable.
At the top of your module, put something like:
Public m_myIE As Object
Then, in your initializing code, create the object:
Set m_myIE = CreateObject("InternetExplorer.Application")
Put this line in the shutdown code of your application:
Set m_myIE = Nothing
Now, wherever you are in your app, you can call the m_myIE object to perform its tricks, as long as you do not set it to Nothing while you still need it.
Call it with the lines:
DoEvents
strStrip_HTML_Tags (strHTML)
Public Function strStrip_HTML_Tags(ByVal strText As String) As String
Dim strReturn As String
On Error GoTo Err_strStrip_HTML_Tags
If Not (m_myIE Is Nothing) Then
m_myIE.Navigate "about:blank"
m_myIE.Application.Document.Open
m_myIE.Application.Document.write (strText)
m_myIE.Application.Document.Close
strReturn = m_myIE.Document.body.innerText
End If
Exit_strStrip_HTML_Tags:
On Error Resume Next
strStrip_HTML_Tags = strReturn
Exit Function
Err_strStrip_HTML_Tags:
On Error Resume Next
strReturn = ""
Resume Exit_strStrip_HTML_Tags
End Function
ASKER
I call strip_Html from another routine, for example:
Private sub_Onclick()
text3 = strip_Html(strhtml)
bla bla bla
Then, besides including the declaration above in the module,
it would become something like:
Private sub_Onclick()
DoEvents
text3 = strStrip_HTML_Tags(strHTML)
ASKER
So:
Private sub_Onclick()
Set m_myIE = CreateObject("InternetExplorer.Application")
Set m_myIE = Nothing
DoEvents
text3 = strStrip_HTML_Tags(strHTML)
I did the following:
I had a form and a button in VB
and pasted this code into the form module:
---------------------
Option Explicit
Private Sub Form_Load()
Set m_myIE = CreateObject("InternetExplorer.Application")
End Sub
Private Sub Command1_Click()
Dim strHTML As String
DoEvents
strStrip_HTML_Tags (strHTML)
End Sub
Private Sub Form_Unload(Cancel As Integer)
Set m_myIE = Nothing
End Sub
---------------------
Then I inserted a module and pasted this code:
---------------------
Option Explicit
Public m_myIE As Object
Public Function strStrip_HTML_Tags(ByVal strText As String) As String
Dim strReturn As String
On Error GoTo Err_strStrip_HTML_Tags
If Not (m_myIE Is Nothing) Then
m_myIE.Navigate "about:blank"
m_myIE.Application.Document.Open
m_myIE.Application.Document.write (strText)
m_myIE.Application.Document.Close
strReturn = m_myIE.Document.body.innerText
End If
Exit_strStrip_HTML_Tags:
On Error Resume Next
strStrip_HTML_Tags = strReturn
Exit Function
Err_strStrip_HTML_Tags:
On Error Resume Next
strReturn = ""
Resume Exit_strStrip_HTML_Tags
End Function
---------------------
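One thing the function above does not do is wait for the "about:blank" navigation to finish before writing to the document. Navigate returns before the page is loaded, so on some machines the write could happen too early and the function could return nothing. A small wait helper is one hedged way to guard against that; WaitForIE is a hypothetical name introduced here, and it relies on the Busy and ReadyState properties of the InternetExplorer object.

```vb
' Sketch only: WaitForIE is a hypothetical helper, not part of the thread's code.
' It keeps the application responsive (DoEvents) while the IE object is still
' busy or has not yet reached the "complete" ready state.
Public Sub WaitForIE(ByVal objIE As Object)
    Const READYSTATE_COMPLETE As Long = 4

    Do While objIE.Busy Or objIE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
End Sub
```

It would be called right after the Navigate line, for example "WaitForIE m_myIE" after m_myIE.Navigate "about:blank". This does add a loop, but for about:blank it should normally finish after very few iterations.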
ASKER
Oh OK, so Form_Load is where you set the object!
Yes, because that way it is created only once, at program start, and released only once, at program termination.
This way you do not have to create and clean up the object each time you strip the HTML, but whether that takes away the hourglass I do not know.
ASKER
I will test it out and let you know. It probably should, since it only sets the object once and not over and over again.
ASKER
OK, I'm getting no output when I set it outside, in Form_Load.
ASKER
It works great, thanks for your help. I was missing the Option Explicit.
Glad it works now, thanks for the grade :)
If there are still problems with this, just comment.
ASKER
One last question: what does DoEvents do?
The DoEvents() function yields execution so that the operating system can process other events executing concurrently.
DoEvents yields execution to the operating system so that it can process other events.
So if you are processing long calculations, or something like a download, the processor time slices are shared between the rest of the program and the long operation.
This way the user has the impression that the program keeps responding while the long operation runs in the background, while in reality the two just take turns, each doing a bit of its work in a tit-for-tat sharing of processor time.
ASKER
Oh, so in other words it won't give all of the processor to the current function.
Yes, that's correct.
Concurrently executing applications will be offered their respective timeslice of the multi-tasking environment.
DoEvents (or "Yield") calls were more useful during Windows 3.1 (or earlier) when non-pre-emptive multitasking was used. This approach allocated the machine's CPU to a process (or application) until that process yielded, or completed. It was common for the computer to "freeze" or "crash" because a single process had failed, and hence was not providing the relevant system call to yield so that other applications could be serviced.
However, the current Windows approach of using pre-emptive multitasking enables the operating system to switch between processes (applications/programs) at a pre-defined time interval, preventing any single process from taking complete control of the processor. If a process were to fail, the remaining processes could continue unaffected.
BFN,
fp.
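To make the explanation above concrete, here is a minimal sketch of DoEvents inside a long-running loop. ProcessManyPages is a hypothetical routine invented for illustration; it assumes the strStrip_HTML_Tags function from earlier in this thread is available in the same project, and the interval of 10 items is an arbitrary choice.

```vb
' Sketch only: ProcessManyPages is a hypothetical routine, not from the thread.
' It strips a whole array of pages and yields periodically so that the form
' can repaint and respond to clicks while the loop runs.
Public Sub ProcessManyPages(ByRef astrPages() As String)
    Dim lngIndex As Long

    For lngIndex = LBound(astrPages) To UBound(astrPages)
        astrPages(lngIndex) = strStrip_HTML_Tags(astrPages(lngIndex))

        ' Yield every 10 items; calling DoEvents on every pass would also
        ' work but adds overhead.
        If lngIndex Mod 10 = 0 Then DoEvents
    Next lngIndex
End Sub
```

Since DoEvents lets the user click buttons while the loop is still running, the same routine can be re-entered; disabling the triggering button for the duration is a common precaution.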
You could try to use the XMLHTTP object for this,
something like:
--------------
Sub h()
GetAndSave "https://www.experts-exchange.com"
End Sub
Public Function GetAndSave(sInput As String) As Boolean
Dim objXML As object
Dim fso As Object, fs As Object
Dim strMOIOutput As String
Set objXML = CreateObject("Microsoft.XM
objXML.Open "GET", sInput, False
objXML.send ""
strMOIOutput = objXML.responseText
Set fso = CreateObject("Scripting.Fi
Set fs = fso.CreateTextFile("d:\\Re
fs.write (strMOIOutput)
fs.Close
Set fs = Nothing
Set fso = Nothing
Set objXML = Nothing
End Function
--------------
Sub h is only a test to see if it works.
Hope this helps a bit.
bruintje