JPERKS1985
asked on
Make sure document is downloaded before parsing out elements
I had a problem that took me a while to track down: my program starts running a regex over a document to extract text before the document has completely downloaded. It kept extracting only half of the text correctly and dropping the other half until I ran the function again. I used Len on the document HTML and found that the two results were dramatically different in size.
Dim EventUrl As String = "http://www.whatismyip.com"
Dim sResult As String
Dim oHttp As HttpWebRequest
Dim objResponse As WebResponse
Dim objRequest As WebRequest = System.Net.HttpWebRequest.Create(EventUrl)
Dim ProxyAddress As String = "XXX.XXX.XX.X"
Dim ProxyPort As Integer = 8088
Dim oProxy As New WebProxy(ProxyAddress, ProxyPort)
objRequest.Proxy = oProxy
objRequest.Method = "GET"
objRequest.Timeout = 1200000 ' milliseconds: 1,200,000 ms is 20 min, not 20 sec
objResponse = objRequest.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(objResponse.GetResponseStream(), System.Text.Encoding.UTF7)
sResult = sr.ReadToEnd()
sr.Close()
Dim sErr As String
Dim FullTempFilePath As String = "C:\TJYTEMP\MAINSEL" & Rnd(500) & Rnd(350) & "TEMP.html"
SaveTextToFile(sResult, FullTempFilePath, sErr)
Debug.Write("FILE WRITE ERROR: " & sErr.ToString)
Dim doc As New HtmlDocument(FullTempFilePath)
EStatusLabel.Text = "Digesting"
Here's what the above code does. I have to make an HTTP request through a proxy, so I make the request with the proxy and then save the HTML to a file. This is where I think my problem lies: it must not be waiting for the whole HTML document to come in before saving. Then I use "Dim doc As New HtmlDocument(FullTempFilePath)" to create a doc from the file I just saved. I need to do this so I can use mshtml to parse out the elements in the document. I know this must be an easy fix. 500 points because I need to have the skeleton of this program developed in 2 weeks! :D
ASKER
Hmm, I don't think it'll work, because the file will exist, just not in its full form. Any way to make it wait until the file is fully written?
The file will not exist until it is fully written--try it.
Bob
ASKER
Didn't work :(. It's as if my code keeps writing over the file it created until it's full? I put MsgBox(Len(Dochtml)) in the part that parses it, and when it finds the remaining elements the Len jumps up.
ASKER
Can anything be done with objResponse.ContentLength, even though I'm not sure what the full length will be?
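On the ContentLength question, here is a minimal sketch of one way such a check could look. It is not code from the thread: DownloadBytes and the DownloadCheck module are hypothetical names, and it assumes .NET Framework 4.0 or later for Stream.CopyTo. It reads the whole body (the copy blocks until the server ends the stream, as ReadToEnd does) and compares the byte count with ContentLength, which is -1 when the server did not report a length.

```vb
Imports System.IO
Imports System.Net

Module DownloadCheck
    ' Hypothetical helper (not from the thread): downloads the body of url,
    ' optionally through a proxy, and verifies it against Content-Length.
    Function DownloadBytes(url As String, proxy As IWebProxy) As Byte()
        Dim request As WebRequest = WebRequest.Create(url)
        If proxy IsNot Nothing Then request.Proxy = proxy
        request.Method = "GET"
        request.Timeout = 20000 ' milliseconds, i.e. 20 seconds

        Using response As WebResponse = request.GetResponse()
            Using body As Stream = response.GetResponseStream()
                Using buffer As New MemoryStream()
                    ' Blocks until the end of the response stream is reached.
                    body.CopyTo(buffer)

                    ' ContentLength is -1 when the server sent no length.
                    Dim expected As Long = response.ContentLength
                    If expected >= 0 AndAlso buffer.Length <> expected Then
                        Throw New IOException(
                            "Expected " & expected & " bytes, got " & buffer.Length)
                    End If

                    Return buffer.ToArray()
                End Using
            End Using
        End Using
    End Function
End Module
```

A call such as DownloadBytes(EventUrl, oProxy) would either return the complete body or throw, so a short download cannot silently reach the parsing step. The comparison is in bytes, so it should be made before decoding the text.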
ASKER
While m_document Is Nothing OrElse m_document.readyState <> "complete"
Application.DoEvents()
End While
is what I used; it seems to work well.
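A loop like this never exits if the document never reaches "complete". As a hedged sketch only, the same wait could be wrapped with a timeout. WaitUntil and the WaitHelpers module are hypothetical names, not part of the thread or of any library, and the sketch assumes a Windows Forms application.

```vb
Imports System
Imports System.Diagnostics
Imports System.Threading
Imports System.Windows.Forms

Module WaitHelpers
    ' Hypothetical helper: pumps the message loop until condition() is True
    ' or the timeout elapses. Returns False if it gave up.
    Function WaitUntil(condition As Func(Of Boolean), timeout As TimeSpan) As Boolean
        Dim watch As Stopwatch = Stopwatch.StartNew()
        While Not condition()
            If watch.Elapsed > timeout Then Return False
            Application.DoEvents() ' keep the UI and the document loader responsive
            Thread.Sleep(10)       ' avoid spinning the CPU at 100%
        End While
        Return True
    End Function
End Module
```

With that in place, the wait above might be written as WaitUntil(Function() m_document IsNot Nothing AndAlso m_document.readyState = "complete", TimeSpan.FromSeconds(30)), with the caller deciding what to do when it returns False.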
If it is true, then you could do something like this:
While Not System.IO.File.Exists(Full
Application.DoEvents()
End While
Bob
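Polling File.Exists only proves the file is complete if the file appears under its final name after writing has finished. One common way to get that behaviour, sketched here under assumptions, is to write to a temporary name and rename it at the end. SaveCompleteFile, the AtomicSave module and the ".part" suffix are hypothetical; the asker's SaveTextToFile is not shown in the thread, so this is not a description of it.

```vb
Imports System.IO
Imports System.Text

Module AtomicSave
    ' Hypothetical helper: the file at finalPath only appears once all of
    ' the text has been written and the temporary file has been closed.
    Sub SaveCompleteFile(text As String, finalPath As String)
        Dim partialPath As String = finalPath & ".part"

        ' WriteAllText returns only after the data is written and the file closed.
        File.WriteAllText(partialPath, text, Encoding.UTF8)

        ' File.Move fails if the destination exists, so clear it first.
        If File.Exists(finalPath) Then File.Delete(finalPath)

        ' On the same volume this is typically a rename rather than a copy.
        File.Move(partialPath, finalPath)
    End Sub
End Module
```

A reader that waits on File.Exists(finalPath) would then not see a half-written file, since the partial data lives under the ".part" name until the move.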