wildcard76 asked:

How to convert httpWebResponse into mshtml.HTMLDocument

Hi everybody,

I have something like

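' Requires: Imports System.IO and Imports System.Net
Private doc As New mshtml.HTMLDocument

' Download the page at v_crawlURL and read its HTML into a string
Dim URLReq As HttpWebRequest = DirectCast(WebRequest.Create(v_crawlURL), HttpWebRequest)
Dim URLRes As HttpWebResponse = DirectCast(URLReq.GetResponse(), HttpWebResponse)
Dim sStream As Stream = URLRes.GetResponseStream()
Dim sr As New StreamReader(sStream)
Dim cont As String = sr.ReadToEnd()

' Release the reader and the connection once the content has been read
sr.Close()
URLRes.Close()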

Now I have the content of the page in the cont variable, but I'm stuck here. How do I get it into an mshtml.HTMLDocument object?

Should not be that hard...


Thanks in advance

Regards
ASKER CERTIFIED SOLUTION
Bob Learned

ASKER

Hi,

The class seems really solid and substantial, but when I try to do something like

        Dim ht As New HtmlDocument("http://www.google.com")
        Dim a As HtmlAnchor
        For Each a In ht.Anchors
            MessageBox.Show(a.HRef)
        Next

the Anchors property always comes back empty, regardless of the URL I provide.

What might I be doing wrong?

regards



Actually, the Images property is also empty...

Raising the points, by the way...

regards
OK, when I debug and examine the document after m_document = DirectCast(doc.createDocumentFromUrl(m_url, vbNullString), mshtml.HTMLDocument) is executed, I notice the errors below...

      baseUrl      <error: an exception of type: {System.NotImplementedException} occurred>      String
      enableDownload      <error: an exception of type: {System.NotImplementedException} occurred>      Boolean
      frames      <error: an exception of type: {System.InvalidCastException} occurred>      mshtml.FramesCollection
      IHTMLDocument2_frames      <error: an exception of type: {System.InvalidCastException} occurred>      mshtml.FramesCollection
      IHTMLDocument2_location      <error: an exception of type: {System.InvalidCastException} occurred>      mshtml.HTMLLocation
      IHTMLDocument2_parentWindow      <error: an exception of type: {System.InvalidCastException} occurred>      mshtml.IHTMLWindow2
      IHTMLDocument2_Script      <error: an exception of type: {System.InvalidCastException} occurred>      Object

and several others...

      title      "Google"      String

is present as well, which means the sub actually connects to the URL and receives some data...

regards

.NET version?

Bob
1.1
1.1.4322, to be precise
What type of application are you running from?  WinForms?  ASP.NET?

Bob
It is a WinForms application...
I've experienced something like this before: while I was using an AxWebBrowser control on a form, calling the Navigate2 method on the control while the form was not visible gave me a COM invalid-state exception, which I corrected by simply showing the form before navigating. Can it be something similar, because there are no visible controls etc. here? I know it's a long shot :)
Found it...

Dim oDoc As New mshtml.HTMLDocument
Dim iDoc As mshtml.IHTMLDocument2 = oDoc

'write to the IHTMLDocument2
iDoc.write(cont)
iDoc.close()

'get it back to an HTMLDocument
oDoc = iDoc

does the trick....

thanks for the help...
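
For anyone landing here later, the download from the question and the write/close step above can be put together roughly as follows. This is only a sketch, not tested against every page: the class name PageLoader and the function LoadDocument are made up for illustration, and it assumes a project reference to the Microsoft.mshtml interop assembly. It sticks to syntax available in .NET 1.1 (no Using blocks).

```vb
Imports System.IO
Imports System.Net

Public Class PageLoader
    ' Hypothetical helper (not from the thread): downloads the page at url
    ' and returns it parsed into an mshtml document.
    Public Shared Function LoadDocument(ByVal url As String) As mshtml.IHTMLDocument2
        ' Download the raw HTML as a string
        Dim req As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        Dim res As HttpWebResponse = DirectCast(req.GetResponse(), HttpWebResponse)
        Dim sr As New StreamReader(res.GetResponseStream())
        Dim cont As String
        Try
            cont = sr.ReadToEnd()
        Finally
            ' Always release the reader and the connection
            sr.Close()
            res.Close()
        End Try

        ' Write the HTML into a fresh document through IHTMLDocument2
        Dim oDoc As New mshtml.HTMLDocument
        Dim iDoc As mshtml.IHTMLDocument2 = DirectCast(oDoc, mshtml.IHTMLDocument2)
        iDoc.write(cont)
        iDoc.close()
        Return iDoc
    End Function
End Class

Public Module Demo
    Public Sub Main()
        Dim doc As mshtml.IHTMLDocument2 = PageLoader.LoadDocument("http://www.google.com")

        ' links holds the elements that carry an href; list the <a> ones
        Dim el As mshtml.IHTMLElement
        For Each el In doc.links
            If TypeOf el Is mshtml.IHTMLAnchorElement Then
                Console.WriteLine(DirectCast(el, mshtml.IHTMLAnchorElement).href)
            End If
        Next
    End Sub
End Module
```

Two things to keep in mind with this approach: the document is built from a string rather than navigated to, so relative links will probably not resolve against the original site, and on some pages you may need to wait for the document's readyState to reach "complete" before reading the collections.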
OK, so next time I'll ask whether you want to use the WebBrowser control.  ;)  What I posted was code that (if it worked) needed only a URL, and no ActiveX control, running through COM interoperability.

Bob