• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1935

WebBrowser1.Document.body.innerhtml

When I right-click WebBrowser1 and choose "View Source", why is the result different from:

Text2 = WebBrowser1.Document.body.innerhtml
Asked by: hrolsons
1 Solution

darbid73 Commented:
By definition, you are looking only at the BODY, and only at the innerHTML of that body.

Very generally, the structure of a web page looks like this, and you are getting only the body part. Also, any scripts and the like will usually not be in the body either.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>My first HTML document</TITLE>
   </HEAD>
   <BODY>
      <P>Hello world!
   </BODY>
</HTML>

hrolsons (Author) Commented:
So how do I grab the stuff before <BODY>?

darbid73 Commented:
Depending on what "the stuff before <BODY>" is, try:

Text2 = WebBrowser1.document.documentElement.outerHTML

hrolsons (Author) Commented:
Still not grabbing what I need.  On the following page:

http://www.loc.gov/pictures/collection/fsa/item/fsa1997000505/PP/

I need to grab the part that says "type='application/json'"

The only way I can get it to show is to manually right click and choose "View Page Source"
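
One likely reason for the mismatch: "View Page Source" shows the markup as it was downloaded, while innerHTML and outerHTML are regenerated from the parsed document, so quoting, case, and whitespace can differ. If the raw markup itself is what is needed, a minimal VB6 sketch is to request the page directly over HTTP. GetRawSource is a hypothetical helper name, and the sketch assumes MSXML is installed on the machine; it has no error handling.

```vb
' Sketch only: fetches the raw markup over HTTP instead of reading it from the DOM.
' GetRawSource is a hypothetical helper; it late-binds MSXML2.XMLHTTP,
' so no extra project reference is needed.
Public Function GetRawSource(ByVal pageUrl As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP")

    http.Open "GET", pageUrl, False   ' False = synchronous request
    http.send

    ' Returns "" unless the server answered 200 OK
    If http.Status = 200 Then
        GetRawSource = http.responseText
    End If
End Function
```

Usage might look like Text2 = GetRawSource("http://www.loc.gov/pictures/collection/fsa/item/fsa1997000505/PP/"). Note that the server may not send exactly the same markup to this request as it sends to the WebBrowser control.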

darbid73 Commented:
I am guessing here, but is your real question how to get the link contained in this element:

<link rel='alternate'
              type='application/json'
              href='http://www.loc.gov/pictures/collection/fsa/item/fsa1997000505/PP/?fo=json'/>

that is, how to get

http://www.loc.gov/pictures/collection/fsa/item/fsa1997000505/PP/?fo=json

out of this web page: http://www.loc.gov/pictures/collection/fsa/item/fsa1997000505/PP/ ?

hrolsons (Author) Commented:
Yes, in the end, that is what I want.  I just use the first part to get me close to the correct link.

darbid73 Commented:
I am not on a PC, so I cannot test this. In general, both of these get you closer to what you want.

Text2 = WebBrowser1.Document.getElementsByTagName("head").Item(0).outerHTML

Counting from 0 for the first link, this gives you the link at index m_Count:
WebBrowser1.Document.links.Item(m_Count)

 
hrolsons (Author) Commented:
Is that VB6 code, or VB.NET?  Can't seem to get it working on VB6.

darbid73 Commented:
Here you go. Here are two ways to get what you want; you will have to decide which one to use based on how the owner of the website changes things.

Dim header As IHTMLElement
Dim headerLinks As IHTMLLinkElement

Set header = WebBrowser1.Document.getElementsByTagName("head").Item(0)

'The link href you want is the 4th link (index 3), so this gets it, but if they add a link before it then you will get the wrong one
Debug.Print header.getElementsByTagName("link").Item(3).href

'This loops through them all and checks for something special in the link href
For Each headerLinks In header.getElementsByTagName("link")
    If InStr(headerLinks.href, "fo=json") > 0 Then
        Debug.Print headerLinks.href
    End If
Next headerLinks

 
hrolsons (Author) Commented:
Compile error:  User-defined type not defined on:

Dim header As IHTMLElement

Dymer2 Commented:
Hi,
The difference is in the quotes: they appear in "View Source" but not in the outerHTML. If you need to investigate further, add a TextBox to your form and put the text in it; then you can see where the differences are.
Below I have attached several ways to get the text between known tags without using getElementsByTagName.
I hope this helps.
Cheers,
Dymer

    Dim HTML As String
    Dim i As Long
    Dim j As Long
    Dim SearchStr As String
    Dim tempStr As String
    Dim tempLength As Integer
    Dim tempNumber As Integer
    HTML = webMovie.Document.documentelement.outerhtml
'----------- Year
    SearchStr = "<p><b>Release Year</b><br>"
    i = InStr(1, HTML, SearchStr, vbTextCompare)
    If i > 0 Then
        i = i + Len(SearchStr)
        tempNumber = Val(Mid(HTML, i, 4))
        If (tempNumber > 1900) And (tempNumber < 2100) Then LocalMovie.Year = tempNumber
    End If
'----------- PG
    SearchStr = "<p><b>PG rating</b><br>"
    i = InStr(1, HTML, SearchStr, vbTextCompare)
    If i > 0 Then
        i = i + Len(SearchStr)
        j = InStr(i, HTML, "</p>", vbTextCompare)
        If j > 0 Then   'skip if the closing tag is missing; Mid would fail on a negative length
            tempStr = Mid(HTML, i, j - i)
            If tempStr = "<BR>" Then tempStr = "N/A"
            tempLength = Len(tempStr)
            If (tempLength > 1) And (tempLength < 30) Then LocalMovie.PG = tempStr
        End If
    End If
'----------- Genre
    SearchStr = "<P><B>Genre</B><BR>"
    i = InStr(1, HTML, SearchStr, vbTextCompare)
    If i > 0 Then
        i = i + Len(SearchStr)
        j = InStr(i, HTML, "<br>", vbTextCompare)
        If j > 0 Then   'skip if the terminating <br> is missing
            tempStr = Mid(HTML, i, j - i)
            LocalMovie.Genre = tempStr
        End If
    End If
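
The InStr/Mid pattern repeated in the code above could be factored into one small function. This is only a sketch: TextBetween is a hypothetical helper, not part of the posted code, and it assumes a case-insensitive match is wanted, as in the original searches.

```vb
' Hypothetical helper: returns the text between startTag and endTag,
' or "" when either marker is not found.
Public Function TextBetween(ByVal html As String, ByVal startTag As String, ByVal endTag As String) As String
    Dim i As Long
    Dim j As Long

    ' Locate the start marker (case-insensitive)
    i = InStr(1, html, startTag, vbTextCompare)
    If i = 0 Then Exit Function

    ' Move past the start marker, then look for the end marker
    i = i + Len(startTag)
    j = InStr(i, html, endTag, vbTextCompare)
    If j = 0 Then Exit Function

    TextBetween = Mid$(html, i, j - i)
End Function
```

With it, the genre lookup might be written as LocalMovie.Genre = TextBetween(HTML, "<P><B>Genre</B><BR>", "<br>"). Returning an empty string for a missing marker avoids the runtime error that Mid raises when it is given a negative length.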

 
darbid73 Commented:
Either declare them as Object:

Dim header As Object
Dim headerLinks As Object

or add a reference to the Microsoft HTML Object Library (Project > References).

hrolsons (Author) Commented:
Need a "Next" in the code, but other than that, it works.
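
Putting the pieces of this thread together (Object declarations, the missing Next, and the "fo=json" check), a late-bound version might look like the sketch below. It assumes a form with a WebBrowser control named WebBrowser1, and it runs the lookup from the DocumentComplete event so that the document has finished loading first; that placement is an assumption, not something stated in the thread.

```vb
' Sketch only: assumes a form with a WebBrowser control named WebBrowser1.
Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    Dim header As Object
    Dim headerLink As Object

    ' DocumentComplete also fires for frames; act only on the top-level document
    If Not (pDisp Is WebBrowser1.Object) Then Exit Sub

    Set header = WebBrowser1.Document.getElementsByTagName("head").Item(0)
    If header Is Nothing Then Exit Sub

    ' Print the first <link> whose href contains "fo=json"
    For Each headerLink In header.getElementsByTagName("link")
        If InStr(1, headerLink.href, "fo=json", vbTextCompare) > 0 Then
            Debug.Print headerLink.href
            Exit For
        End If
    Next headerLink
End Sub
```

Because everything is declared As Object, this compiles without a reference to the Microsoft HTML Object Library, at the cost of IntelliSense and compile-time checking.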