Marc Salant
asked on
VBA access to DOM and elements on web page
I am trying to access a webpage using VBA. When I request the page using winhttprequest or msxml2.xmlhttp60, I get the page source, but at that point the JavaScript that produces the elements I need has not run yet. I need the results that are visible in the DOM when I inspect elements in either Chrome or IE, and I can't figure out how to get to this processed page. I have tried using the IE controls in VBA and passing the request through IE, but to no avail.
ASKER
yes, that is correct. I want to read/parse the rendered webpage.
I am hoping to do this using VBA in Excel, but am stuck. I have used the winhttprequest libraries in the past, but they only bring back the raw server response... I tried to use an IE object and XMLhttp, but no luck.
Thanks
While I've seen others use the web browser control to great effect in Excel, I've only used it in Access. Hopefully someone who is familiar with using it in Excel will join in.
In the meantime, I'll render as much assistance as I can.
Start with this:
Public WithEvents wb As WebBrowser
where WebBrowser is the web browser class from the Microsoft Internet Controls library.
Some references you will want to add from your IDE's Tools/References... menu item are:
Microsoft Internet Controls
Microsoft HTML Object Library
Microsoft Scripting Runtime
After you have added the wb object declaration to a form or class module (VBA does not allow WithEvents in a standard module), just type "wb" in a procedure, then type a dot (.) after it, and watch IntelliSense bring up all the properties and methods you can use.
Of particular importance is the "Document" object.
I'm sure you can take it from there...
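As a rough sketch of how the declaration and the Document object fit together: the code below assumes an Access form that hosts a web browser ActiveX control named WebBrowser0 (a hypothetical name) and uses https://example.com/ as a placeholder URL. It reads the DOM once the DocumentComplete event fires for the top-level page.

```vba
' Code behind an Access form that hosts a web browser ActiveX control.
' "WebBrowser0" and the URL are assumptions for illustration only.
Option Explicit

' Requires the Microsoft Internet Controls reference.
Public WithEvents wb As WebBrowser

Private Sub Form_Load()
    ' In Access, the underlying ActiveX object is exposed through .Object.
    Set wb = Me.WebBrowser0.Object
    wb.Navigate "https://example.com/"
End Sub

Private Sub wb_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' The event also fires once per frame; act only on the top-level document.
    If Not (pDisp Is wb) Then Exit Sub

    ' Requires the Microsoft HTML Object Library reference.
    Dim doc As HTMLDocument
    Set doc = wb.Document

    ' The Document object is the live DOM, not the raw server response.
    Debug.Print doc.Title
    Debug.Print Left$(doc.documentElement.outerHTML, 500)
End Sub
```

Note that DocumentComplete only signals that the page itself has loaded. Content that scripts insert afterwards may not be in the DOM yet when the handler runs.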
Also, what's really cool about the web browser control's BeforeNavigate2() event:
Private Sub wb_BeforeNavigate2(ByVal pDisp As Object, ByRef URL As Variant, ByRef Flags As Variant, ByRef TargetFrameName As Variant, ByRef PostData As Variant, ByRef Headers As Variant, ByRef Cancel As Boolean)
is that you can use it to intercept all outgoing URL transmissions to see what's going out, and restrict/block stuff you don't want going out (like cookie info, advertising site URLs, or your browsing history to Google, etc.)!
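A minimal sketch of such a handler, assuming the wb declaration from earlier in this post lives in the same form or class module. The blocked host name is purely hypothetical. As far as I know, the event fires for navigations (the top-level page and frames), so this may not see every individual request a page makes for images, scripts, or background calls.

```vba
' Goes in the same form or class module as "Public WithEvents wb As WebBrowser".
Private Sub wb_BeforeNavigate2(ByVal pDisp As Object, URL As Variant, _
        Flags As Variant, TargetFrameName As Variant, PostData As Variant, _
        Headers As Variant, Cancel As Boolean)

    ' Log every navigation the control is about to make.
    Debug.Print "Navigating to: " & CStr(URL)

    ' Hypothetical block list: cancel anything headed to this host.
    If InStr(1, CStr(URL), "ads.example.com", vbTextCompare) > 0 Then
        Cancel = True   ' setting Cancel stops this navigation
    End If
End Sub
```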
p.s. The web browser control in .Net does NOT have the Navigate2() method, so you can't do the cool stuff you can do with the Office web browser in .Net!
MS Office/Access/VBA/Webbrowser.... YOU ROCK!
ASKER
I am looking into it... thanks for the direction.
ASKER
Sadly, there is a known issue with the web browser object in Excel 2013. I need to edit the registry... ha.
Once I call the webpage, how do I ask for the rendered source code and not the page source?
Thanks
ASKER
Btw, I can probably work in Access; it really doesn't matter, since I am just using the VBA shell, not really the front end.
Take a look at this and see if it doesn't point you in the right direction:
https://docs.microsoft.com/en-us/dotnet/framework/winforms/controls/webbrowser-control-overview
In the WebBrowser control you always access the rendered page through the Document property, which contains the rendered DOM. You might have to "click" buttons, or do whatever else is required to trigger the relevant JavaScript code, before your desired content appears.
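Since script-built content can show up some time after the page reports that it has loaded, one way to get at it is to poll the Document until the element you want exists. This is only a sketch: WaitForElement is a helper name made up for this example, and the element id "results" is a placeholder for whatever id the real page uses.

```vba
Option Explicit

' Hypothetical helper: polls the browser's rendered DOM until an element
' with the given id exists, or until the timeout elapses.
' Returns Nothing on timeout. "browser" can be the wb object from this thread.
Public Function WaitForElement(ByVal browser As Object, ByVal elementId As String, _
        Optional ByVal timeoutSeconds As Double = 15) As Object
    Dim deadline As Double
    deadline = Timer + timeoutSeconds   ' Timer resets at midnight; fine for a sketch

    Do
        DoEvents                        ' let the control keep loading and running scripts
        If browser.ReadyState = 4 Then  ' 4 = READYSTATE_COMPLETE
            If Not browser.Document Is Nothing Then
                Set WaitForElement = browser.Document.getElementById(elementId)
                If Not WaitForElement Is Nothing Then Exit Function
            End If
        End If
    Loop While Timer < deadline
End Function

' Example call; pass in the browser object after calling Navigate on it.
Public Sub PrintRenderedElement(ByVal browser As Object)
    Dim el As Object
    Set el = WaitForElement(browser, "results")
    If el Is Nothing Then
        Debug.Print "Timed out waiting for the element."
    Else
        Debug.Print el.outerHTML        ' markup as rendered, after scripts ran
    End If
End Sub
```

Polling with a timeout is used here because there is no single event that says "all scripts have finished". Checking for the specific element you need is usually a more reliable signal than ReadyState alone.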
ASKER
I continued to look for a solution over the weekend. It does not seem easy to get at the rendered page, and definitely not with the webbrowser objects. Several posts suggest using Selenium, but I looked into that and didn't have great success...
I want to access that rendered DOM page. My selections for the JavaScript controls are all set via the URL, so what I am trying to do is possible; it's just that the page source returned is not the final rendered page I am looking at on the screen.
https://shop.ford.com/showroom/?gnav=header-shop&linktype=inventory#/
Trying to look at cars at the dealerships.
I do it all the time with a web browser control in MS Access as an application.