• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3498

Copy displayed web page text with VBA

I need to copy the displayed text from a web page into a string.  The text is generated by the web page and is not available directly from the source.  Using VBA, I need the same result I would get by manually selecting all the text on the page, copying it to the clipboard, and pasting it into Notepad.  I am using MS Office 2014 (Access).
2 Solutions
ste5an (Senior Developer) Commented:
You need to automate your web browser to copy the text from the DOM.
riverguy (Author) Commented:
ste5an, thanks for the answer.  I'm not familiar with using the DOM.  Would you mind giving me an example?  The information I want is not available in the source HTML, but is produced by JavaScript embedded within it.  The web page is a movie schedule at http://www.fandango.com/theaterpage-prn.aspx?tid=AAWVN&date=12%2f4%2f2014 where the date elements are included in the URL.  I want to copy the output string.
You can get the HTML quite easily with the MSXML2.XMLHTTP object.  What are you doing with the "text" on the page?
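As a rough illustration of that suggestion, here is a minimal sketch of fetching a page's HTML with MSXML2.XMLHTTP. The function name GetPageHtml and the error number are made up for this example; the object is late bound, so no library reference should be needed. It only returns what the server sends, so anything a script adds after the page loads will not be in the result.

```vba
' Hypothetical helper (name is illustrative): returns the raw HTML of a page.
Public Function GetPageHtml(ByVal url As String) As String
    Dim http As Object

    ' Late binding, so no reference to Microsoft XML is required
    Set http = CreateObject("MSXML2.XMLHTTP")

    ' Third argument False = synchronous; Send returns once the response has arrived
    http.Open "GET", url, False
    http.send

    If http.Status = 200 Then
        GetPageHtml = http.responseText
    Else
        ' Arbitrary user-defined error number chosen for this sketch
        Err.Raise vbObjectError + 513, "GetPageHtml", _
                  "HTTP " & http.Status & " " & http.statusText
    End If
End Function
```

It could be called from the Immediate window with something like Debug.Print Left$(GetPageHtml("http://www.fandango.com/theaterpage-prn.aspx?tid=AAWVN&date=12%2f4%2f2014"), 500) to look at the start of the response.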
riverguy (Author) Commented:
I was mistaken: the text I'm looking for is available in the source of the web page, and I can get it as HTML from .responseText of an MSXML2 object, as you suggested.  I'll still have to work out how to pull the data from the HTML.  Is there an object property that represents the output text, the same text one would get by selecting the entire contents of the web page and copying them to the clipboard, or will it have to be parsed out node by node?  I can probably take it from here.  Thanks for the help.  I'm open to any suggestions for extracting the output text.
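To get the entire innerText of a web page, use the "Microsoft HTML Object Library":
'add a reference to Microsoft WinHTTP Services, version 5.1 (or 5.0 on older systems)
'add a reference to Microsoft HTML Object Library

Private Sub CommandButton1_Click()
    Dim Req As New WinHttpRequest
    Dim RT As String
    Dim Doc As HTMLDocument

    'Request the page; Send must be called before ResponseText is available
    Req.Open "GET", "http://en.wikipedia.org/wiki/Rashmi_Gautam"
    Req.Send
    RT = Req.ResponseText

    'Load the HTML into an in-memory document; CallByName works around
    'problems with calling the early-bound Write method directly
    Set Doc = New HTMLDocument
    CallByName Doc, "Write", VbMethod, RT

    'innerText is the text of the body with the markup removed
    MsgBox Doc.body.innerText
End Sub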
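If adding the two references is inconvenient, the same idea can probably be done late bound. The sketch below is only an illustration, assuming the "htmlfile" and "MSXML2.XMLHTTP" ProgIDs are registered on the machine (they normally are on Windows); the function name GetPageText is made up for this example. The text it returns may differ slightly in whitespace from what a manual select-all and copy in a browser gives.

```vba
' Hypothetical helper (name is illustrative): returns the text of a page
' with the markup removed, without requiring any library references.
Public Function GetPageText(ByVal url As String) As String
    Dim http As Object
    Dim doc As Object

    ' Fetch the HTML synchronously
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False
    http.send

    ' "htmlfile" creates an in-memory HTML document object
    Set doc = CreateObject("htmlfile")
    doc.write http.responseText
    doc.Close

    GetPageText = doc.body.innerText
End Function
```

Usage would be along the lines of Dim s As String: s = GetPageText("http://en.wikipedia.org/wiki/Rashmi_Gautam"), after which s can be searched with InStr or split into lines as needed.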

riverguy (Author) Commented:
Thanks, that did it.  My mistake was in not recognizing that the information was in the HTML.
Question has a verified solution.