Solved

How to extract specific headings or links from a webpage using Excel/VBA

Posted on 2015-01-11
14
363 Views
Last Modified: 2015-01-29
How can I extract specific data from websites into an Excel spreadsheet?  

For example, consider the website http://www.asx.com.au/asx/research/company.do#!/BHP

I would like to:

1) Copy to a cell in a spreadsheet (or otherwise extract) the link under the heading "Contact Details" on that webpage (i.e.  http://www.bhpbilliton.com/)

2) Copy to a cell in a spreadsheet (or otherwise extract) the link for the file titled "Annual Report 2014" (displayed under the heading "BHP announcements" after "Financial & Periodic Reports" is selected from the dropdown menu) and then save the linked file to a specified location on my computer.

What is the easiest way to do this using VBA?
0
Comment
Question by:Let's Go
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 3
  • 2
  • +1
14 Comments
 
LVL 33

Expert Comment

by:Rob Henson
ID: 40544204
If the data on the web page is stored as a table within the webpage, you can create a web query in Excel to extract the table of data.

I just tried but unfortunately my security settings or version of browser won't let me see the web-page within the Web Query window in the same way as I can just see the page in my ordinary browser.

To create the web query, open a new sheet and on the Data tab select the "From Web" button on the left hand end of the ribbon. This opens up a browser like window within Excel. Enter the web address at the top of the window and the web-page will be displayed. Anything that is a data table will have a yellow arrow next to it, click the arrow and it will go Green.

For the table of Annual Report links, hopefully the URL for the PDF file will come down with the data table so clicking on the link will then open the PDF.

Thanks,
Rob H
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40547899
Thanks for your response.

Unfortunately the data I need does not have any yellow arrows identifying data tables next to the information I need (although the advertisements do!)

Any other ideas?
0
 
LVL 33

Expert Comment

by:Rob Henson
ID: 40548496
Sorry, don't have sufficient knowledge in other methods to advise.

Thanks
Rob
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40557263
You should be able to import it in its entirety, and use formulas to locate that information.
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40559123
Do you mean opening the webpage from File/Open?  Unfortunately, that does not help with my question.
0
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40559370
No, I mean opening it from Data - Get External Data - New Web Query.
0
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40560246
Hi,
attached an Excel file. The first part is already working but I have to adjust the second one.
In general I use the XMLHttp object and directly request the Ajax data which are requested by the web page itself. Then I use a nifty library to parse the JSON string (https://code.google.com/p/vba-json/).

I will work on the second part but this might take another day.

HTH
Rainer
json.xls
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40561138
Thanks for your help, it looks like your work is going to be really helpful.
0
 
LVL 44

Accepted Solution

by:
Rainer Jeschor earned 500 total points
ID: 40562046
Hi,
and here the working final version.

HTH
Rainer
json.xls
0
 
LVL 6

Author Closing Comment

by:Let's Go
ID: 40563332
That's fantastic.  Thanks for such a quick response.
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40576795
Hi, Rainer

I have another question.  How do you find which Ajax data are requested by the web page itself?

In other words, cell D5 of your spreadsheet lists the Request URL of http://data.asx.com.au/data/1/company/BHP/announcements?callback=angular.callbacks._8&count=10&market_sensitive=false&type=03.  How do you derive that URL from the webpage mentioned in my original post?
0
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40576814
Hi,
that looks like a new question with new points :-)
Just kidding.
You can use any browser which has an extension to track the web page requests. Normally I prefer chrome but IE 11 has also decent developer tool bar.
In Chrome open the built-in developer extensions, and go to the tab network. Then simply reload/open the page and you can see all requests (images, script files and ajax calls)
HTH
Rainer
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40577186
Thanks.

Here are some extra points....................... ;-)
0

Featured Post

Revamp Your Training Process

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article descibes how to create a connection between Excel and SAP and how to move data from Excel to SAP or the other way around.
With User Account Control (UAC) enabled in Windows 7, one needs to open an elevated Command Prompt in order to run scripts under administrative privileges. Although the elevated Command Prompt accomplishes the task, the question How to run as script…
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

623 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question