Solved

How to extract specific headings or links from a webpage using Excel/VBA

Posted on 2015-01-11
14
240 Views
Last Modified: 2015-01-29
How can I extract specific data from websites into an Excel spreadsheet?  

For example, consider the website http://www.asx.com.au/asx/research/company.do#!/BHP

I would like to:

1) Copy to a cell in a spreadsheet (or otherwise extract) the link under the heading "Contact Details" on that webpage (i.e.  http://www.bhpbilliton.com/)

2) Copy to a cell in a spreadsheet (or otherwise extract) the link for the file titled "Annual Report 2014" (displayed under the heading "BHP announcements" after "Financial & Periodic Reports" is selected from the dropdown menu) and then save the linked file to a specified location on my computer.

What is the easiest way to do this using VBA?
0
Comment
Question by:Let's Go
  • 6
  • 3
  • 2
  • +1
14 Comments
 
LVL 31

Expert Comment

by:Rob Henson
ID: 40544204
If the data on the web page is stored as a table within the webpage, you can create a web query in Excel to extract the table of data.

I just tried but unfortunately my security settings or version of browser won't let me see the web-page within the Web Query window in the same way as I can just see the page in my ordinary browser.

To create the web query, open a new sheet and on the Data tab select the "From Web" button on the left hand end of the ribbon. This opens up a browser like window within Excel. Enter the web address at the top of the window and the web-page will be displayed. Anything that is a data table will have a yellow arrow next to it, click the arrow and it will go Green.

For the table of Annual Report links, hopefully the URL for the PDF file will come down with the data table so clicking on the link will then open the PDF.

Thanks,
Rob H
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40547899
Thanks for your response.

Unfortunately the data I need does not have any yellow arrows identifying data tables next to the information I need (although the advertisements do!)

Any other ideas?
0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 40548496
Sorry, don't have sufficient knowledge in other methods to advise.

Thanks
Rob
0
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40557263
You should be able to import it in its entirety, and use formulas to locate that information.
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40559123
Do you mean opening the webpage from File/Open?  Unfortunately, that does not help with my question.
0
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40559370
No, I mean opening it from Data - Get External Data - New Web Query.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40560246
Hi,
attached an Excel file. The first part is already working but I have to adjust the second one.
In general I use the XMLHttp object and directly request the Ajax data which are requested by the web page itself. Then I use a nifty library to parse the JSON string (https://code.google.com/p/vba-json/).

I will work on the second part but this might take another day.

HTH
Rainer
json.xls
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40561138
Thanks for your help, it looks like your work is going to be really helpful.
0
 
LVL 44

Accepted Solution

by:
Rainer Jeschor earned 500 total points
ID: 40562046
Hi,
and here the working final version.

HTH
Rainer
json.xls
0
 
LVL 6

Author Closing Comment

by:Let's Go
ID: 40563332
That's fantastic.  Thanks for such a quick response.
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40576795
Hi, Rainer

I have another question.  How do you find which Ajax data are requested by the web page itself?

In other words, cell D5 of your spreadsheet lists the Request URL of http://data.asx.com.au/data/1/company/BHP/announcements?callback=angular.callbacks._8&count=10&market_sensitive=false&type=03.  How do you derive that URL from the webpage mentioned in my original post?
0
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40576814
Hi,
that looks like a new question with new points :-)
Just kidding.
You can use any browser which has an extension to track the web page requests. Normally I prefer chrome but IE 11 has also decent developer tool bar.
In Chrome open the built-in developer extensions, and go to the tab network. Then simply reload/open the page and you can see all requests (images, script files and ajax calls)
HTH
Rainer
0
 
LVL 6

Author Comment

by:Let's Go
ID: 40577186
Thanks.

Here are some extra points....................... ;-)
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Suggested Solutions

A little background as to how I came to I design this code: Around 5 years ago I designed an add-in that formatted Excel files to a corporate standard, applying different cell colours and font type depending on whether the cells contained inputs,…
I found this questions asking how to do this in many different forums, so I will describe here how to implement a solution using PHP and AJAX. The logical flow for the problem should be: Write an event handler for the first drop down box to get …
The viewer will learn the benefit of using external CSS files and the relationship between class and ID selectors. Create your external css file by saving it as style.css then set up your style tags: (CODE) Reference the nav tag and set your prop…
Graphs within dashboards are meant to be dynamic, representing data from a period of time that will change each time the dashboard is updated with new data. Rather than update each graph to point to a different set within a static set of data, t…

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now