Solved

Excel vba to retrieve mutiple data from web site

Posted on 2010-08-29
24
727 Views
Last Modified: 2013-02-05
I am looking for vba coding to add to attached Excel file with vba. I want to be able to click the "Get Date" button and take the symbols beginning in column A row 5 of the spreadsheet and go to web site to find associated dates. Then I want vba to retrieve the date and place it next to symbol in spreadsheet. I am attaching the Excel file for use to add the necessary code. I used IE8 and searched for the date, 23-NOV-10, and found it but do not know how to write the vba code to retrieve it. I want the vba code go down the column and retrieve the date for each symbol shown. The number of symbols in column "A" can be upwards or 300 plus.

Once in a while I get and error, but I think it is because I close the web site before the coding completes its iterations.
 
File is Excel 2007

Thanks in advance for your help!
 Test-Retrieval.xlsm
0
Comment
Question by:sumboddie
  • 11
  • 11
  • 2
24 Comments
 
LVL 33

Expert Comment

by:Norie
ID: 33557531
What data do you want to get for each ticker?

By the way is there a page where you can enter the symbol and then
initiate the search via a button or something?

That might be easier to work with rather than a separate URL for each syymbol.

ie navigate to search page, insert symbol, do search, get results.
0
 
LVL 17

Expert Comment

by:calacuccia
ID: 33557898
Hi

90        Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop

        htmlStr = ie.document.body.innerHTML
     
        pointer = InStr(1, htmlStr, "Future Events")
        pointer = InStr(pointer, htmlStr, "vAlign")
        htmlStr = Mid(htmlStr, pointer + 18, 9)
        Range("B5").Value = htmlStr

100       Set doc = ie.document

The additional lines between 90 & 100 passed the test on your first sample (A5), did not test it further, but I think it should work for all.
Sub Test()

          

          Dim ie As Object

          Dim doc As HTMLDocument

          Dim PageForm As HTMLFormElement

          Dim FormButton As HTMLInputButtonElement

          Dim Elem As IHTMLElement

          Dim xURL As String

          Dim xSym As String

          Dim htmlStr As String

          Dim pointer As Long, spointer As Long

         

          

10        Sheets("Test").Activate

20        xURL = Range("A2").Value

30        xSym = Range("A6").Value

          

40        xURL = xURL + xSym

          

50       On Error GoTo Test_Error



60        Set ie = CreateObject("InternetExplorer.Application")

          

70        ie.Visible = True

80        ie.navigate xURL

          

          'Wait for initial page to load

          

90        Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop



        htmlStr = ie.document.body.innerHTML

     

        pointer = InStr(1, htmlStr, "Future Events")

        pointer = InStr(pointer, htmlStr, "vAlign")

        htmlStr = Mid(htmlStr, pointer + 18, 9)

        Range("B6").Value = htmlStr



100       Set doc = ie.document

          

110      On Error GoTo 0

120      Exit Sub



Test_Error:



130       MsgBox "Error " & Err.Number & " (" & Err.Description & ") at line # " & Erl & " in procedure Test of VBA Document Sheet5"

         

End Sub

Open in new window

0
 
LVL 17

Expert Comment

by:calacuccia
ID: 33557917
By the way, I just see imnorie's comment, of course, you should keep your same instance throughout the record, but I figured that you were well aware of that and your main problem right now was to retrieve the data you need.
0
 
LVL 33

Expert Comment

by:Norie
ID: 33558274
calacuccia

I wasn't specifically thinking about the multiple instances thing, I think anyway.

Just wondering if there was a search page/form, and there appears to be.

I also wanted to know what data the OP wanted to get.

There seems to be a lot more than the Future Events date.

By the way all the results seem to be in a div with the ID 'ResultsDiv'.

It has 2 tables, one for future events and one a sort of archive.

Update - can't seem to find out how to get the search to work.

Anyway here's the code I tried, which doesn't error but doesn't return the required results.
'Need reference to Microsoft HTML Object Library.  Select this in Tools - References in VB editor.



Option Explicit



Public Enum IE_READYSTATE

     Uninitialised = 0

     Loading = 1

     Loaded = 2

     Interactive = 3

     Complete = 4

 End Enum



Sub Test()



Dim ie As Object



Dim doc As HTMLDocument

Dim divRslts As HTMLDivElement

Dim PageForm As HTMLFormElement

Dim valSearch As HTMLInputElement

Dim Elem As IHTMLElement

Dim xURL As String







  



    xURL = Sheets("Test").Range("A2").Value



    Set ie = CreateObject("InternetExplorer.Application")



    ie.Visible = True

    ie.navigate xURL



    'Wait for initial page to load



    Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop



    Set doc = ie.document

    

    Set PageForm = doc.forms("form1")

    

    Set valSearch = doc.getElementById("SearchTicker")

    

    valSearch.Value = Sheets("Test").Range("A5").Value

    

    PageForm.submit

    

    Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop

    

    Set divRslts = doc.getElementById("ResultsDiv")

    

    GetAllTables divRslts.document

    

    ie.Quit



End Sub



Sub GetAllTables(d)

Dim ws As Worksheet

Dim rng As Range

Dim t As HTMLTable

Dim r As HTMLTableRow

Dim c As HTMLTableCell

Dim tabno As Long

Dim nextrow As Long

Dim I As Long



    Set ws = Worksheets.Add



    For Each t In d.getElementsByTagName("TABLE")



        tabno = tabno + 1

        nextrow = nextrow + 1

        Set rng = ws.Range("B" & nextrow)

        rng.Offset(, -1) = "Table " & tabno

        For Each r In t.Rows

            For Each c In r.Cells

                rng.Value = c.outerText

                Set rng = rng.Offset(, 1)

                I = I + 1

            Next c

            nextrow = nextrow + 1

            Set rng = rng.Offset(1, -I)

            I = 0

        Next r



    Next t



    ws.Cells.ClearFormats

End Sub

Open in new window

0
 

Author Comment

by:sumboddie
ID: 33559161
Hi Calacuccia,
Your code is on track. We are able to return the date for the second symbol in column "A". And I agree with imnorie, and was thinking that after I posted my question, that we want to keep the same instance open to retrieve all other data.

So can we modify the code to always start at the first symbol and progress down column "A" to the last symbol? For the purpose of this example we are using three symbols, but typical usage will involve about 300 or so symbols. I will probably schedule this to run late at night so it can have the time it needs to update all dates in the list.

BTW your coding insight gives me confidence in my perception of what code was needed, and how and where in the HTML it would retrieve the data in which I am interested.
0
 
LVL 33

Expert Comment

by:Norie
ID: 33559663
sumboddie

The following code will extract the data for each ticker to a new worksheet.

It extracts all the data in the results table(s) if you really only want the future events that shouldn't
be too difficult.

As well as code for getting data from all the tables I can post code that can get it from specific tables.

Parsing the HTML is one method for this sort of thing and it definitely works, but if you are automating
IE to navigate to a web page etc then you can access all the elements of the page via the DOM object
model.

The document object is at a pretty high level in that model but below it you have all the other elements on
the page, eg tables, forms, inputs etc.


'Need reference to Microsoft HTML Object Library.  Select this in Tools - References in VB editor.



Option Explicit



Public Enum IE_READYSTATE

     Uninitialised = 0

     Loading = 1

     Loaded = 2

     Interactive = 3

     Complete = 4

 End Enum



Sub Test()



Dim ie As Object



Dim doc As HTMLDocument

Dim divRslts As HTMLDivElement

Dim PageForm As HTMLFormElement

Dim valSearch As HTMLInputElement

Dim Elem As IHTMLElement

Dim xURL As String

Dim rngTicker As Range



    Set rngTicker = Sheets("Test").Range("A5")



    xURL = Sheets("Test").Range("A2").Value



    Set ie = CreateObject("InternetExplorer.Application")



    ie.Visible = True



    While rngTicker <> ""



        ie.navigate xURL & rngTicker



        'Wait for initial page to load



        Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop



        Set doc = ie.document



        Set divRslts = doc.getElementById("ResultsDiv")



        GetAllTables divRslts.document

        Set rngTicker = rngTicker.Offset(1)

        

    Wend

    

    ie.Quit



End Sub



Sub GetAllTables(d)

Dim ws As Worksheet

Dim rng As Range

Dim t As HTMLTable

Dim r As HTMLTableRow

Dim c As HTMLTableCell

Dim tabno As Long

Dim nextrow As Long

Dim I As Long



    Set ws = Worksheets.Add



    For Each t In d.getElementsByTagName("TABLE")



        tabno = tabno + 1

        nextrow = nextrow + 1

        Set rng = ws.Range("B" & nextrow)

        rng.Offset(, -1) = "Table " & tabno

        For Each r In t.Rows

            For Each c In r.Cells

                rng.Value = c.outerText

                Set rng = rng.Offset(, 1)

                I = I + 1

            Next c

            nextrow = nextrow + 1

            Set rng = rng.Offset(1, -I)

            I = 0

        Next r



    Next t



    ws.Cells.ClearFormats

    

End Sub

Open in new window

0
 

Author Comment

by:sumboddie
ID: 33560961
Hi imnorie,
You present a very interesting concept and your code provides more than I am looking for so maybe with some modification I will have what I want and more. Let's take the data retrieved from Tables 5 and 6 only.

With the symbol in Column A, put the data from Table 5 in columns B, C, and D.

With the symbol in Column A, put the data for the first record from Table 6 in columns E, F, G; the data for the second record from Table 6 in columns H, I, and J; the data for the third record from Table 6 in columns K, L, and M; and continue this scheme through the 10th record from Table 6. I want all data to be contained in the main spreadsheet and not add a new spreadsheet for each symbol.

Your solution goes above and beyond my expectations and I will award more points to accomplish the scheme described herein.
Many Thanks!!

0
 
LVL 33

Expert Comment

by:Norie
ID: 33561080
Put what data in columns B, C and D?

If you are going to be so specific about what data you want then tell us what it is.

In your original file all you seemed to want is a date.

There was no mention of any other data, and definitely nothing about the archive stuff.

Like I said I've got code that can extract specific tables.

Obviously when working with Excel the data goes to a worksheet, and in some cases a new worksheet is the best place to put.

Also are you sure you want 90 columns of archived data?

It can be done but the structure of the data wouldn't make it very conducive to any further analysis, filtering etc.
0
 

Author Comment

by:sumboddie
ID: 33561231
Hi imnorie,
I apologize for not being clear in which data I want in the expanded request. Sometimes we are so deep in the woods and know what we are looking for that we forget others may only be able to see the forest :-).

From Table 5 I would like the Date, Page, and Event data only.

From Table 6 I would like the Date, Page, and Headline data only.

As I said in my previous post, you presented an interesting concept by looking at the table contents rather than just the date piece of data. Now, I do want more than the date mentioned in my original post.
I do not want the data posted in a new worksheet, only added to the existing worksheet. And yes, 90 columns of data is OK. Once I have the code I anticipate you will provide, I will be able to take it from there to manipulate the data using vba as I need.

Thanks for allowing me to clarify my needs!
0
 
LVL 33

Expert Comment

by:Norie
ID: 33561440
Well I'll post code for maybe 1 or tickers and you can take it from there, but I still don't think arranging the
data across columns like you want is the best idea.

What sort of manipulation would you be doing with the data?
0
 

Author Comment

by:sumboddie
ID: 33561786
I'm thinking that if we use a counter to step through the rows for each symbol that is in column A, I can vary that list at will. I have a way to easily retrieve the list from another source that will fill column A to the depth necessary (i.e., 300 plus or minus). I was thinking of using code like:

      Sheets("Test").Select
                     Endrow = Range("A65536").End(xlUp).Row
      For counter = 5 To Endrow
                     xSym = Cells(counter, 1).Value

Then step through the rows retrieving data to fill in the columns as describe in my earlier posts.

I plan to manipulate data by prioritizing it in a sort that looks for recent dates and/or looking for certain key words. I will probably export/import the completed spreadsheet into MS Access for storage and further analysis. There would be a benefit to be able to retrieve a narrow data range from Table 6 based on a date window of say a week. The data will used elsewhere so the raw data retrieved is what is important and not how it is presented in the spreadsheet.

Hope this helps!!
0
 
LVL 33

Accepted Solution

by:
Norie earned 500 total points
ID: 33562022
Well I'm not sure about your ideas about the data.

How do you propose to sort data across columns?

Also, how will you search across columns?

Neither of these are easy to do in Excel, or Access for that matter.

The best format for the data would be a table with 4 main fields.

If I was doing it in Access the table might have the following:

TickerID
fldTicker
fldDate
fldPage
fldEventHeadLine

Anyway I digress, here's code that works for me and I think it does what you want.

It isn't perfect - I did use a spare worksheet for the initial data extract and the dates might need formatted.

Both of which can easily be taken care of.
Option Explicit



'Need reference to Microsoft HTML Object Library.  Select this in Tools - References in VB editor.

Public Enum IE_READYSTATE

     Uninitialised = 0

     Loading = 1

     Loaded = 2

     Interactive = 3

     Complete = 4

 End Enum



Sub Test()



Dim ie As Object



Dim doc As HTMLDocument

Dim divRslts As HTMLDivElement

Dim PageForm As HTMLFormElement

Dim valSearch As HTMLInputElement

Dim Elem As IHTMLElement

Dim rngRslts As Range

Dim rngTicker As Range



Dim I As Long

Dim xURL As String



    Set rngTicker = Sheets("Test").Range("A5")



    xURL = Sheets("Test").Range("A2").Value



    Set ie = CreateObject("InternetExplorer.Application")



    ie.Visible = True



    While rngTicker <> ""



        ie.navigate xURL & rngTicker



        'Wait for initial page to load



        Do While ie.busy Or Not ie.ReadyState = IE_READYSTATE.Complete: DoEvents: Loop



        Set doc = ie.document



        Set divRslts = doc.getElementById("ResultsDiv")



        'GetAllTables divRslts.document



        GetOneTable divRslts.document, 4, Worksheets("Data").Range("A1")



        GetOneTable divRslts.document, 5, Worksheets("Data").Range("F1")



        Set rngRslts = rngTicker.Offset(, 1)

        

        Worksheets("Data").Range("A2:C2").Copy rngRslts



        For I = 1 To 10



            Worksheets("Data").Range("F1:H1").Offset(I).Copy rngRslts.Offset(, 4 + (I * 3))

            

        Next I



        Set rngTicker = rngTicker.Offset(1)



    Wend



    ie.Quit



End Sub





Sub GetOneTable(d As HTMLDocument, n As Long, rng As Range)

' d is the document

' n is the table to extract

Dim t As Object    ' the table required

Dim colTbls As Object

Dim r As Object    ' the rows of the table

Dim c As Object    ' the cells of the rows.

Dim I As Long





    Set colTbls = d.getElementsByTagName("TABLE")



    Set t = colTbls(n)



    For Each r In t.Rows

    

        For Each c In r.Cells

            rng.Value = c.innerText

            Set rng = rng.Offset(, 1)

        Next c

        

        Set rng = rng.Offset(1, -r.Cells.Length)



    Next r



End Sub

Open in new window

0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 

Author Comment

by:sumboddie
ID: 33570797
Hi imnorie,
Your coding works as advertised!
==============================
However, after considering your suggestion of the table with fields:
Table 5: FUTURE EVENTS
TickerID
fldTicker
fldDate
fldPage
fldEvent

would work well in Access using data from Table 5.
===============================
Then I would use Table 6 in Access as another table with fields:
Table 6: ARCHIVED EVENTS
TickerID
fldTicker
fldDate
fldPage
fldHeadLine
================================
Please modify coding to provide the above tables. Furthermore, we need to leave cells in the spreadsheet null (blank) under the following conditions. I attached printouts of the four circumstances we need to evaluate to decide what data to retrieve, if any. Please see the attached printouts for visual aids for the following circumstances:

MHS Correct Results - Retrieve the selected data from Table 5 and 6.

RIMM Calls and Earnings - Retrieve only the earnings data and not the call data from Table 5, and the selected data from Table 6

FBP No Earnings but Archives - Leave the Table 5 data blank and retrieve only the selected Archive Table 6 data.

BBL No Results Found - Leave both the Table 5 and Table 6 data blank. The current coding appears to drop down to the table at the bottom of the page and retrieves the table with the title MARKETPLACE. This is advertising and not wanted in the retrieved data.

Thanks!

MHS-Correct-Results.pdf
RIMM-Calls-and-Earnings.pdf
FBP-No-Earnings-but-Archives.pdf
BBL-No-Results-Found.pdf
0
 
LVL 33

Expert Comment

by:Norie
ID: 33571310
Excuse me, but you do realise that most people that contribute to this board do so free of charge?

If you want such a detailed solution it might be an idea to hire somebody.

Please don't post with things like 'modify coding to provide the above tables'.

You've already said that what I posted earlier was above and beyond what you wanted.

So, I changed it - mainly because it only took a few lines of code.

You now want this changed to provide you with tables, with specific data excluded... etc and now you've also mentioned Access.

I'm sorry but that's taking things a bit too far, what next  'the moon on a stick'?
0
 

Author Comment

by:sumboddie
ID: 33571487
OK, I'm not looking for Access tables, only gathering the pertinent data in Excel. I can then get it to Access. I was only concurring with your assessment of the form for the data to use in Access and acknowledge that is a good idea.

If excluding the erroneous information from the web tables that is picked up using the current coding is too much, I apologize for asking to modify the code to get only the data that is relevant.

I accept your partial solution and award the points to you! Thank you!
0
 

Author Closing Comment

by:sumboddie
ID: 33571535
Expert was quick to reply and offered several good alternatives that I will try to follow-up. The basic vba solution for Excel was timely and accurate for what I asked. Maybe I can ask a follow-on question to resolve the need for decision making code that will omit the blank or non-relevent information and offer additional points.
0
 
LVL 33

Expert Comment

by:Norie
ID: 33571647
I didn't say that anything you asked for wasn't possible or would be too difficult.

You seem to have changed your original requirement quite a bit.

If you were to import the data to Access you could probably set up simple queries to exclude any unwanted data, group data, count records, query by date etc...

As for the code being 'basic vba' - not too sure what you mean there.

Sure it's straightforward, easy to follow, easy to adapt etc, which is what code should be like.

I'm happy to help, and I might take a look at your lastest request and attachments when I get a moment.

One thought I actually had was that you could get the all the data for all the symbols all in the one table or perhaps 2 if you want future/archived separated.

Each record in the table(s) would have the fields already mentioned plus a field for the symbols to identify which one the data belongs to.

With the data structured like that it would be much easier to analyse, manipulate, report etc.
0
 

Author Comment

by:sumboddie
ID: 33576204
I agree, I did change my original requirement quite a bit because you opened my eyes to see the bigger picture instead of focusing on one piece of data that was important at the time.

Yes, I can move the data from Excel to Access and use query(ies) to select and refine the data to organize and display the data I want. I have built fairly complex Access databases with vba but I am learning Excel vba and its interaction with web pages at this time.

By basic vba I meant the first coding solution you provided what I asked for and beyond that you provided more comprehensive coding to add the enhancements I asked for.

I like the idea of getting all the data for each of the two tables into the fields identified and then I will get it into Access and match up with the specific symbols I want using queries. I agree with you that it will be easier to manipulate and report. As a matter of fact, I am building an Access application to pull all this together and using Excel as a data transport vehicle rather than a standalone component of my system makes a lot of sense.

Thanks for your follow-up!
0
 
LVL 33

Expert Comment

by:Norie
ID: 33576726
There's actually no reason to use Excel as a 'transport vehicle' for the data if you think about it.

The important part, if you like, of the code is getting the data from the webpage.

Then you can do what you want/need with it.

In this case the data is going to an Excel worksheet.

That could be an Access table.

It would perhaps involve a bit more code but if it can be done it would eliminate the need to use Excel.

I've never actually done it before - most people seem to want the data in Excel, but I might give it a try.

Not done too much with Access recently, so might be interesting.

PS You shouldn't need to match up the data to the symbols, that's the whole point of having a field for the symbol.

Mind you having a table with the list of symbols would be useful.
0
 

Author Comment

by:sumboddie
ID: 33577024
Hi imnorie,

Attached are two files, one Access and one Excel each with the initial list of symbols of interest. Both files are from Office Pro 2007.

Thanks!
Weekly-Up.xlsx
Test-DB.accdb
0
 
LVL 33

Expert Comment

by:Norie
ID: 33579035
I can only use the Excel file, my version of Access is a wee bit out of date, but that shouldn't be a problem - I can
easily import the data to Access if needed.

One question though - are the other columns of data relevant at all?

It's hard to tell especially since there are no headers/field names.
0
 

Author Comment

by:sumboddie
ID: 33579094
Not for the purpose of retrieving the desired data from the web. It is only supporting data. The data from the web is uniique to that symbol on the date the data is retrieved.
0
 
LVL 33

Expert Comment

by:Norie
ID: 33579257
So, and this is a stupid question, which column out of the 5 columns of data is the symbol?

I have no knowledge of the stock market/financial world, so I don't know it's column A - honest.:)
0
 

Author Comment

by:sumboddie
ID: 33579452
Sorry, I only put the headings in the Access file and not the Excel file. Column A has the symbols. FYI, we can educate each other ... the other columns from B to F are:
B =  Company Name
C = Day's Closing Price
D = Industry and Sub Industry of the Company
E = Exchange where symbol is listed
F = Data of Last Closing Price
 :-)
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

This tutorial explains how to create a series of drop-down lists that are dependent upon prior selections to guide (“force”) the user to make the correct selection and reduce data errors within Microsoft Excel. Excel 2010 was used for this tutorial;…
Citrix XenApp, Internet Explorer 11 set to Enterprise Mode and using central hosted sites.xml file.
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.
This Micro Tutorial will demonstrate how to use a scrolling table in Microsoft Excel using the INDEX function.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now