[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now

x
?
Solved

VBA controling Internet Explorer - problem getting 'next page' of results

Posted on 2011-09-12
2
Medium Priority
?
462 Views
Last Modified: 2012-05-12
I am trying capture the links to some publications through a search engine.  There are usually more results than can be returned on one page.  You can not use the URL to get to the next page of results (as far as I can tell).  

I have been able to get IE to click the 'Next' button to get to the next page.  I have put the code in the order that I would expect it to perform the different operations.  Basically,
1. Perform search - puts you on first page (results 1-20)
2. Get desired links to publications 1- 20
3. Click 'Next'
4. Go to page with the next 20 pubs (21-40)
5. Get desired links to publication 21-40

It appears that it is getting 1-20 twice and then advancing to the next page.

Why would it do this and how can I make it perform correctly?

I have provided the code and highlighted and tried to label what I think are the problem areas.


Thanks!
Sub PMCSearch()

Dim appIE As Object ' InternetExplorer.Application
Dim sURL As String
Dim SearchTerm As String
Dim NumOfYears As Integer
Dim PubCount As Integer

Dim WorkSheetName As String
'WorkSheetName = SearchTerm & CStr(Round(Time, 0))

Dim Hits As Integer
Dim LoopCount As Integer

'-----------------------------------------------------------------------------------------------
'HOUSE KEEPING - NO QUESTIONS HERE (I THINK)
' URL input
SearchTerm = "origene"

'Create new sheet named after searchterm
WorkSheetName = SearchTerm & CStr(Hour(Time)) & CStr(Second(Time))
ActiveWorkbook.Worksheets.Add
If Len(WorkSheetName) > 31 Then
    WorkSheetName = Left(WorkSheetName, 30)
End If
ActiveSheet.Name = WorkSheetName

'Clear active sheet
ActiveSheet.Cells.Clear

'Start in A1
Range("A1").Select

'-----------------------------------------------------------------------------------------------
'THIS PART WORKS - NOQUESTIONS (I THINK)
'Start Internet Explorer
Application.ScreenUpdating = False
Set appIE = CreateObject("InternetExplorer.Application")

'USE URL TO PERFORM SEARCH

    sURL = "http://www.ncbi.nlm.nih.gov/pmc/?term=" & SearchTerm & "%5BAll+Fields%5D+AND+(%222008%2F09%2F11%22%5BPubDate%5D+%3A+%222011%2F09%2F10%22%5BPubDate%5D)&cmd=DetailsSearch"
    Debug.Print sURL
    With appIE
        .navigate sURL
        ' uncomment the line below if you want to watch the code execute, or for debugging
        .Visible = True
    End With

    ' loop until the page finishes loading
    Do While appIE.readyState <> 4
        DoEvents
    Loop

    Set HTMLDoc = appIE.document

'GET LINKS FROM PAGE AND PUT IN SPREADSHEET
    Set Publinks = HTMLDoc.Links
    Debug.Print "Links length  = " & Publinks.Length
    PubLinksLength = Publinks.Length
    For i = 3 To PubLinksLength - 1
        'THE NEXT 4 LINES ARE REQUIRED TO DORT OUR THE LINKS I WANT FROM THE REST
        Test1 = Publinks.Item(i).href
        Test2 = Publinks.Item(i - 2).href
        If Right(Publinks.Item(i).href, 4) = "trez" And InStr(1, Publinks.Item(i), "pdf") = 0 And Test1 <> Test2 Then
        Debug.Print "number " & i & " " & Publinks.Item(i).href
        'PUT THEM IN THE SPREASHEET
        ActiveCell.Value = Publinks.Item(i).href
        ActiveCell.Offset(1, 0).Select
        
        End If
    Next i
        
'-----------------------------------------------------------------------------------------------
'***PROBLEM IN THIS AREA - THIS APPEARS TO BE WHERE IT DOES NOT FUNTION LIKE I WANT IT TO

'GOTO NEXT PAGE BY CLICKING LINK
        'FIND THE 'NEXT PAGE' LINK
        Set NextButton = HTMLDoc.getElementById("EntrezSystem2.PEntrez.Pmc.Pmc_ResultsPanel.Entrez_Pager.Page")
        NextButton.Click
                
        'LET THE NEXT PAGE LOAD
        Do While appIE.readyState <> 4
            DoEvents
        Loop
        
        'GET THE NEXT PAGE AS A DOCUMENT OBJECT
        Set HTMLDoc = appIE.document

        'GET LINKS FROM PAGE AND PUT IN SPREADSHEET
'*** PROBLEM HERE???  - THIS IS STILL RETURNS THE LINKS FROM THE FIRST PAGE
'DUPLICATE CODE FROM ABOVE BUT I EXPECT THE DOCUMENT TO HAVE THE NEXT PAGE OF RESULTS FOLLOWING THE NextButton.Click
    Set Publinks = HTMLDoc.Links
    Debug.Print "Links length  = " & Publinks.Length
    PubLinksLength = Publinks.Length
    For i = 3 To PubLinksLength - 1
        'THE NEXT 4 LINES ARE REQUIRED TO DORT OUR THE LINKS I WANT FROM THE REST
        Test1 = Publinks.Item(i).href
        Test2 = Publinks.Item(i - 2).href
        If Right(Publinks.Item(i).href, 4) = "trez" And InStr(1, Publinks.Item(i), "pdf") = 0 And Test1 <> Test2 Then
        Debug.Print "number " & i & " " & Publinks.Item(i).href
        'PUT THEM IN THE SPREASHEET
        ActiveCell.Value = Publinks.Item(i).href
        ActiveCell.Offset(1, 0).Select
        
        End If
    Next i
 
End Sub

Open in new window

0
Comment
Question by:Ben210
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
2 Comments
 
LVL 16

Accepted Solution

by:
carsRST earned 2000 total points
ID: 36524789
My guess is the page is still loading, even though you have the "appIE.readyState <> 4"


You might try this...

Do While appIE.Busy Or appIE.readyState <> 4
   DoEvents
Loop



If that doesn't work, then you might start testing for something unique on the 21-40 page that you know is there.  When you see that the value show up, then proceed with 21-40.



Third option would just be to kill enough time through "do while" loops that you know for a fact the page has loaded.  Then proceed on with your 21-40 code.
0
 

Author Comment

by:Ben210
ID: 36524860
Adding 'Do while appIE.Busy Or ' worked!!!

I love a simple solution!

Thanks so much.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Some code to ensure data integrity when using macros within Excel. Also included code that helps secure your data within an Excel workbook.
After seeing numerous questions for Dynamic Data Validation I notice that most have used Visual Basic to solve the problem. This suggestion is purely formula based and can be used in multiple rows.
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.
Many functions in Excel can make decisions. The most simple of these is the IF function: it returns a value depending on whether a condition you describe is true or false. Once you get the hang of using the IF function, you will find it easier to us…

650 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question