PDF download from Web via VBA - going through a security page

I am trying to download a PDF from the web and have to go through two security pages before getting to the file.  The code below gets me through the security pages, but the last part is apparently opening a new browser window before attempting to download the PDF.  I need it to use the window that is already open to download the PDF.

The function (URLDownloadToFile) called in the code is declared as:
Public Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

I have set references to InetTransferLib and Microsoft Internet Controls.

Any assistance would be greatly appreciated.

    Dim ieApp As InternetExplorer
    Dim iePage As HTMLDocument
    Dim x As Single
    Dim timestart As String
    Const cTIME = 10000 'in MilliSeconds
    
    Set ieApp = New InternetExplorer
    
    ieApp.Visible = True
    
    'This changes based upon the document needed, hardcoded here for the example
    ieApp.Navigate "https://ecf.ganb.uscourts.gov/cgi-bin/show_case_doc?20,729812,,14604849,"
    
    'wait for page to load
    Do Until ieApp.ReadyState = READYSTATE_COMPLETE
        DoEvents 'yield so the browser can finish loading
    Loop
    
    Set iePage = ieApp.Document
    
    'Enter the User information and password in the first web page
    iePage.Forms(0).Item("login").Value = "cw0133"
    iePage.Forms(0).Item("key").Value = "4scirvy2"
    iePage.Forms(0).Item("clcode").Value = "system"
    iePage.Forms(0).Item("button1").Click
    
    'Wait for the second page to load and hit the submit button
    Call sSleep(cTIME)
    ieApp.Document.Forms(0).submit
    
    'Wait for the PDF file to load
    Call sSleep(cTIME)
        
    '***********************************************************************************************
    'This is my problem area it opens a new browser instead of using the one that has gone through the security check
    'Download the PDF file  - hardcoded url and file name for this example
    If URLDownloadToFile(0&, "https://ecf.ganb.uscourts.gov/cgi-bin/show_case_doc?20,729812,,14604849,", "c:\TEMP\TEST110908.PDF", 0&, 0&) = 0 Then
        MsgBox "Downloaded"
    Else
        MsgBox "Failed"
    End If
    '***********************************************************************************************
 
 
 
    ' Kill the browser
    ieApp.Quit
    Set ieApp = Nothing


marshalldavis Asked:

DanRollins Commented:
You might consider using the XMLHttpRequest object

    http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx

of the existing browser. Presumably the session variables would still be in place. The name is deceptive: this has nothing to do with XML. The oReq.responseBody property holds the raw bytes of the requested data (in this case, the PDF file). All you would need to do is write the data to disk.
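
A minimal sketch of the "write the data to disk" step, assuming the late-bound MSXML2.XMLHTTP.6.0 and ADODB.Stream components are available on the machine. The name SaveUrlToFile and its parameters are hypothetical. It creates a standalone request object rather than the browser's own, so it only illustrates saving the responseBody bytes; it does not by itself carry over the login session.

```vba
' Sketch only: fetch a URL and save the raw response bytes to a file.
' SaveUrlToFile, sUrl and sPath are hypothetical names for illustration.
Public Function SaveUrlToFile(ByVal sUrl As String, ByVal sPath As String) As Boolean
    Dim oReq As Object
    Dim oStream As Object
    Const adTypeBinary As Long = 1
    Const adSaveCreateOverWrite As Long = 2

    Set oReq = CreateObject("MSXML2.XMLHTTP.6.0")
    oReq.Open "GET", sUrl, False      ' False = synchronous request
    oReq.send

    If oReq.Status = 200 Then
        ' responseBody is a byte array; a binary stream writes it unchanged
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Type = adTypeBinary
        oStream.Open
        oStream.Write oReq.responseBody
        oStream.SaveToFile sPath, adSaveCreateOverWrite
        oStream.Close
        SaveUrlToFile = True
    End If
End Function
```

It could be called as SaveUrlToFile(someUrl, "c:\TEMP\TEST110908.PDF"), with the function returning False when the server does not answer with status 200.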
marshalldavis (Author) Commented:
I am unfamiliar with the object and will research and try to implement a solution and report back.  Thank you so much for giving me a direction to try.
marshalldavis (Author) Commented:
Dan,

I spent a good deal of time yesterday pursuing the answer and trying to become familiar with the XMLHttpRequest object.

It seems to require that I provide it a URL, and it does not use the existing window I have open under the InternetExplorer object.

However, unless I missed it, the XMLHttpRequest does not seem to have the same ability to 'navigate' the pages like the IE object.  My system has to be able to get through the user name and password page as in the code I posted above.

Can the XMLHttpRequest navigate pages or be tied into the open IE object window?  If so, can you assist me with what method to use?  I did get it to download a page when it got to a page - so I know that part should work.

Thank you so much for responding.  This problem is causing a huge backlog of work for me and your efforts are greatly appreciated.


DanRollins Commented:
In your IE instance, if you can access the DOM, then you should be able to access the window object. With IE 7, the XMLHttpRequest object is a member of the window object (as shown in the second example code in the link I provided).
As to "navigating": that is mainly a concept associated with interactive browsing; that is, a person looking at a page, clicking a link, looking at a different page, etc. In most cases where automation is the goal, there is no real reason to display the page to a person. One just gets the response and processes it (without ever needing to display it). In this case, it sounded like your goal is to download a file and save it to disk, so I assumed that the download (not the display) is the issue.
marshalldavis (Author) Commented:
Dan,

You are correct.  I really don't care about the display at all.  I simply need it to get through the first two security pages before it can "see" the pdf to download.

The code I have does accomplish this but both methods of download I have tried (URLDownloadToFile and XMLHttpRequest) seem to rely on going directly to the PDF page and are blocked by the security pages.

I will look into the DOM/Window object as you suggest and see if I can find that last magical piece I need to grab the pdf file.

Thanks for the continuing advice.
DanRollins Commented:
The key issue is that the server needs to think that the same user (same browser instance) is requesting both items (the login page and the download page).  It does so by setting up a session ID in a local cookie, and that session ID gets passed back to the server with each subsequent request.  You need to find a way that uses the same instance for both requests.  The URLDownloadToFile API must not be passing that cookie/session ID.  I believe that the XMLHttpRequest object of the window of the browser control will do the same thing vis-à-vis cookie handling as a browser would.
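
To illustrate the "same instance for both requests" idea, here is a hedged sketch that skips the browser entirely and uses one WinHttp.WinHttpRequest.5.1 object for both the login and the download, since that object keeps the cookies it receives for later requests made on the same instance. The function name, the login URL, and the form-encoded post data are all assumptions (the field names login, key and clcode come from the posted code; the URL the form posts to is not known here). The site's second intermediate page would probably need its own request between the two shown.

```vba
' Sketch only: log in and download with ONE request object so the
' session cookie set by the login is sent again with the download.
' DownloadWithSession and all four parameters are hypothetical.
Public Function DownloadWithSession(ByVal sLoginUrl As String, _
                                    ByVal sPostData As String, _
                                    ByVal sPdfUrl As String, _
                                    ByVal sPath As String) As Boolean
    Dim oHttp As Object
    Dim vBody As Variant
    Dim bData() As Byte
    Dim f As Integer

    Set oHttp = CreateObject("WinHttp.WinHttpRequest.5.1")

    ' Request 1: submit the login form, e.g. "login=...&key=...&clcode=..."
    oHttp.Open "POST", sLoginUrl, False
    oHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    oHttp.send sPostData
    If oHttp.Status <> 200 Then Exit Function

    ' Request 2: same object, so cookies from request 1 are reused
    oHttp.Open "GET", sPdfUrl, False
    oHttp.send
    If oHttp.Status <> 200 Then Exit Function

    ' Only continue if the body really came back as a byte array
    vBody = oHttp.responseBody
    If VarType(vBody) <> (vbArray Or vbByte) Then Exit Function
    bData = vBody

    ' Remove any old copy so no stale bytes remain past the new data
    If Len(Dir$(sPath)) > 0 Then Kill sPath
    f = FreeFile
    Open sPath For Binary Access Write As #f
    Put #f, , bData
    Close #f

    DownloadWithSession = True
End Function
```

A status of 200 only means the server answered; it may still have returned the login page instead of the PDF, so the saved file is worth inspecting.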
marshalldavis (Author) Commented:
Dan,

Thanks.  I spent much of the weekend and today on this and am still searching for the answer.
rockiroads Commented:
Possibly your code is working but failing on the login. I noticed that the second time you run it, the login prompts are not there. Perhaps the login info was saved in a cookie or something similar; I'm not sure.

As a test, I ignored errors during the login step and switched error handling back on afterwards, e.g.:
    On Error Resume Next
    iePage.Forms(0).Item("login").Value = "cw0133"
    iePage.Forms(0).Item("key").Value = "4scirvy2"
    iePage.Forms(0).Item("clcode").Value = "system"
    iePage.Forms(0).Item("button1").Click
   
    'Wait for the second page to load and hit the submit button
    On Error Goto 0
    Call sSleep(cTIME)
    ieApp.Document.Forms(0).submit


Ideally the best way forward is to add validation. Check to see if those form items exist when you expect them to.

And since you defined iePage, you could just do iePage.Forms(0).submit but makes no difference really
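
The validation suggested above could be sketched roughly as follows. FindField and FormHasFields are hypothetical helper names; the sketch assumes the login form is the first form on the page, as in the posted code.

```vba
' Sketch only: check that the expected form fields exist before using them.
' FindField and FormHasFields are hypothetical helper names.
Private Function FindField(ByVal oForm As Object, ByVal sName As String) As Object
    ' Item may return Nothing or raise an error for a missing name,
    ' so both cases are treated as "not found".
    On Error Resume Next
    Set FindField = oForm.Item(sName)
    On Error GoTo 0
End Function

Public Function FormHasFields(ByVal oDoc As Object, _
                              ParamArray vNames() As Variant) As Boolean
    Dim oForm As Object
    Dim i As Long

    If oDoc Is Nothing Then Exit Function
    If oDoc.forms.Length = 0 Then Exit Function

    Set oForm = oDoc.forms(0)
    For i = LBound(vNames) To UBound(vNames)
        ' Stop at the first missing field; the result stays False
        If FindField(oForm, CStr(vNames(i))) Is Nothing Then Exit Function
    Next i

    FormHasFields = True
End Function
```

With this in place, the login fields would only be filled in when FormHasFields(ieApp.Document, "login", "key", "clcode", "button1") returns True, which avoids needing On Error Resume Next around the login step.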
marshalldavis (Author) Commented:
rockiroads,

Thank you for the response.  I have encountered the failure on subsequent attempts (I agree it is probably a cookie).  I have not attacked that yet, since I can't get the download to work even when it does display the PDF on the first try.

The frustrating part is that the browser is displaying the pdf that I need but everything I have tried to download it has failed.  The piece I seem to be missing is whether the InternetExplorer object has essentially a "save" command/method once the pdf is displayed.

I am willing to try anything and am not at all tied to the code above if you have another solution.

Thanks for the help.

rockiroads Commented:
I've had a look at using execCommand but can't get SaveAs to work. I tried Print and that brought up the dialog, so I know execCommand works, but SaveAs is probably limited.
    iePage.execCommand "SaveAs", True, "c:\temp\zz.pdf"
    iePage.execCommand "Print", True

I dumped the location, and it is the same page you pass to the URLDownloadToFile API:
    Debug.Print iePage.Location

I checked the HTML as well and that didn't help either:
    Debug.Print iePage.Body.innerHTML

I placed a breakpoint, then ran the save code after the IE window had loaded and showed the PDF, but to no avail :(

If you look at the URL after the PDF has been shown, it still points to the original URL.

    If URLDownloadToFile(0&, iePage.Location, "C:\temp\zz2.pdf", 0&, 0&) = 0 Then

This creates zz2.pdf, but the file is not in PDF format; it is text. I renamed it to .html and it shows the first login page, titled "CM/ECF Filer or PACER Login".

I'm not sure how that page is created. It looks like bytes are read and streamed, so there is no real PDF file or web page as such; it is more likely created dynamically from scanned images, probably originally saved in some MODCA format and processed on the server.
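
Since a "successful" download can turn out to be the login page, a quick check of the file's first bytes can tell the two apart: PDF files begin with the characters %PDF. A minimal sketch, with LooksLikePdf as a hypothetical helper name:

```vba
' Sketch only: report whether a file starts with the PDF signature "%PDF".
' LooksLikePdf is a hypothetical helper name.
Public Function LooksLikePdf(ByVal sPath As String) As Boolean
    Dim f As Integer
    Dim bHead(0 To 3) As Byte

    ' A missing or too-short file cannot be a PDF
    If Len(Dir$(sPath)) = 0 Then Exit Function
    If FileLen(sPath) < 4 Then Exit Function

    ' Read the first four bytes of the file
    f = FreeFile
    Open sPath For Binary Access Read As #f
    Get #f, 1, bHead
    Close #f

    ' Convert the ANSI bytes to a string and compare with the signature
    LooksLikePdf = (StrConv(bHead, vbUnicode) = "%PDF")
End Function
```

For example, LooksLikePdf("C:\temp\zz2.pdf") should return False for the file described above, because it contains the HTML of the login page rather than PDF data.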
marshalldavis (Author) Commented:
rockiroads,

Thank you so much for taking the time to look at the problem.  I have to be able to do this, so I am going to continue to search for a solution and leave the question open for a bit longer in case anyone else has another solution.