Suman Devadiga

Website download: Adobe Reader could not open PDF because it is either not a supported file type or because the file has been damaged

Hello Team,

I am trying to log in to a company tracking website and download documents to a folder on the C drive. When I open the downloaded documents from the C drive, I get this error:

Adobe Reader could not open pdf because it is either not a supported file type or because the file has been damaged

Can you advise whether there is any other way to download a PDF file from the website?


Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr) As LongPtr
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Sub DowloadSingleFile()
    Dim FileURL As String
    Dim DestinationFile As String

    FileURL = "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.document.DocumentImageDownloadWebAction&Type=Image.Get&xref=IVP9OABdjwlQyfAP4d8VNzKeBtYse1TlCWJmb5gzyMxbwbgxBLP19w%3D%3D"

    DestinationFile = "C:\VBA\4750422330.pdf"

    If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
        Debug.Print "file download started"
    Else
        Debug.Print "file download not started"
    End If

End Sub

Kimputer

The code would have worked on a simple website. This site requires a login, so what you downloaded is the login page, which Adobe Reader obviously cannot display.
For ANY site behind a login page you need FAR more code (probably ten times your current code), because you first have to investigate the login procedure and fully understand the traffic in both directions, down to details such as which cookies to send back.
It is very laborious work, and it may need a complete rewrite whenever the site changes the login procedure even slightly.
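
One quick way to confirm this diagnosis is to look at the first bytes of the saved file: a real PDF starts with the signature %PDF, while a login page starts with HTML markup. A minimal sketch, assuming the file path from the question; the names LooksLikePdf and CheckDownload are made up for illustration:

```vba
Option Explicit

' Hypothetical helper: returns True when the file starts with the "%PDF" signature.
Public Function LooksLikePdf(ByVal FilePath As String) As Boolean
    Dim f As Integer
    Dim header(0 To 3) As Byte

    If Len(Dir$(FilePath)) = 0 Then Exit Function      ' file is missing
    If FileLen(FilePath) < 4 Then Exit Function        ' too short to be a PDF

    f = FreeFile
    Open FilePath For Binary Access Read As #f
    Get #f, 1, header                                  ' read the first four bytes
    Close #f

    ' StrConv turns the ANSI bytes into a VBA string for comparison
    LooksLikePdf = (StrConv(header, vbUnicode) = "%PDF")
End Function

Sub CheckDownload()
    ' Path taken from the question
    If LooksLikePdf("C:\VBA\4750422330.pdf") Then
        Debug.Print "file is a PDF"
    Else
        Debug.Print "file is not a PDF (probably an HTML login or error page)"
    End If
End Sub
```

Opening the saved file in Notepad gives the same answer: if it shows HTML tags, the server sent a page rather than the document.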
Suman Devadiga

ASKER

Hello

I have a complete script that logs in to the page and downloads the documents. However, I am stuck on the download step: instead of the document, the script downloads an HTML page.


Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr) As LongPtr
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Public Sub Login()

Dim ie As New InternetExplorerMedium
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Links As MSHTML.IHTMLElementCollection
Dim Docs As MSHTML.IHTMLElement
Dim Link As MSHTML.IHTMLElement
Dim FileURL As String
Dim DestinationFile As String


    'Dim ie As New InternetExplorer               'InternetExplorerMedium
    With ie
        .Visible = True
        .navigate "https://portal.expeditors.com/expo/login"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document

            With .getElementById("j_username")
                .Focus
                .Value = "bom-sumand"
            End With
            With .getElementById("j_password")
                .Focus
                .Value = "******"
            End With
            .getElementById("signInBtn").Click
        End With
        
       While .Busy Or .readyState < 4: DoEvents: Wend
 End With
 
       With ie
         .navigate "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.searches.SearchWebAction&SearchType=advancedshipment.BaseAdvancedShipmentSearch&reference=4750422330#Document_Images"
         .Visible = True
         
         While .Busy Or .readyState < 4: DoEvents: Wend
         
         Application.Wait (Now + TimeValue("0:00:10"))
                 
         Set HTMLDoc = ie.document
         
         Set Docs = HTMLDoc.getElementsByClassName("sorting_1")(4)
         Set Links = Docs.getElementsByTagName("a")
            Debug.Print Links.Length
         
            For Each Link In Links
                If Link.innerText = "Commercial Invoice" Then
                    Debug.Print Link.innerText, Link.getAttribute("href")
                    FileURL = Link.getAttribute("href")
                    MsgBox FileURL
                End If
            Next Link
        End With
            DestinationFile = "C:\VBA\4750422330.PDF"
            If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
                Debug.Print "file download started"
            Else
                Debug.Print "file download not started"
            End If
End Sub

Kimputer

Sadly, I can't help you without a valid login. I suggest you capture ALL the data you can while the code runs, and compare it to a capture made while manually logging in and downloading the PDF file.
(Use Fiddler4 for that.)
It could be that you shouldn't end the With statement (closing the login form might actually delete the cookies required for the download), and that you should put the download code inside the first login With section. It could also be something else entirely (sometimes you can't go directly to the final link, but need to visit intermediary links before the final link becomes "valid").
Comparing the two captures will be quite tedious. You might see that some POSTs and GETs differ between the manual download and your code.
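
As a rough illustration of the cookie idea: URLDownloadToFile runs inside the Office process, while the login happened in a separate Internet Explorer process, so session cookies set during login may not be sent with the download request. The sketch below copies the cookies that script can read from the logged-in page into a WinHttp.WinHttpRequest.5.1 request and saves the response with ADODB.Stream. It assumes the relevant cookies are readable through document.cookie (cookies flagged HttpOnly are not) and that FileURL is an absolute URL. The function name DownloadWithIeCookies and its parameters are invented for this example.

```vba
Option Explicit

' Hypothetical helper: requests FileURL with the cookies visible in the
' logged-in IE document, then saves the raw response bytes to DestinationFile.
Public Function DownloadWithIeCookies(ByVal ie As Object, _
                                      ByVal FileURL As String, _
                                      ByVal DestinationFile As String) As Boolean
    Dim http As Object
    Dim stm As Object
    Dim cookies As String

    ' Assumption: the session cookies are not HttpOnly, so script can read them
    cookies = ie.document.cookie

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", FileURL, False                 ' synchronous request
    If Len(cookies) > 0 Then http.SetRequestHeader "Cookie", cookies
    http.Send

    ' Status 200 alone is not proof: a login page is also served with 200
    Debug.Print http.Status, http.GetAllResponseHeaders
    If http.Status <> 200 Then Exit Function

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 1                                    ' adTypeBinary
    stm.Open
    stm.Write http.ResponseBody
    stm.SaveToFile DestinationFile, 2               ' adSaveCreateOverWrite
    stm.Close
    DownloadWithIeCookies = True
End Function
```

In the Login procedure from the question, a call such as DownloadWithIeCookies(ie, FileURL, DestinationFile) could stand in for the URLDownloadToFile call. If the link value read from the page is relative, it has to be prefixed with the site origin first. Whether this is enough depends on how the site tracks the session, which only a traffic capture can confirm.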
Suman Devadiga

ASKER

I am not sure how to use Fiddler. However, I captured all the data while the code ran and compared it with the data captured manually, and I don't see any difference.

Secondly, I have put the download code inside the With statement. I have tried all the possibilities and still get the attached error. My revised code is below.

HTML page:
<a href="/expotr/expotr?action=com.expd.webapp.tracking.action.document.DocumentImageDownloadWebAction&amp;Type=Image.Get&amp;xref=iwJBOhDL0VuA%2BQ21J9Sd9rdyZB2f3grZmFQyVhweT2UfCmJZ7l6a0g%3D%3D" target="_blank">Commercial Invoice</a>

Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr) As LongPtr
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Public Sub Login()

Dim ie As New InternetExplorerMedium
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Links As MSHTML.IHTMLElementCollection
Dim Docs As MSHTML.IHTMLElement
Dim Link As MSHTML.IHTMLElement
Dim FileURL As String
Dim DestinationFile As String


    'Dim ie As New InternetExplorer               'InternetExplorerMedium
    With ie
        .Visible = True
        .navigate "https://portal.expeditors.com/expo/login"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document

            With .getElementById("j_username")
                .Focus
                .Value = "bom-sumand"
            End With
            With .getElementById("j_password")
                .Focus
                .Value = "****"
            End With
            .getElementById("signInBtn").Click
        End With
        
       While .Busy Or .readyState < 4: DoEvents: Wend
 
 'End With
       'With ie
         .navigate "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.searches.SearchWebAction&SearchType=advancedshipment.BaseAdvancedShipmentSearch&reference=4750422330#Document_Images"
         .Visible = True
         
         While .Busy Or .readyState < 4: DoEvents: Wend
         
         Application.Wait (Now + TimeValue("0:00:10"))
                 
         Set HTMLDoc = ie.document
         
         Set Docs = HTMLDoc.getElementsByClassName("sorting_1")(4)
         Set Links = Docs.getElementsByTagName("a")
            Debug.Print Links.Length
         
            For Each Link In Links
                If Link.innerText = "Commercial Invoice" Then
                    Debug.Print Link.innerText, Link.getAttribute("href")
                    FileURL = Link.getAttribute("href")
                    MsgBox FileURL
                End If
            Next Link
        
            DestinationFile = "C:\VBA\4750422330.PDF"
            If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
                Debug.Print "file download started"
            Else
                Debug.Print "file download not started"
            End If
      End With
End Sub

PDF-Error.PNG
Kimputer

Sadly, you have to get used to using Fiddler. Used correctly, it would definitely contradict your statement that everything you capture during the code run looks the same as when using a browser manually.
Your code will never show you the important details I mentioned earlier (cookies, headers, traffic to and from the server, etc.).
You should already know that your code doesn't reflect your manual browser actions, because the resulting download is different.
Since you adjusted the code as I suggested, a cookie issue now seems slightly less likely, and an internal rerouting issue more likely (i.e. you're only working with the final URL, while in the background many transactions happened before it that the browser doesn't show you, but a Fiddler capture would).
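
If Fiddler is not an option, part of this rerouting can be made visible from VBA itself. The sketch below is a diagnostic only, not a fix: it switches off automatic redirect handling in WinHttp.WinHttpRequest.5.1 and prints the status and headers of each hop. It sends no cookies, so it shows what an unauthenticated client receives, and it assumes every Location header holds an absolute URL. The name TraceRedirects and the hop limit are made up for this example.

```vba
Option Explicit

' Hypothetical diagnostic: follows redirects by hand and prints each hop.
Public Sub TraceRedirects(ByVal StartURL As String)
    Const WinHttpRequestOption_EnableRedirects As Long = 6
    Const MAX_HOPS As Long = 10

    Dim http As Object
    Dim url As String
    Dim nextUrl As String
    Dim hop As Long

    url = StartURL
    For hop = 1 To MAX_HOPS
        Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
        http.Open "GET", url, False
        http.Option(WinHttpRequestOption_EnableRedirects) = False
        http.Send

        Debug.Print hop, http.Status, url
        Debug.Print http.GetAllResponseHeaders

        ' Stop unless the server answered with a redirect (3xx)
        If http.Status < 300 Or http.Status > 399 Then Exit For

        nextUrl = ""
        On Error Resume Next                        ' Location header may be absent
        nextUrl = http.GetResponseHeader("Location")
        On Error GoTo 0
        If Len(nextUrl) = 0 Then Exit For

        ' Assumption: Location is an absolute URL
        url = nextUrl
    Next hop
End Sub
```

A chain that ends at the login URL, or a final Content-Type of text/html instead of application/pdf, would point to the session not being recognised for the download request.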
ASKER CERTIFIED SOLUTION
Suman Devadiga
Kimputer

Sometimes this happens; it depends on the combination of website and browser. In some cases changing the declaration to one of these options might help (a few other lines might need to be changed as well):

Dim ie As New SHDocVw.InternetExplorer
Dim ie As New MSXML2.XMLHTTP60
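
Note that the second option is an HTTP client rather than a browser object, so the .navigate, .Visible and .document calls from the earlier code do not apply to it. A minimal sketch of how MSXML2.XMLHTTP60 is normally driven, assuming a reference to "Microsoft XML, v6.0" is set and that FileURL is an absolute URL; the procedure name FetchWithXmlHttp is invented for this example:

```vba
Option Explicit

' Requires a reference to "Microsoft XML, v6.0" (Tools > References).
' Hypothetical helper: requests FileURL and reports what the server returned.
Public Sub FetchWithXmlHttp(ByVal FileURL As String)
    Dim http As New MSXML2.XMLHTTP60

    http.Open "GET", FileURL, False                 ' synchronous request
    http.send

    ' A PDF response is normally announced as application/pdf;
    ' text/html here means a page was returned instead of the document
    Debug.Print http.Status, http.statusText
    Debug.Print http.getResponseHeader("Content-Type")
End Sub
```

The object does not take over the session of a separately automated Internet Explorer window, so on a site that requires a login this request may still return the login page.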
Suman Devadiga

ASKER

Hello Kimputer,

I tried this code, however I am getting an "Automation error".

Thank you for all your suggestions; they really helped me check all the possible areas. One thing I want to tell you: several forums give advice on code, but mostly the advice is just a one-liner, whereas you gave me many options and suggestions for one error. Thank you once again.