Suman Devadiga

Website download: Adobe Reader could not open PDF because it is either not a supported file type or because the file has been damaged

Hello Team,

I am trying to log in to my company's tracking website and download documents to a folder on the C drive. When I open a downloaded document from the C drive, I get the following error:

Adobe Reader could not open pdf because it is either not a supported file type or because the file has been damaged

Can you advise whether there is another way to download a PDF file from a website?


Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As LongPtr) As Long   ' returns an HRESULT; 0 (S_OK) means success
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Sub DownloadSingleFile()
    Dim FileURL As String
    Dim DestinationFile As String

    FileURL = "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.document.DocumentImageDownloadWebAction&Type=Image.Get&xref=IVP9OABdjwlQyfAP4d8VNzKeBtYse1TlCWJmb5gzyMxbwbgxBLP19w%3D%3D"

    DestinationFile = "C:\VBA\4750422330.pdf"

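    ' URLDownloadToFile is synchronous: 0 (S_OK) means the transfer finished,
    ' not that the saved content is a valid PDF.
    If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
        Debug.Print "file downloaded"
    Else
        Debug.Print "file download failed"
    End If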

End Sub


Kimputer

The code would have worked on a simple website. This site requires a login, so what you downloaded is the login page, which Adobe Reader cannot display as a PDF.
For any site behind a login you need far more code (probably ten times what you have now): you first have to investigate the login procedure and fully understand the traffic in both directions, down to details such as which cookies to send back.
It is very laborious work, and it may need a complete rewrite whenever the site changes its login procedure, even slightly.
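
One quick way to confirm that the saved file is a web page rather than a PDF is to look at its first bytes: a PDF file starts with the signature "%PDF". The following is a minimal sketch of that check; the function name LooksLikePdf is hypothetical, and the path is the one used in the question.

```vba
Option Explicit

' Hypothetical helper: returns True when the file starts with the PDF
' signature "%PDF". A saved HTML page usually starts with "<" instead.
Public Function LooksLikePdf(ByVal FilePath As String) As Boolean
    Dim FileNum As Integer
    Dim Header As String

    If Len(FilePath) = 0 Then Exit Function        ' no path given: returns False
    If Dir(FilePath) = "" Then Exit Function       ' file missing: returns False
    If FileLen(FilePath) < 4 Then Exit Function    ' too short to hold the signature

    FileNum = FreeFile
    Open FilePath For Binary Access Read As #FileNum
    Header = Space$(4)                             ' Get reads as many bytes as the string holds
    Get #FileNum, 1, Header
    Close #FileNum

    LooksLikePdf = (Header = "%PDF")
End Function

Public Sub CheckDownloadedFile()
    ' Path taken from the question; adjust as needed.
    Debug.Print LooksLikePdf("C:\VBA\4750422330.pdf")
End Sub
```

If this prints False, opening the file in Notepad should show the HTML the server actually sent.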
Suman Devadiga

ASKER

Hello

I have a complete script that logs in to the page and downloads the documents. However, I am stuck on the download step: instead of the document, the script downloads an HTML page.


Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As LongPtr) As Long   ' returns an HRESULT; 0 (S_OK) means success
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Public Sub Login()

Dim ie As New InternetExplorerMedium
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Links As MSHTML.IHTMLElementCollection
Dim Docs As MSHTML.IHTMLElement
Dim Link As MSHTML.IHTMLElement
Dim FileURL As String
Dim DestinationFile As String


    'Dim ie As New InternetExplorer               'InternetExplorerMedium
    With ie
        .Visible = True
        .navigate "https://portal.expeditors.com/expo/login"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document

            With .getElementById("j_username")
                .Focus
                .Value = "bom-sumand"
            End With
            With .getElementById("j_password")
                .Focus
                .Value = "******"
            End With
            .getElementById("signInBtn").Click
        End With
        
       While .Busy Or .readyState < 4: DoEvents: Wend
 End With
 
       With ie
         .navigate "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.searches.SearchWebAction&SearchType=advancedshipment.BaseAdvancedShipmentSearch&reference=4750422330#Document_Images"
         .Visible = True
         
         While .Busy Or .readyState < 4: DoEvents: Wend
         
         Application.Wait (Now + TimeValue("0:00:10"))
                 
         Set HTMLDoc = ie.document
         
         Set Docs = HTMLDoc.getElementsByClassName("sorting_1")(4)
         Set Links = Docs.getElementsByTagName("a")
            Debug.Print Links.Length
         
            For Each Link In Links
                If Link.innerText = "Commercial Invoice" Then
                    Debug.Print Link.innerText, Link.getAttribute("href")
                    FileURL = Link.getAttribute("href")
                    MsgBox FileURL
                End If
            Next Link
        End With
            DestinationFile = "C:\VBA\4750422330.PDF"
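            ' URLDownloadToFile is synchronous: 0 (S_OK) means the transfer finished,
            ' not that the saved content is a valid PDF.
            If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
                Debug.Print "file downloaded"
            Else
                Debug.Print "file download failed"
            End If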
End Sub


Kimputer

Sadly, I can't help you without a valid login. I suggest you capture all the traffic you can while the code runs and compare it with a capture made while logging in and downloading the PDF file manually.
(Use Fiddler4 for that.)
It could be that you shouldn't end the With statement (closing the login form might delete the cookies required for the download), and that you should put the download code inside the first login With section. It could also be something else entirely: sometimes you can't go directly to the final link, and you need to visit intermediary links before the final link becomes "valid".
Comparing the two captures will be quite tedious. You might see that some POST and GET requests differ between the manual download and your code.
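
The cookie point can be tested from code with a request object that lets the caller set the Cookie header and inspect the response before anything is saved. The sketch below uses WinHttp.WinHttpRequest.5.1 and ADODB.Stream through late binding. It rests on several assumptions: the function name DownloadPdfWithCookies is hypothetical, the server is assumed to label the file with a content type containing "pdf", and the cookie string has to come from somewhere else. Note that ie.document.cookie does not expose cookies marked HttpOnly, so it may not contain the session cookie at all. This is not confirmed to work against the site in the question.

```vba
Option Explicit

' Hypothetical helper: requests Url with an optional Cookie header and saves
' the body only if the server answers 200 with a PDF content type.
Public Function DownloadPdfWithCookies(ByVal Url As String, _
                                       ByVal CookieHeader As String, _
                                       ByVal DestinationFile As String) As Boolean
    Dim Http As Object
    Dim Stream As Object
    Dim ContentType As String

    Set Http = CreateObject("WinHttp.WinHttpRequest.5.1")
    Http.Open "GET", Url, False                     ' False = synchronous
    If Len(CookieHeader) > 0 Then Http.SetRequestHeader "Cookie", CookieHeader
    Http.Send

    On Error Resume Next                            ' the header may be absent
    ContentType = LCase$(Http.GetResponseHeader("Content-Type"))
    On Error GoTo 0

    ' Show what the server really returned (for example 200 text/html).
    Debug.Print Http.Status, ContentType

    If Http.Status <> 200 Then Exit Function
    If InStr(ContentType, "pdf") = 0 Then Exit Function   ' assumption: PDF content type

    Set Stream = CreateObject("ADODB.Stream")
    Stream.Type = 1                                 ' 1 = adTypeBinary
    Stream.Open
    Stream.Write Http.ResponseBody
    Stream.SaveToFile DestinationFile, 2            ' 2 = adSaveCreateOverWrite
    Stream.Close

    DownloadPdfWithCookies = True
End Function
```

Inside the question's Login routine it could be called as DownloadPdfWithCookies(FileURL, ie.document.cookie, DestinationFile). A printed content type of text/html would indicate that the server did not accept the session.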
Suman Devadiga
ASKER

I am not sure how to use Fiddler. However, I captured all the data while the code ran and compared it with the data captured manually, and I don't see any difference.

Secondly, I have moved the download code inside the With statement. I have tried every possibility and still get the attached error. My revised code is below.

HTML of the link on the page:
<a href="/expotr/expotr?action=com.expd.webapp.tracking.action.document.DocumentImageDownloadWebAction&amp;Type=Image.Get&amp;xref=iwJBOhDL0VuA%2BQ21J9Sd9rdyZB2f3grZmFQyVhweT2UfCmJZ7l6a0g%3D%3D" target="_blank">Commercial Invoice</a>
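
The href in this markup is root-relative and its ampersands are HTML-encoded as &amp;amp;, so a value copied from the page source is not a complete URL. The following is a minimal sketch of turning such a value into an absolute URL before passing it to a download call. The function name ResolveHref is hypothetical, and it handles only absolute hrefs and hrefs that start with "/".

```vba
Option Explicit

' Hypothetical helper: builds an absolute URL from the page URL and an href
' taken from the HTML source. Returns "" for forms it does not handle.
Public Function ResolveHref(ByVal PageUrl As String, ByVal Href As String) As String
    Dim SchemeEnd As Long
    Dim HostEnd As Long

    Href = Replace(Href, "&amp;", "&")              ' undo HTML entity encoding

    If LCase$(Left$(Href, 4)) = "http" Then         ' already absolute
        ResolveHref = Href
        Exit Function
    End If

    SchemeEnd = InStr(PageUrl, "://")
    If SchemeEnd = 0 Then Exit Function             ' page URL has no scheme
    If Left$(Href, 1) <> "/" Then Exit Function     ' only root-relative hrefs handled

    ' Find the first "/" after the host name; everything before it is the origin.
    HostEnd = InStr(SchemeEnd + 3, PageUrl, "/")
    If HostEnd = 0 Then HostEnd = Len(PageUrl) + 1

    ResolveHref = Left$(PageUrl, HostEnd - 1) & Href
End Function
```

In the question's code this could be used as FileURL = ResolveHref(ie.LocationURL, Link.getAttribute("href")), which also makes it easy to print and inspect the exact URL being requested.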





Option Explicit
#If VBA7 Then
        Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr) As LongPtr
#Else
        Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Public Sub Login()

Dim ie As New InternetExplorerMedium
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Links As MSHTML.IHTMLElementCollection
Dim Docs As MSHTML.IHTMLElement
Dim Link As MSHTML.IHTMLElement
Dim FileURL As String
Dim DestinationFile As String


    'Dim ie As New InternetExplorer               'InternetExplorerMedium
    With ie
        .Visible = True
        .navigate "https://portal.expeditors.com/expo/login"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document

            With .getElementById("j_username")
                .Focus
                .Value = "bom-sumand"
            End With
            With .getElementById("j_password")
                .Focus
                .Value = "****"
            End With
            .getElementById("signInBtn").Click
        End With
        
       While .Busy Or .readyState < 4: DoEvents: Wend
 
 'End With
       'With ie
         .navigate "https://expo.expeditors.com/expotr/expotr?action=com.expd.webapp.tracking.action.searches.SearchWebAction&SearchType=advancedshipment.BaseAdvancedShipmentSearch&reference=4750422330#Document_Images"
         .Visible = True
         
         While .Busy Or .readyState < 4: DoEvents: Wend
         
         Application.Wait (Now + TimeValue("0:00:10"))
                 
         Set HTMLDoc = ie.document
         
         Set Docs = HTMLDoc.getElementsByClassName("sorting_1")(4)
         Set Links = Docs.getElementsByTagName("a")
            Debug.Print Links.Length
         
            For Each Link In Links
                If Link.innerText = "Commercial Invoice" Then
                    Debug.Print Link.innerText, Link.getAttribute("href")
                    FileURL = Link.getAttribute("href")
                    MsgBox FileURL
                End If
            Next Link
        
            DestinationFile = "C:\VBA\4750422330.PDF"
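            ' URLDownloadToFile is synchronous: 0 (S_OK) means the transfer finished,
            ' not that the saved content is a valid PDF.
            If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0 Then
                Debug.Print "file downloaded"
            Else
                Debug.Print "file download failed"
            End If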
      End With
End Sub


PDF-Error.PNG
Kimputer

Sadly, you will have to get used to Fiddler. Used correctly, it would almost certainly contradict your statement that everything captured while the code runs looks the same as in a manual browser session.
Your code will never show you the important details I mentioned earlier (cookies, headers, traffic to and from the server, and so on).
You already know that your code doesn't reproduce your manual browser actions, because the resulting download is different.
Since you adjusted the code as I suggested, a cookie issue now seems slightly less likely, and an internal rerouting issue more likely: you are working only with the final URL, while in the background many transactions happened before it that the browser doesn't show you but a Fiddler capture would.
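
Part of that hidden rerouting can also be made visible from VBA by switching off automatic redirects and printing each hop. The sketch below uses WinHttp.WinHttpRequest.5.1 with late binding; the procedure name TraceRedirects is hypothetical. It assumes every Location header holds an absolute URL, and it sends no cookies, so against a site that requires a login it will most likely just show the redirect to the login page. It is not a replacement for a full Fiddler capture.

```vba
Option Explicit

' Hypothetical helper: prints each hop of a redirect chain instead of
' following it silently.
Public Sub TraceRedirects(ByVal Url As String, Optional ByVal MaxHops As Long = 10)
    Dim Http As Object
    Dim Hop As Long
    Dim NextUrl As String

    For Hop = 1 To MaxHops
        Set Http = CreateObject("WinHttp.WinHttpRequest.5.1")
        Http.Open "GET", Url, False                 ' False = synchronous
        Http.Option(6) = False                      ' 6 = WinHttpRequestOption_EnableRedirects
        Http.Send

        Debug.Print Hop, Http.Status, Url

        ' Stop as soon as the answer is not a 3xx redirect.
        If Http.Status < 300 Or Http.Status >= 400 Then Exit For

        NextUrl = ""
        On Error Resume Next                        ' Location may be missing
        NextUrl = Http.GetResponseHeader("Location")
        On Error GoTo 0
        If Len(NextUrl) = 0 Then Exit For

        Url = NextUrl                               ' assumption: Location is absolute
    Next Hop
End Sub
```

Calling TraceRedirects with the document URL from the question lists the status code of every hop in the Immediate window.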
Kimputer

Sometimes this happens, depending on the combination of website and browser. In some cases, changing the declaration to one of these options might help (a few other lines might need to change as well):

Dim ie As New SHDocVw.InternetExplorer
Dim ie As New MSXML2.XMLHTTP60
Suman Devadiga
ASKER

Hello Kimputer,

I tried this code, but I am getting an "Automation error".

Thank you for all your suggestions; they really helped me check every possible area. I would also like to say that several forums give advice on code, but it is mostly a one-liner, whereas you gave me many options and suggestions for a single error. Thank you once again.