Link to home
Start Free TrialLog in
Avatar of Febin Mathew
Febin MathewFlag for India

asked on

Delete PDF pages based on conditions

I am looking for a macro to delete the pages from a PDF file based on certain conditions in an excel file. I have few pdf files in a particular folder patch. Macro has to look into those files and delete few pages based in the condition. See attached excel.Column B has the PDF file name. Column C, D & E has conditions. If column C says "No" for a PDF file page 5 should be deleted. Similarly for column D & E. Thanks for your help.
Test.xlsm
Avatar of Joe Winograd
Joe Winograd
Flag of United States of America image

Hi Febin,
First, I see that you joined Experts Exchange today, so let me say — Welcome Aboard! Second, the Excel attachment is missing. Regards, Joe
Avatar of Febin Mathew

ASKER

Thanks Joe. I uploaded the attachment.
Test.xlsm
You cannot delete pages from a pdf. You would need to re-create the pdf excluding the pages.

Where are the pdfs created from?
ASKER CERTIFIED SOLUTION
Avatar of Joe Winograd
Joe Winograd
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I got the below code from another site. It can delete the pages for one particular file, not for a list of files. Not sure how to edit it for a list.

Sub Delete_PDF_Pages()
' Adobe code based on http://vbcity.com/forums/t/51200.aspx
Dim xMsg           As String
Dim xInput         As String
Dim xOutput        As String
Dim xResponse      As Long
Dim xLast_Row      As Long
Dim xErrors        As Long
Dim xDeleted       As Long
Dim i              As Long
Dim j              As Long
Dim AcroApp        As CAcroApp
Dim AcroPDDoc      As CAcroPDDoc
Dim AcroHiliteList As CAcroHiliteList
Dim AcroTextSelect As CAcroPDTextSelect
Dim xarray()       As Variant
Dim PageNumber     As Variant
Dim PageContent    As Variant
Dim xContent       As Variant

xInput = "C:\Users\Febin.Mathew\Desktop\CA 3522\FM testing\CA 568 SMLLC\TestPages.pdf"
xOutput = "C:\Users\Febin.Mathew\Desktop\CA 3522\FM testing\CA 568 SMLLC\TestPages_Output.pdf"

xLast_Row = [A1].SpecialCells(xlLastCell).Row
ReDim xarray(xLast_Row)

xResponse = MsgBox("About to delete all pages which contain values from the range A1:A" & xLast_Row & Chr(10) _
            & Chr(10) & "Input:" & Chr(9) & xInput _
            & Chr(10) & "Output:" & Chr(9) & xOutput _
            & Chr(10) & Chr(10) & "('OK' to continue, 'Cancel' to quit.)", vbOKCancel, "Delete Pages")
If xResponse = 2 Then
    MsgBox "User chose not to continue. Run terminated."
    Exit Sub
End If

' Files and data OK?
If Dir(xInput) = "" Then xMsg = "Input file not found - " & xInput & Chr(10)
If Dir(xOutput) <> "" Then xMsg = "Output file exists - " & xOutput & Chr(10)
xarray = Application.Transpose(Range("A1:A" & xLast_Row))
For i = 1 To xLast_Row
    If Not IsNumeric(xarray(i)) Or xarray(i) = "" Then
        xMsg = "Non-numeric ""Delete"" value of """ & xarray(i) & """ found on row " & i & Chr(10)
        Exit For
    End If
Next
If xMsg <> "" Then
    MsgBox (xMsg & Chr(10) & "Run cancelled.")
    Exit Sub
End If

' Open the PDF...
Set AcroApp = CreateObject("AcroExch.App")
Set AcroPDDoc = CreateObject("AcroExch.PDDoc")
If AcroPDDoc.Open(xInput) <> True Then
    MsgBox (xInput & " couldn't be opened - run cancelled.")
    Exit Sub
End If

' Read each page...
For i = AcroPDDoc.GetNumPages - 1 To 0 Step -1

    Set PageNumber = AcroPDDoc.AcquirePage(i)
    Set PageContent = CreateObject("AcroExch.HiliteList")

    'Get up to 9,999 words from page...
    If PageContent.Add(0, 9999) <> True Then
       
        Debug.Print "Add Error on Page " & i + 1
        xErrors = xErrors + 1
   
    Else

        Set AcroTextSelect = PageNumber.CreatePageHilite(PageContent)
   
        If Not AcroTextSelect Is Nothing Then
            xContent = ""
            For j = 0 To AcroTextSelect.GetNumText - 1
                xContent = xContent & AcroTextSelect.GetText(j)
            Next j
            For j = 1 To xLast_Row
                If InStr(1, xContent, xarray(j)) > 0 Then
                    Debug.Print "Page " & i + 1 & " contains " & xarray(j) & " - " & xContent
                    ' To avoid problems with the delete...
                    Set AcroTextSelect = Nothing
                    Set PageContent = Nothing
                    Set PageNumber = Nothing
                    If AcroPDDoc.DeletePages(i, i) = False Then
                        MsgBox ("Error deleting page " & i + 1 & " - run cancelled.")
                        Exit Sub
                    End If
                    xDeleted = xDeleted + 1
                    Exit For
                End If
             Next
        End If
       
    End If

Next i

If AcroPDDoc.Save(PDSaveFull, xOutput) = False Then
    MsgBox "Cannot save the modified document"
    Exit Sub
Else
    MsgBox (xDeleted & " pages deleted. (" & xErrors & " errors.)")
End If
   
AcroPDDoc.Close
AcroApp.Exit

End Sub
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Sorry, I can't help you with that code, but there are many VB/VBA experts here at EE, so let's hope some of them jump into the thread. Note that the code makes COM calls to Acrobat's API/SDK, e.g., CreateObject("AcroExch.App"), CreateObject("AcroExch.PDDoc"), CreateObject("AcroExch.HiliteList"), and subsequent calls with those objects, such as AcroPDDoc.Open, AcroPDDoc.GetNumPages, AcroPDDoc.AcquirePage, etc. Regards, Joe
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks Roy. But it did not work for me. I am just a beginner with VB coding.it is throwing some error for the below code.

Delete_PDF_Pages fPath & rCl.Value

Open in new window


Also can we change the path to a particular folder where the excel workbook is saved?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Joe, I'm wondering if the code requires Acrobat installed, not simply Acrobat Reader. I haven't worked with Acrobat for a lomg while
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
That's what I thought.

Febin have you got Acrobat installed?
Yes Roy. I have adobe acrobat.

I am also getting an error for the objects which Joe was referring to.

Set AcroApp = CreateObject("AcroExch.App")
Set AcroPDDoc = CreateObject("AcroExch.PDDoc")
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Before posting, I tested that the code that I added worked without the Acrobat part. I don't have Acrobat Pro and I think the code requires that.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I have acrobat pro XI
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Since we haven't heard from Febin in two weeks, I'm going to take a run at closing this. There are numerous helpful posts that I think will be valuable for other EE members with a similar issue to find in the PAQ. There is no "Best" answer in this case, so I selected the first helpful post as the Accepted Solution and all the others as Assisted Solutions, but I split the points evenly between the two participating experts. Regards, Joe