Febin Mathew
asked on
Delete PDF pages based on conditions
I am looking for a macro to delete the pages from a PDF file based on certain conditions in an excel file. I have few pdf files in a particular folder patch. Macro has to look into those files and delete few pages based in the condition. See attached excel.Column B has the PDF file name. Column C, D & E has conditions. If column C says "No" for a PDF file page 5 should be deleted. Similarly for column D & E. Thanks for your help.
Test.xlsm
Test.xlsm
ASKER
Thanks Joe. I uploaded the attachment.
Test.xlsm
Test.xlsm
You cannot delete pages from a pdf. You would need to re-create the pdf excluding the pages.
Where are the pdfs created from?
Where are the pdfs created from?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I got the below code from another site. It can delete the pages for one particular file, not for a list of files. Not sure how to edit it for a list.
Sub Delete_PDF_Pages()
' Adobe code based on http://vbcity.com/forums/t/51200.aspx
Dim xMsg As String
Dim xInput As String
Dim xOutput As String
Dim xResponse As Long
Dim xLast_Row As Long
Dim xErrors As Long
Dim xDeleted As Long
Dim i As Long
Dim j As Long
Dim AcroApp As CAcroApp
Dim AcroPDDoc As CAcroPDDoc
Dim AcroHiliteList As CAcroHiliteList
Dim AcroTextSelect As CAcroPDTextSelect
Dim xarray() As Variant
Dim PageNumber As Variant
Dim PageContent As Variant
Dim xContent As Variant
xInput = "C:\Users\Febin.Mathew\Des ktop\CA 3522\FM testing\CA 568 SMLLC\TestPages.pdf"
xOutput = "C:\Users\Febin.Mathew\Des ktop\CA 3522\FM testing\CA 568 SMLLC\TestPages_Output.pdf "
xLast_Row = [A1].SpecialCells(xlLastCe ll).Row
ReDim xarray(xLast_Row)
xResponse = MsgBox("About to delete all pages which contain values from the range A1:A" & xLast_Row & Chr(10) _
& Chr(10) & "Input:" & Chr(9) & xInput _
& Chr(10) & "Output:" & Chr(9) & xOutput _
& Chr(10) & Chr(10) & "('OK' to continue, 'Cancel' to quit.)", vbOKCancel, "Delete Pages")
If xResponse = 2 Then
MsgBox "User chose not to continue. Run terminated."
Exit Sub
End If
' Files and data OK?
If Dir(xInput) = "" Then xMsg = "Input file not found - " & xInput & Chr(10)
If Dir(xOutput) <> "" Then xMsg = "Output file exists - " & xOutput & Chr(10)
xarray = Application.Transpose(Rang e("A1:A" & xLast_Row))
For i = 1 To xLast_Row
If Not IsNumeric(xarray(i)) Or xarray(i) = "" Then
xMsg = "Non-numeric ""Delete"" value of """ & xarray(i) & """ found on row " & i & Chr(10)
Exit For
End If
Next
If xMsg <> "" Then
MsgBox (xMsg & Chr(10) & "Run cancelled.")
Exit Sub
End If
' Open the PDF...
Set AcroApp = CreateObject("AcroExch.App ")
Set AcroPDDoc = CreateObject("AcroExch.PDD oc")
If AcroPDDoc.Open(xInput) <> True Then
MsgBox (xInput & " couldn't be opened - run cancelled.")
Exit Sub
End If
' Read each page...
For i = AcroPDDoc.GetNumPages - 1 To 0 Step -1
Set PageNumber = AcroPDDoc.AcquirePage(i)
Set PageContent = CreateObject("AcroExch.Hil iteList")
'Get up to 9,999 words from page...
If PageContent.Add(0, 9999) <> True Then
Debug.Print "Add Error on Page " & i + 1
xErrors = xErrors + 1
Else
Set AcroTextSelect = PageNumber.CreatePageHilit e(PageCont ent)
If Not AcroTextSelect Is Nothing Then
xContent = ""
For j = 0 To AcroTextSelect.GetNumText - 1
xContent = xContent & AcroTextSelect.GetText(j)
Next j
For j = 1 To xLast_Row
If InStr(1, xContent, xarray(j)) > 0 Then
Debug.Print "Page " & i + 1 & " contains " & xarray(j) & " - " & xContent
' To avoid problems with the delete...
Set AcroTextSelect = Nothing
Set PageContent = Nothing
Set PageNumber = Nothing
If AcroPDDoc.DeletePages(i, i) = False Then
MsgBox ("Error deleting page " & i + 1 & " - run cancelled.")
Exit Sub
End If
xDeleted = xDeleted + 1
Exit For
End If
Next
End If
End If
Next i
If AcroPDDoc.Save(PDSaveFull, xOutput) = False Then
MsgBox "Cannot save the modified document"
Exit Sub
Else
MsgBox (xDeleted & " pages deleted. (" & xErrors & " errors.)")
End If
AcroPDDoc.Close
AcroApp.Exit
End Sub
Sub Delete_PDF_Pages()
' Adobe code based on http://vbcity.com/forums/t/51200.aspx
Dim xMsg As String
Dim xInput As String
Dim xOutput As String
Dim xResponse As Long
Dim xLast_Row As Long
Dim xErrors As Long
Dim xDeleted As Long
Dim i As Long
Dim j As Long
Dim AcroApp As CAcroApp
Dim AcroPDDoc As CAcroPDDoc
Dim AcroHiliteList As CAcroHiliteList
Dim AcroTextSelect As CAcroPDTextSelect
Dim xarray() As Variant
Dim PageNumber As Variant
Dim PageContent As Variant
Dim xContent As Variant
xInput = "C:\Users\Febin.Mathew\Des
xOutput = "C:\Users\Febin.Mathew\Des
xLast_Row = [A1].SpecialCells(xlLastCe
ReDim xarray(xLast_Row)
xResponse = MsgBox("About to delete all pages which contain values from the range A1:A" & xLast_Row & Chr(10) _
& Chr(10) & "Input:" & Chr(9) & xInput _
& Chr(10) & "Output:" & Chr(9) & xOutput _
& Chr(10) & Chr(10) & "('OK' to continue, 'Cancel' to quit.)", vbOKCancel, "Delete Pages")
If xResponse = 2 Then
MsgBox "User chose not to continue. Run terminated."
Exit Sub
End If
' Files and data OK?
If Dir(xInput) = "" Then xMsg = "Input file not found - " & xInput & Chr(10)
If Dir(xOutput) <> "" Then xMsg = "Output file exists - " & xOutput & Chr(10)
xarray = Application.Transpose(Rang
For i = 1 To xLast_Row
If Not IsNumeric(xarray(i)) Or xarray(i) = "" Then
xMsg = "Non-numeric ""Delete"" value of """ & xarray(i) & """ found on row " & i & Chr(10)
Exit For
End If
Next
If xMsg <> "" Then
MsgBox (xMsg & Chr(10) & "Run cancelled.")
Exit Sub
End If
' Open the PDF...
Set AcroApp = CreateObject("AcroExch.App
Set AcroPDDoc = CreateObject("AcroExch.PDD
If AcroPDDoc.Open(xInput) <> True Then
MsgBox (xInput & " couldn't be opened - run cancelled.")
Exit Sub
End If
' Read each page...
For i = AcroPDDoc.GetNumPages - 1 To 0 Step -1
Set PageNumber = AcroPDDoc.AcquirePage(i)
Set PageContent = CreateObject("AcroExch.Hil
'Get up to 9,999 words from page...
If PageContent.Add(0, 9999) <> True Then
Debug.Print "Add Error on Page " & i + 1
xErrors = xErrors + 1
Else
Set AcroTextSelect = PageNumber.CreatePageHilit
If Not AcroTextSelect Is Nothing Then
xContent = ""
For j = 0 To AcroTextSelect.GetNumText - 1
xContent = xContent & AcroTextSelect.GetText(j)
Next j
For j = 1 To xLast_Row
If InStr(1, xContent, xarray(j)) > 0 Then
Debug.Print "Page " & i + 1 & " contains " & xarray(j) & " - " & xContent
' To avoid problems with the delete...
Set AcroTextSelect = Nothing
Set PageContent = Nothing
Set PageNumber = Nothing
If AcroPDDoc.DeletePages(i, i) = False Then
MsgBox ("Error deleting page " & i + 1 & " - run cancelled.")
Exit Sub
End If
xDeleted = xDeleted + 1
Exit For
End If
Next
End If
End If
Next i
If AcroPDDoc.Save(PDSaveFull,
MsgBox "Cannot save the modified document"
Exit Sub
Else
MsgBox (xDeleted & " pages deleted. (" & xErrors & " errors.)")
End If
AcroPDDoc.Close
AcroApp.Exit
End Sub
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Sorry, I can't help you with that code, but there are many VB/VBA experts here at EE, so let's hope some of them jump into the thread. Note that the code makes COM calls to Acrobat's API/SDK, e.g., CreateObject("AcroExch.App "), CreateObject("AcroExch.PDD oc"), CreateObject("AcroExch.Hil iteList"), and subsequent calls with those objects, such as AcroPDDoc.Open, AcroPDDoc.GetNumPages, AcroPDDoc.AcquirePage, etc. Regards, Joe
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks Roy. But it did not work for me. I am just a beginner with VB coding.it is throwing some error for the below code.
Also can we change the path to a particular folder where the excel workbook is saved?
Delete_PDF_Pages fPath & rCl.Value
Also can we change the path to a particular folder where the excel workbook is saved?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Joe, I'm wondering if the code requires Acrobat installed, not simply Acrobat Reader. I haven't worked with Acrobat for a lomg while
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
That's what I thought.
Febin have you got Acrobat installed?
Febin have you got Acrobat installed?
ASKER
Yes Roy. I have adobe acrobat.
I am also getting an error for the objects which Joe was referring to.
Set AcroApp = CreateObject("AcroExch.App ")
Set AcroPDDoc = CreateObject("AcroExch.PDD oc")
I am also getting an error for the objects which Joe was referring to.
Set AcroApp = CreateObject("AcroExch.App
Set AcroPDDoc = CreateObject("AcroExch.PDD
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Before posting, I tested that the code that I added worked without the Acrobat part. I don't have Acrobat Pro and I think the code requires that.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I have acrobat pro XI
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Since we haven't heard from Febin in two weeks, I'm going to take a run at closing this. There are numerous helpful posts that I think will be valuable for other EE members with a similar issue to find in the PAQ. There is no "Best" answer in this case, so I selected the first helpful post as the Accepted Solution and all the others as Assisted Solutions, but I split the points evenly between the two participating experts. Regards, Joe
First, I see that you joined Experts Exchange today, so let me say — Welcome Aboard! Second, the Excel attachment is missing. Regards, Joe