Solved

Acrobat Professional Delete pages based on Excel File

Posted on 2013-01-04
Last Modified: 2013-01-08
Hi,

I have a 2,000-page PDF and Adobe Acrobat Professional. I need to delete pages from this file if they contain a number that is found in an MS Excel file.

The MS Excel file is in 2003 format. It contains a list of numbers. Any page in the PDF file that contains one of these numbers needs to be deleted.

I'm guessing this is only possible with JavaScript, which I am only mildly familiar with, so I have no idea how to do this.

Thanks for your help.

Josh
Question by:joshcallahan1
7 Comments
 
LVL 7

Expert Comment

by:whosbetterthanme
ID: 38744682
This article can help you:

http://commonsenseatheism.com/?p=8335
 

Author Comment

by:joshcallahan1
ID: 38745208
Not quite what I mean.  I think there may be several ways to accomplish this.

-Print the Excel sheet (which is a list of numbers) to PDF and use it to create an index to the pages. I don't know how to do that.

-I think it would then be possible to delete all pages that have bookmark destinations on them.

I'm not sure how to do it, but this would be my best guess. If someone could send me some JavaScript for this, it would be great.

Or send another idea on how to solve this one. Thanks =)
 
LVL 26

Accepted Solution

by:
redmondb earned 500 total points
ID: 38748552
Hi, joshcallahan1.

Please see attached. You'll need to change the Adobe reference and also the code to point to the input and output files. I've not done much with PDFs in VBA, so please be careful!

The code is...
Option Explicit

Sub Delete_PDF_Pages()
' Adobe code based on http://vbcity.com/forums/t/51200.aspx
Dim xMsg           As String
Dim xInput         As String
Dim xOutput        As String
Dim xResponse      As Long
Dim xLast_Row      As Long
Dim xErrors        As Long
Dim xDeleted       As Long
Dim i              As Long
Dim j              As Long
Dim AcroApp        As CAcroApp
Dim AcroPDDoc      As CAcroPDDoc
Dim AcroHiliteList As CAcroHiliteList
Dim AcroTextSelect As CAcroPDTextSelect
Dim xarray()       As Variant
Dim PageNumber     As Variant
Dim PageContent    As Variant
Dim xContent       As Variant

xInput = "D:\TestPages.pdf"
xOutput = "D:\TestPages_Output.pdf"

xLast_Row = [A1].SpecialCells(xlLastCell).Row
ReDim xarray(xLast_Row)

xResponse = MsgBox("About to delete all pages which contain values from the range A1:A" & xLast_Row & Chr(10) _
            & Chr(10) & "Input:" & Chr(9) & xInput _
            & Chr(10) & "Output:" & Chr(9) & xOutput _
            & Chr(10) & Chr(10) & "('OK' to continue, 'Cancel' to quit.)", vbOKCancel, "Delete Pages")
If xResponse = 2 Then
    MsgBox "User chose not to continue. Run terminated."
    Exit Sub
End If

' Files and data OK?
' Append each problem so an earlier message is not overwritten by a later one.
If Dir(xInput) = "" Then xMsg = xMsg & "Input file not found - " & xInput & Chr(10)
If Dir(xOutput) <> "" Then xMsg = xMsg & "Output file exists - " & xOutput & Chr(10)
xarray = Application.Transpose(Range("A1:A" & xLast_Row))
For i = 1 To xLast_Row
    If Not IsNumeric(xarray(i)) Or xarray(i) = "" Then
        xMsg = xMsg & "Non-numeric ""Delete"" value of """ & xarray(i) & """ found on row " & i & Chr(10)
        Exit For
    End If
Next
If xMsg <> "" Then
    MsgBox (xMsg & Chr(10) & "Run cancelled.")
    Exit Sub
End If

' Open the PDF...
Set AcroApp = CreateObject("AcroExch.App")
Set AcroPDDoc = CreateObject("AcroExch.PDDoc")
If AcroPDDoc.Open(xInput) <> True Then
    MsgBox (xInput & " couldn't be opened - run cancelled.")
    Exit Sub
End If

' Read each page...
For i = AcroPDDoc.GetNumPages - 1 To 0 Step -1

    Set PageNumber = AcroPDDoc.AcquirePage(i)
    Set PageContent = CreateObject("AcroExch.HiliteList")

    'Get up to 9,999 words from page...
    If PageContent.Add(0, 9999) <> True Then
        
        Debug.Print "Add Error on Page " & i + 1
        xErrors = xErrors + 1
    
    Else

        Set AcroTextSelect = PageNumber.CreatePageHilite(PageContent)
    
        If Not AcroTextSelect Is Nothing Then
            xContent = ""
            For j = 0 To AcroTextSelect.GetNumText - 1
                xContent = xContent & AcroTextSelect.GetText(j)
            Next j
            For j = 1 To xLast_Row
                If InStr(1, xContent, xarray(j)) > 0 Then
                    Debug.Print "Page " & i + 1 & " contains " & xarray(j) & " - " & xContent
                    ' To avoid problems with the delete...
                    Set AcroTextSelect = Nothing
                    Set PageContent = Nothing
                    Set PageNumber = Nothing
                    If AcroPDDoc.DeletePages(i, i) = False Then
                        MsgBox ("Error deleting page " & i + 1 & " - run cancelled.")
                        ' Release the document and Acrobat before bailing out.
                        AcroPDDoc.Close
                        AcroApp.Exit
                        Exit Sub
                    End If
                    xDeleted = xDeleted + 1
                    Exit For
                End If
             Next
        End If
        
    End If

Next i

If AcroPDDoc.Save(PDSaveFull, xOutput) = False Then
    MsgBox "Cannot save the modified document"
Else
    MsgBox (xDeleted & " pages deleted. (" & xErrors & " errors.)")
End If

' Always release the document and Acrobat, even if the save failed.
AcroPDDoc.Close
AcroApp.Exit

End Sub


Regards,
Brian.

Attachments: Delete-PDF-Pages.xls, TestPages.pdf
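One thing to keep in mind with the InStr test in the code above: InStr matches substrings, so a list value of 123 would also flag a page that only contains 41234. If that matters for your data, a stricter check may help. The following is only a minimal sketch, assuming the list values are plain whole numbers; ContainsWholeNumber is a hypothetical helper name and is not part of the attached workbook.

```vba
' Hypothetical helper (not in the original answer).
' Returns True if sNumber occurs in sText as a whole number, i.e. the
' match is not immediately preceded or followed by another digit.
Function ContainsWholeNumber(ByVal sText As String, ByVal sNumber As String) As Boolean
    Dim pos      As Long
    Dim nextPos  As Long
    Dim beforeOk As Boolean
    Dim afterOk  As Boolean

    If Len(sNumber) = 0 Then Exit Function

    pos = InStr(1, sText, sNumber)
    Do While pos > 0
        ' The character before the match must not be a digit.
        If pos = 1 Then
            beforeOk = True
        Else
            beforeOk = Not (Mid$(sText, pos - 1, 1) Like "#")
        End If

        ' The character after the match must not be a digit.
        nextPos = pos + Len(sNumber)
        If nextPos > Len(sText) Then
            afterOk = True
        Else
            afterOk = Not (Mid$(sText, nextPos, 1) Like "#")
        End If

        If beforeOk And afterOk Then
            ContainsWholeNumber = True
            Exit Function
        End If

        ' Keep looking for a later occurrence.
        pos = InStr(pos + 1, sText, sNumber)
    Loop
End Function
```

To try it, the line `If InStr(1, xContent, xarray(j)) > 0 Then` could be swapped for `If ContainsWholeNumber(xContent, CStr(xarray(j))) Then`. Note that this only looks at neighbouring digits, so 12 would still match inside 12.5.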

 

Author Closing Comment

by:joshcallahan1
ID: 38751692
Wow!! I just ran this as a macro from Excel after changing the PDF file paths, and it worked on the first try. Best answer I've gotten on EE so far. Thanks!!
 
LVL 26

Expert Comment

by:redmondb
ID: 38751964
Thanks, joshcallahan1, glad to help!
 

Author Comment

by:joshcallahan1
ID: 38755977
Oh, I forgot to mention that in the VBA editor for the Excel file I had to go to Tools > References and tick the various Adobe options; it won't work without that.
 
LVL 26

Expert Comment

by:redmondb
ID: 38757576
Thanks, joshcallahan1. I'd cryptically mentioned that ("You'll need to change the Adobe reference"), but your post is much clearer for anyone coming across this.
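For anyone who cannot set the Tools > References entry, late binding is a common VBA alternative: declare the Acrobat objects As Object and create them with CreateObject, as the code above already does. The following is only a minimal sketch, assuming full Acrobat (not just Reader) is installed; the sub name is made up for illustration, and it only opens the sample file and reports the page count.

```vba
Option Explicit

' Late-bound code cannot see the Acrobat type library's constants, so any
' that are needed must be declared by hand. PDSaveFull is 1 in that library
' and would be needed for a call such as AcroPDDoc.Save(PDSaveFull, xOutput).
Private Const PDSaveFull As Long = 1

Sub Count_PDF_Pages_Late_Bound()
    Dim AcroPDDoc As Object          ' late-bound, instead of CAcroPDDoc
    Dim xInput    As String

    xInput = "D:\TestPages.pdf"      ' same sample path as the code above

    Set AcroPDDoc = CreateObject("AcroExch.PDDoc")
    If AcroPDDoc.Open(xInput) <> True Then
        MsgBox xInput & " couldn't be opened."
        Exit Sub
    End If

    MsgBox xInput & " has " & AcroPDDoc.GetNumPages & " pages."
    AcroPDDoc.Close
End Sub
```

The trade-off is that late binding gives up IntelliSense and compile-time checking of the Acrobat method names, so setting the reference as described above is still the simpler route when it is available.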
