Solved

Can you find the number of pages in a PDF without opening it?

Posted on 2006-07-16
Last Modified: 2012-08-13
Hello,

Is there a way to find out how many pages a PDF has WITHOUT opening it?

I have found out that a batch of documents scanned to PDF were scanned incorrectly: only the first page was captured instead of the normal 3-5 pages.  Possibly a couple of hundred, out of a total of 900, were scanned incorrectly.  If I could find ALL of the PDFs that are only ONE page, I could re-assign the task to the clerk and have those re-scanned.

Is this possible?

Thanks in advance for your answers!

Phil
Question by:pelampe
Expert Comment

by:Karl Heinz Kremer
ID: 17121124
That depends on your definition of "opening"... You cannot determine the number of pages without opening the file - software needs to read the document in order to determine how many pages are in it. However, this can be done without manually opening the file in Acrobat (or Reader).

If you have the full version of Acrobat, you could run a batch sequence (assuming that all 900 documents are in one, or in a very limited number of directories). The batch sequence would run a JavaScript program to read the number of pages, and if it's 1, it would print a message to a file.

If you don't want to get into JavaScript (or don't have the full version of Acrobat), you could use the free pdftk (http://www.accesspdf.com/pdftk/) and use a batch script to run the program on all your files. If you use the "dump_data" command, it will report the number of pages in the document:

C:\temp>pdftk test.pdf dump_data
InfoKey: Creator
InfoValue: PScript5.dll Version 5.2
InfoKey: Title
InfoValue: Microsoft Word - test.doc
InfoKey: HDIG_ModDate
InfoValue: D:20060616141835-04'00'
InfoKey: Producer
InfoValue: GNU Ghostscript 7.05
InfoKey: Author
InfoValue: Noel
InfoKey: CreationDate--Text
InfoValue: 11/18/2005 15:17:16
InfoKey: ModDate
InfoValue: D:20060616141835-04'00'
InfoKey: CreationDate
InfoValue: 11/18/2005 15:17:16
PdfID0: 23d2e8ee9b1aab4c93252af6d89f57f6
PdfID1: 23d2e8ee9b1aab4c93252af6d89f57f6
NumberOfPages: 15


As you can see, the last line contains the number of pages in the document.

All you need to do is to come up with a script that can run this program on all files and evaluate the output line that starts with "NumberOfPages:".
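
As a rough illustration of such a script, here is a minimal Python sketch. It assumes Python 3.7 or later is installed, that pdftk is on the PATH, and that all the PDFs sit directly in one folder. The function names and the script name find_single_page.py are made up for this example; only the "dump_data" command and the "NumberOfPages:" line come from the output shown above.

```python
import subprocess
import sys
from pathlib import Path


def parse_page_count(dump_text):
    """Return the number on the 'NumberOfPages:' line, or None if it is missing."""
    for line in dump_text.splitlines():
        if line.startswith("NumberOfPages:"):
            return int(line.split(":", 1)[1].strip())
    return None


def find_single_page_pdfs(folder, pdftk="pdftk"):
    """Run 'pdftk <file> dump_data' on every PDF in folder; return the one-page files."""
    single = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        result = subprocess.run(
            [pdftk, str(pdf), "dump_data"],
            capture_output=True,   # collect pdftk's output instead of printing it
            text=True,
            errors="replace",      # tolerate odd bytes in the metadata values
        )
        if parse_page_count(result.stdout) == 1:
            single.append(pdf)
    return single


def main(argv):
    # Assumed usage: python find_single_page.py FOLDER
    if len(argv) != 2:
        print("usage: python find_single_page.py FOLDER")
        return
    for pdf in find_single_page_pdfs(argv[1]):
        print(pdf)


if __name__ == "__main__":
    main(sys.argv)
```

Run against the folder of scans, this would print one path per single-page file, and the output can be redirected to a text file to hand to the clerk. Files that pdftk cannot read produce no "NumberOfPages:" line and are skipped by this sketch, so they would need a separate check.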

Author Comment

by:pelampe
ID: 17124640
OK. I've downloaded pdftk and it is working fine.  But writing a batch program is well beyond my realm of knowledge.  Can you provide me with any hints or info as to how to go about it?

BTW, I do also have the FULL version of Acrobat 7, FWIW.
Accepted Solution

by:Karl Heinz Kremer (earned 200 total points)
ID: 17124886
I can tell you how to do this with a batch sequence in Acrobat (assuming that all the files are in one directory). I don't have any experience in Windows batch programming, so I'm not the best person to ask to come up with a script.

In Acrobat select Advanced>Batch Processing. On the new dialog that pops up click on the "New Sequence..." button. Specify a name (e.g. Report 1 Page Documents). On the new window click on the "Select Commands..." button. Scroll down to the "Execute JavaScript" element and add that to the list of commands (either by double-clicking it, or by selecting it and using the Add>> button). Double-click on the "Execute JavaScript" list item in the right half of the dialog. This will bring up an editor window. Paste the following code into that editor:

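// Runs once for each document in the batch; "this" is the current document.
if (this.numPages == 1)
{
    console.show();              // open the JavaScript console window
    console.println(this.path);  // write the path of the one-page file to the console
}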

Click on "OK".
Click on "OK".

Select "Run commands on:" "Selected Folder" and click on the "Browse" button and specify the folder that contains the 900 documents (or a subset of those documents). If you have non-PDF files in that folder, you should click on the "Source File Options" button and deselect all other formats.

Select "Don't save changes" for the "Output Location".

Click on "OK".

Click on the "Run Sequence" button.
Click on "OK". -> This will start the sequence.
If at least one document with one page is found, the console window will open, and will contain all the file names that have only one page. Copy and paste that list into e.g. Notepad so that you can work on the list of files.

Author Comment

by:pelampe
ID: 17125127
Karl,

WOW!  Worked fantastic!

I am increasing your points too.

Thanks for your help!

Phil