Solved

Can you find the number of pages in a PDF without opening it?

Posted on 2006-07-16
Last Modified: 2012-08-13
Hello,

Is there a way to find out how many pages a PDF is WITHOUT opening it?

I have found out that a batch of documents scanned to PDF were scanned incorrectly: only the first page was scanned instead of the normal 3-5 pages.  A couple of hundred, out of a total of 900, may have been scanned incorrectly.  If I could find ALL of the PDFs that are only ONE page, then I could re-assign the task to the clerk and have those re-scanned.

Is this possible?

Thanks in advance for your answers!

Phil
Question by:pelampe
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 17121124
That depends on your definition of "opening"... You cannot determine the number of pages without opening the file - software needs to read the document in order to determine how many pages are in it. However, this can be done without manually opening the file in Acrobat (or Reader).

If you have the full version of Acrobat, you could run a batch sequence (assuming that all 900 documents are in one, or in a very limited number of directories). The batch sequence would run a JavaScript program to read the number of pages, and if it's 1, it would print a message to a file.

If you don't want to get into JavaScript (or don't have the full version of Acrobat), you could use the free pdftk (http://www.accesspdf.com/pdftk/) and use a batch script to run the program on all your files. If you use the "dump_data" command, it will report the number of pages in the document:

C:\temp>pdftk test.pdf dump_data
InfoKey: Creator
InfoValue: PScript5.dll Version 5.2
InfoKey: Title
InfoValue: Microsoft Word - test.doc
InfoKey: HDIG_ModDate
InfoValue: D:20060616141835-04'00'
InfoKey: Producer
InfoValue: GNU Ghostscript 7.05
InfoKey: Author
InfoValue: Noel
InfoKey: CreationDate--Text
InfoValue: 11/18/2005 15:17:16
InfoKey: ModDate
InfoValue: D:20060616141835-04'00'
InfoKey: CreationDate
InfoValue: 11/18/2005 15:17:16
PdfID0: 23d2e8ee9b1aab4c93252af6d89f57f6
PdfID1: 23d2e8ee9b1aab4c93252af6d89f57f6
NumberOfPages: 15


As you can see, the last line contains the number of pages in the document.

All you need to do is to come up with a script that can run this program on all files and evaluate the output line that starts with "NumberOfPages:".

Author Comment

by:pelampe
ID: 17124640
OK. I've downloaded the pdftk and it is working fine.  But how to write a batch program is well beyond my realm of knowledge.  Can you provide me with any hints or info as to how to go about it?

BTW, I do also have the FULL version of Acrobat 7, FWIW.
LVL 44

Accepted Solution

by:
Karl Heinz Kremer earned 200 total points
ID: 17124886
I can tell you how to do this with a batch sequence in Acrobat (assuming that all the files are in one directory). I don't have any experience with Windows batch programming, so I'm not the best person to ask for a script.

In Acrobat select Advanced>Batch Processing. In the dialog that pops up, click on the "New Sequence..." button. Specify a name (e.g. Report 1 Page Documents). In the new window click on the "Select Commands..." button. Scroll down to the "Execute JavaScript" element and add it to the list of commands (either by double-clicking it, or by selecting it and using the Add>> button). Double-click on the "Execute JavaScript" list item in the right half of the dialog. This will bring up an editor window. Paste the following code into that editor:

if (this.numPages == 1)
{
    console.show();
    console.println(this.path);
}

Click on "OK".
Click on "OK".

Select "Run commands on:" "Selected Folder" and click on the "Browse" button and specify the folder that contains the 900 documents (or a subset of those documents). If you have non-PDF files in that folder, you should click on the "Source File Options" button and deselect all other formats.

Select "Don't save changes" for the "Output Location".

Click on "OK".

Click on the "Run Sequence" button.
Click on "OK". -> This will start the sequence.
If at least one document with one page is found, the console window will open and will list the paths of all files that have only one page. Copy and paste that list into e.g. Notepad so that you can work on the list of files.

Author Comment

by:pelampe
ID: 17125127
Karl,

WOW!  Worked fantastic!

I am increasing your points too.

Thanks for your help!

Phil