Extract Pages from PDF file with Acrobat Professional

Posted on 2011-04-28
Last Modified: 2012-05-11

I am just starting a project that I think is going to be a real pain in the #$%%. I was hoping someone could point me in the right direction or give me a link. I have a large PDF file containing many 2-page policies. I also have a report that lists each policy number and the page in the PDF file where that policy starts.

My task is to create an individual PDF file for each policy, with its unique policy number as the file name.

I have thousands of policies to extract for input to another system. Any direction you can provide would be greatly appreciated. Thanks,

Question by:joshcallahan1
    LVL 29

    Expert Comment

    by:Randy Downs

    Author Comment


    Thanks for the comment. These don't directly solve my problem, but the same company puts out a splitter that may work with a little VBScript to liven it up. The program splits out the page ranges you specify, which I know are in intervals of 2. Depending on the naming convention used for the output files, I may be able to rename them to match the index file that I have. Here is the command line from the program to split out the pages:

    "C:\Program Files\AdvancedReliableSoftware\AdvancedCommandLinePdfSplitter\aclpdfsplit.exe"
    -sds "C:\InputDirectory" "C:\OutputDirectory" 1-2, 3-4, 4-5, 6-7 


    If I can rename the output files using my index file and VBScript, I may be in business. In the meantime, any other comments are welcome, as this workaround requires a purchase, extra work, and multiple steps.
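Since the policies sit at fixed two-page intervals, the list of page ranges does not have to be typed by hand. The following is a minimal sketch in plain Java (no PDF library involved) that only builds the range text in the "1-2, 3-4" style shown above. The class name `PageRanges` and its `build` method are hypothetical, and how the splitter names its output files is still an open question that this does not answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds page-range text for fixed-size policies.
final class PageRanges {

    // Returns ranges such as "1-2, 3-4, 5-6" covering pages 1..pageCount.
    // The last range is shortened if pageCount is not a multiple of pagesPerPolicy.
    static String build(int pageCount, int pagesPerPolicy) {
        if (pageCount < 0 || pagesPerPolicy < 1) {
            throw new IllegalArgumentException("pageCount must be >= 0 and pagesPerPolicy >= 1");
        }
        List<String> ranges = new ArrayList<>();
        for (int first = 1; first <= pageCount; first += pagesPerPolicy) {
            int last = Math.min(first + pagesPerPolicy - 1, pageCount);
            ranges.add(first + "-" + last);
        }
        return String.join(", ", ranges);
    }
}
```

Calling `PageRanges.build(pageCount, 2)` with the page count of the large file gives the text to paste after the output directory on the command line.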

    LVL 2

    Accepted Solution

    The best library out there for PDF manipulation is iText, or its .NET offshoot, iTextSharp.
    The documentation is extensive, but you may need to translate examples from the original Java iText project to work with .NET.

    The code below isn't perfect, but should get you going towards your solution.

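    // C# (iTextSharp 5.x). Requires: using System; using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
    public static void splitMyPDF(String inputpdf, String outputFolder) {
    	try {
    		PdfReader reader = new PdfReader(inputpdf);
    		// Loop through the file two pages at a time; iText page numbers start at 1
    		for (int i = 1; i <= reader.NumberOfPages; i = i + 2) {
    			extractSubDocToPDF(reader, i, i + 2, "PAGE" + i + ".pdf", outputFolder);
    		}
    		reader.Close();
    	}
    	catch (IOException e) {
    		Console.WriteLine("Could not read " + inputpdf + ": " + e.Message);
    	}
    }

    // Copies pages pageFrom (inclusive) up to pageTo (exclusive) into a new PDF file
    private static void extractSubDocToPDF(PdfReader reader, int pageFrom, int pageTo, String outputName, String outputFolder) {
    	// Start with the size of the first source page; page rotation is not preserved
    	Document document = new Document(reader.GetPageSize(pageFrom));
    	try {
    		// Create a writer for the output stream
    		PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(Path.Combine(outputFolder, outputName), FileMode.Create));
    		document.Open();
    		PdfContentByte cb = writer.DirectContent; // Holds the PDF data
    		PdfImportedPage page;
    		while (pageFrom < pageTo && pageFrom <= reader.NumberOfPages) {
    			document.SetPageSize(reader.GetPageSize(pageFrom));
    			document.NewPage();
    			page = writer.GetImportedPage(reader, pageFrom);
    			cb.AddTemplate(page, 0, 0);
    			pageFrom++; // Move on to the next source page
    		}
    	}
    	catch (Exception ex) {
    		Console.WriteLine("Could not write " + outputName + ": " + ex.Message);
    	}
    	finally {
    		if (document.IsOpen())
    			document.Close();
    	}
    }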
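To name each file after its policy number instead of "PAGE" plus a page number, the index report has to be turned into file names and page ranges first. Below is a rough sketch of that step in plain Java with no iText dependency, so it should port to C# with little change. It assumes the report can be exported as one `policyNumber,firstPage` pair per line; that layout, the class name `PolicyIndex`, and the sample policy numbers in the usage note are assumptions, not something taken from the actual report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns report lines into output file names and page ranges.
final class PolicyIndex {

    // One policy: output file name plus first and last page (1-based, inclusive).
    static final class Entry {
        final String fileName;
        final int firstPage;
        final int lastPage;

        Entry(String fileName, int firstPage, int lastPage) {
            this.fileName = fileName;
            this.firstPage = firstPage;
            this.lastPage = lastPage;
        }
    }

    // Parses lines in the ASSUMED format "policyNumber,firstPage"; blank lines are skipped.
    static List<Entry> parse(List<String> reportLines, int pagesPerPolicy) {
        List<Entry> entries = new ArrayList<>();
        for (String line : reportLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split(",");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected policyNumber,firstPage but got: " + line);
            }
            String policyNumber = parts[0].trim();
            int firstPage = Integer.parseInt(parts[1].trim());
            entries.add(new Entry(policyNumber + ".pdf", firstPage, firstPage + pagesPerPolicy - 1));
        }
        return entries;
    }
}
```

For example, a report line such as `POL-1001,1` with two pages per policy would yield the file name `POL-1001.pdf` covering pages 1 to 2. Each entry could then drive one call to the extraction routine, passing `firstPage`, `lastPage + 1` (that routine treats its upper bound as exclusive), and `fileName`.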



    Author Comment


    Thanks for the comment; it sounds like you know a bunch about JavaScript. I'm really green on it. When I get time, I'm going to try to create an Access database that generates the batch file commands to split the files. If that doesn't work, I'll try out your example.


    LVL 31

    Expert Comment

    by:James Murrell
    This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
