How to find blank pages in PDF files?

abdul hameed
We have a requirement to find any blank/empty pages in PDF files. There are 4 million PDF files that need to be validated for this condition, and a single PDF can have 10k-12k pages, so we need a script to automate this work.

Thanks in Advance!

OS: Windows

Using iText, here is code taken from

The code below deletes the blank pages; you will probably just need to write the file name and the page number to some form of log.


import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.io.RandomAccessSourceFactory;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfDictionary;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.RandomAccessFileOrArray;

public class RemoveBlankPageFromPDF {

    // Content size (in bytes) above which a page is considered non-blank.
    // Can be much higher or lower depending on what is considered a blank page.
    public static final int BLANK_THRESHOLD = 160;

    public static void removeBlankPdfPages(String source, String destination)
        throws IOException, DocumentException
    {
        PdfReader r = null;
        RandomAccessSourceFactory rasf = null;
        RandomAccessFileOrArray raf = null;
        Document document = null;
        PdfCopy writer = null;

        try {
            r = new PdfReader(source);
            // deprecated
            //    RandomAccessFileOrArray raf
            //           = new RandomAccessFileOrArray(pdfSourceFile);
            // itext 5.4.1
            rasf = new RandomAccessSourceFactory();
            raf = new RandomAccessFileOrArray(rasf.createBestSource(source));
            document = new Document(r.getPageSizeWithRotation(1));
            writer = new PdfCopy(document, new FileOutputStream(destination));
            // the document must be opened before pages can be added
            document.open();
            PdfImportedPage page = null;

            for (int i=1; i<=r.getNumberOfPages(); i++) {
                // first check, examine the resource dictionary for /Font or
                // /XObject keys.  If either are present -> not blank.
                PdfDictionary pageDict = r.getPageN(i);
                // getAsDict resolves an indirect reference; a plain cast can fail on one
                PdfDictionary resDict = pageDict.getAsDict( PdfName.RESOURCES );
                boolean noFontsOrImages = true;
                if (resDict != null) {
                  noFontsOrImages = resDict.get( PdfName.FONT ) == null &&
                                    resDict.get( PdfName.XOBJECT ) == null;
                }
                System.out.println(i + " noFontsOrImages " + noFontsOrImages);

                // pages without fonts or images are treated as blank and skipped
                if (!noFontsOrImages) {
                    // second check, the size of the page content stream
                    byte bContent [] = r.getPageContent(i,raf);
                    ByteArrayOutputStream bs = new ByteArrayOutputStream();
                    bs.write(bContent);
                    // the separator keeps i and bs.size() from being added as numbers
                    System.out.println
                      (i + ": " + bs.size() + " > BLANK_THRESHOLD " +  (bs.size() > BLANK_THRESHOLD));
                    if (bs.size() > BLANK_THRESHOLD) {
                        page = writer.getImportedPage(r, i);
                        writer.addPage(page);
                    }
                }
            }
        }
        finally {
            if (document != null) document.close();
            if (writer != null) writer.close();
            if (raf != null) raf.close();
            if (r != null) r.close();
        }
    }

    public static void main (String ... args) throws Exception {
        removeBlankPdfPages
            ("C://temp//documentwithblank.pdf", "C://temp//documentwithnoblank.pdf");
    }
}
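
Following up on the note above about logging the file name and page number rather than deleting pages, here is a rough sketch of a batch driver. It is not from the original answer: `BlankPageScanner`, `PageCheck`, and the tab-separated log format are made-up names for illustration. The per-page test is passed in as a parameter, so the iText check from `removeBlankPdfPages` (or any other definition of "blank") could be plugged in. It uses only the JDK.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

public class BlankPageScanner {

    /** Hypothetical hook: returns the 1-based numbers of the blank pages in one PDF. */
    public interface PageCheck {
        List<Integer> blankPages(Path pdf) throws IOException;
    }

    /**
     * Walks root recursively and writes one "file<TAB>page" line per blank page.
     * A file that cannot be processed is logged as "file<TAB>ERROR<TAB>message"
     * and skipped, so one corrupt PDF does not stop a long run.
     * Returns the number of blank pages found.
     */
    public static int scan(Path root, PageCheck check, Writer log) throws IOException {
        PrintWriter out = new PrintWriter(log);
        int blankCount = 0;
        // Files.walk is lazy, so the full file list is never held in memory.
        try (Stream<Path> paths = Files.walk(root)) {
            Iterator<Path> it = paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString()
                              .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                .iterator();
            while (it.hasNext()) {
                Path pdf = it.next();
                try {
                    for (int page : check.blankPages(pdf)) {
                        out.println(pdf + "\t" + page);
                        blankCount++;
                    }
                } catch (IOException | RuntimeException e) {
                    out.println(pdf + "\tERROR\t" + e.getMessage());
                }
            }
        }
        out.flush(); // the caller owns the writer, so flush but do not close it
        return blankCount;
    }
}
```

A call might look like `BlankPageScanner.scan(Paths.get("C:/temp"), check, new FileWriter("blank-pages.log"))`, where `check` wraps the resource and content-size tests and returns page numbers instead of copying pages. With millions of files, one option is to run several scans on separate subfolders, each with its own log.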

Is there any other way to do the same?
Joe Winograd, Developer
Fellow 2017
Most Valuable Expert 2018
What is your definition of a "blank/empty" page? For example, attached are five one-page PDFs, as follows:

(1) created from a Word file with nothing in it

(2) created from a Word file with some tabs and spaces in it

(3) created from a Word file with a footer that has a page number in it, but nothing else

(4) created by a scanner that scanned at 300 DPI in black and white; it is visually "blank/empty"

(5) created by a scanner that scanned at 200 DPI in color; it is visually "blank/empty"
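
Cases (4) and (5) are scanned images, so whether such a page is blank depends on its pixels and not on the PDF structure. A rough sketch of a visual test follows, assuming each page has already been rendered to a `BufferedImage` by some PDF renderer (not shown, and not part of this thread). It counts the pixels darker than a cutoff and calls the page blank if their share is tiny. The class name, `DARK_LUMA`, and `MAX_DARK_FRACTION` are illustrative assumptions to tune on real scans, not recommended values.

```java
import java.awt.image.BufferedImage;

public class VisualBlankCheck {

    // Assumed cutoffs for illustration only; tune them on real scans.
    static final int DARK_LUMA = 128;              // 0 = black, 255 = white
    static final double MAX_DARK_FRACTION = 0.001; // tolerate 0.1% speckle noise

    /** Returns the share of pixels whose brightness is below DARK_LUMA. */
    public static double darkFraction(BufferedImage img) {
        long dark = 0;
        int w = img.getWidth();
        int h = img.getHeight();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of Rec. 601 luma.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                if (luma < DARK_LUMA) {
                    dark++;
                }
            }
        }
        return (double) dark / ((long) w * h);
    }

    /** True when the page has at most MAX_DARK_FRACTION dark pixels. */
    public static boolean looksBlank(BufferedImage img) {
        return darkFraction(img) <= MAX_DARK_FRACTION;
    }
}
```

Under this definition a page with only a small footer page number, as in case (3), may or may not count as blank depending on the tolerance, which is why the definition has to be settled first.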

Which of these do you consider "blank/empty"?

Regards, Joe
Joe Winograd, Developer

Dan's code provides a solution based on size being the criterion for a blank/empty page, while Joe's post discusses important issues regarding the very definition of a blank/empty page. Most of the credit to Dan, some to Joe.
