  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 890

How to count the pages of a PDF via PHP

I have a client, a printer, who wants to automatically count the pages of uploaded PDF files via a PHP script (the number found will be incorporated into their online estimator). What would be the best way to do this? I know a PDF contains a page count, but I don't have enough PHP experience to know how to grab the number. The client says it must be PHP. Thanks
Asked by: atgdesign
1 Solution
 
EMB01Commented:
The attached code will find the number of pages in a PDF document via PHP.
<?php
        //where $file is the full path to your PDF document. 
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
?>

 
atgdesignAuthor Commented:
Admittedly, I just may not have enough understanding, but I couldn't get it to work.

I tried on my machine using MAMP and for the line that says
        //where $file is the full path to your PDF document.
I used  

$file = "ill_Instructions.pdf";

also tried

$file = "//ill_Instructions.pdf";

also

$file = "http://localhost:8888/count/ill_Instructions.pd;

Do I need some additional code to display the results, or would

return $count;

display the count on the screen?

Sorry, I am just getting into PHP, so maybe I just don't have enough of the basics to spot what I'm missing.

Thanks
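
On the display question: in PHP, `return` hands a value back to the caller and prints nothing, and a `return` at the top level of a script simply ends that script. A minimal sketch of the difference, where `get_page_count()` is a hypothetical stand-in for the counting code and its fixed value is for illustration only:

```php
<?php
// Hypothetical stand-in for the page-counting code; the fixed value
// is for illustration only.
function get_page_count() {
    return 3; // hands the value back to the caller, prints nothing
}

get_page_count();        // result is discarded, so the page stays blank
echo get_page_count();   // echo is what makes the number visible
```

If this reading is right, the script as posted would need its logic wrapped in a function and the result echoed before anything shows up in the browser.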
 
EMB01Commented:
It's no problem. I'll have you try using a similar variable for $file. I've also added error reporting. Let me know what happens, please. Thank you.
<?php
// error reporting
ini_set('display_errors','1');
error_reporting(E_ALL);
// if your file is at http://localhost:8888/count/ill_Instructions.pdf
$file = "/count/ill_Instructions.pdf";
        //where $file is the full path to your PDF document. 
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
?>

 
EMB01Commented:
By the way, where is this file on the server? If it is at http://localhost:8888/, there shouldn't be any problems.
 
atgdesignAuthor Commented:
Well, it may be something I am doing wrong. Right now the file is

http://localhost:8888/count/index.php

with the ill_Instructions.pdf file in the same directory.

I pasted your code into the index.php file and got a blank screen in Firefox 3.

I also put it on my server at http://atgdesign.com/count/index.php, pasting your code into the index.php file: blank screen, no error message.

ill_Instructions.pdf in same directory

http://atgdesign.com/count/ill_Instructions.pdf

Am I doing something wrong?

Thanks
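
One thing worth checking here, offered as a guess from the paths used in the thread: `file_exists()` and `fopen()` take filesystem paths, not URL paths, so a value like "/count/ill_Instructions.pdf" points at a `count` folder in the root of the disk rather than at the web folder. A sketch that builds the path from the script's own directory instead; `local_pdf_path()` is a hypothetical helper name, and `dirname(__FILE__)` is used because `__DIR__` needs PHP 5.3 or later:

```php
<?php
// Hypothetical helper: resolve a PDF that sits next to this script.
function local_pdf_path($name) {
    // basename() drops any directory part, so "/ill_Instructions.pdf"
    // and "ill_Instructions.pdf" resolve to the same file.
    return dirname(__FILE__) . DIRECTORY_SEPARATOR . basename($name);
}

$file = local_pdf_path('ill_Instructions.pdf');
echo file_exists($file) ? "Found: $file" : "Not found: $file";
```

Printing the resolved path either way makes it easier to see where PHP is actually looking.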
 
EMB01Commented:
Can you try this combination (with $file at http://localhost:8888/ill_Instructions.pdf)?
<?php
// error reporting
ini_set('display_errors','1');
error_reporting(E_ALL);
// if your file is at http://localhost:8888/ill_Instructions.pdf
$file = "/ill_Instructions.pdf";
        //where $file is the full path to your PDF document. 
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
?>

 
atgdesignAuthor Commented:
Still a blank screen. Not sure what's up.
 
EMB01Commented:
I'm going to run a live test on my server with a .PDF. Back in a moment.
 
EMB01Commented:
Okay, I changed the script a little. I turned it into a function and added a conditional. This way, you can simply echo the function like:
echo _countLinesInPDF('ill_Instructions.pdf');

It will either print the page count or echo "No Lines in .PDF." when no count is found and the function returns 0, which PHP treats as FALSE. (Despite the name, the function counts pages, not lines.)

Make sure the .PDF document is in the same folder as the script.

Please let me know if you have any questions.
<?php
function _countLinesInPDF($file) {
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
}
// conditional
if (_countLinesInPDF('buyback_form.pdf')) {
	echo _countLinesInPDF('buyback_form.pdf');
} else {
	echo "No Lines in .PDF.";
}
?>

 
EMB01Commented:
Here, I changed the script from using my file, to using your file, so that you don't have to mess with anything. You can just copy and paste this one.
<?php
function _countLinesInPDF($file) {
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
}
// conditional
if (_countLinesInPDF('ill_Instructions.pdf')) {
	_countLinesInPDF('ill_Instructions.pdf');
} else {
	echo "No Lines in .PDF.";
}
?>

 
atgdesignAuthor Commented:
I hope I am not doing something wrong, but I loaded it both at http://localhost:8888/count/index.php and at http://atgdesign.com/count/index.php, with the PDF in the same directory, and still get a blank browser screen.

It is PHP version 5.2.8. Not a clue what's up.

Thoughts?
 
EMB01Commented:
It works on my server at:
http://relogistechs.com/tech_test.php

Try checking permissions on the file to see if you can read it. You can do this with is_readable(), as added at the top of the script:
<?php
// check permissions
if (is_readable('ill_Instructions.pdf')) {
print "The file is readable.";
} else {
print "The file is NOT readable.";
}
function _countLinesInPDF($file) {
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,8152); 
                    } 
                    else { 
                          $contents = fread($handle, 1000); 
                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Saves reading the whole file. 
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
}
// conditional
if (_countLinesInPDF('ill_Instructions.pdf')) {
	_countLinesInPDF('ill_Instructions.pdf');
} else {
	echo "No Lines in .PDF.";
}
?>

 
atgdesignAuthor Commented:
says

The file is readable.

http://atgdesign.com/count/
 
atgdesignAuthor Commented:
I see yours returns a count
 
EMB01Commented:
How many pages are actually in your .PDF?
 
atgdesignAuthor Commented:
3
 
EMB01Commented:
You are running this code:
// conditional
if (_countLinesInPDF('ill_Instructions.pdf')) {
      _countLinesInPDF('ill_Instructions.pdf');
} else {
      echo "No Lines in .PDF.";
}

And, the screen is blank! That doesn't make sense. The function either evaluates to TRUE or FALSE.

I'll try to think of some other ideas to get this working on your server for your file.

Currently, I'm not sure what the problem is... Sorry.
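
A possible explanation, judging only from the snippet quoted above: the first branch calls `_countLinesInPDF()` but never echoes the result, so a successful count prints nothing and the page stays blank. A sketch of the conditional with the result stored once and printed; the function body below is a placeholder stub so the example runs on its own, not the real counter:

```php
<?php
// Placeholder stub so the sketch is self-contained; swap in the real
// _countLinesInPDF() from the thread.
function _countLinesInPDF($file) {
    return 3;
}

$pages = _countLinesInPDF('ill_Instructions.pdf'); // call once, keep the result
if ($pages) {
    echo $pages;               // without echo, nothing reaches the browser
} else {
    echo "No Lines in .PDF.";
}
```

Storing the result also avoids reading the PDF twice.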
 
EMB01Commented:
As we know, the script works on my server. So, maybe it's a problem with the .PDF file. Try a different .PDF if you get a chance. If it doesn't work with the new .PDF, the problem must be with the server.
 
atgdesignAuthor Commented:
Going to try a new pdf in a minute
 
atgdesignAuthor Commented:
http://atgdesign.com/count/

Changed pdfs

added a slash before name

// conditional
if (_countLinesInPDF('/MediaKit.pdf')) {
      _countLinesInPDF('/MediaKit.pdf');
} else {
      echo "No Lines in .PDF.";
}

22 page pdf, but says

The file is readable.No Lines in .PDF.

Server is at Hostgator, PHP Version 5.2.8


http://atgdesign.com/test.php for info

Seems like we are getting closer. I appreciate all the effort; not sure what it might be.
 
atgdesignAuthor Commented:
Interesting, I grabbed your PDF and it works (my PDFs were both fairly graphics-heavy, not sure that matters).

If you have a chance, grab this file and try it:

http://atgdesign.com/count/ill_Instructions.pdf

It would be really good to know why this and another of my test PDFs don't work, but I think your counter looks really good.
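
Two possible reasons some PDFs return nothing, offered as guesses rather than a diagnosis of these particular files. First, the `.*` in the pattern cannot span line breaks, so a `/Count` sitting several lines away from `/Type /Pages` can be missed. Second, PDF 1.5 and later may store the page tree inside compressed object streams, where no plain-text search will find it. The sketch below addresses the first case only, using a match that stays inside one dictionary and accepts either key order; `count_from_page_tree()` is a hypothetical name:

```php
<?php
// Hypothetical helper: largest /Count found in any /Type /Pages dictionary.
// Works on uncompressed page trees only; returns 0 when nothing matches.
function count_from_page_tree($contents) {
    $max = 0;
    $patterns = array(
        // [^>]*? can cross line breaks but stops at the closing ">>"
        '/\/Type\s*\/Pages\b[^>]*?\/Count\s+([0-9]+)/',  // /Type before /Count
        '/\/Count\s+([0-9]+)[^>]*?\/Type\s*\/Pages\b/',  // /Count before /Type
    );
    foreach ($patterns as $p) {
        if (preg_match_all($p, $contents, $m)) {
            $max = max($max, max(array_map('intval', $m[1])));
        }
    }
    return $max;
}

// Made-up dictionary with two lines between /Type /Pages and /Count.
$sample = "<< /Type /Pages\n/Kids [3 0 R]\n/MediaBox [0 0 612 792]\n/Count 22 >>";
echo count_from_page_tree($sample);
```

For files that keep the page tree in compressed object streams, a PDF parsing library is probably the more dependable route.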
 
atgdesignAuthor Commented:
Now it works. Who knows what I did wrong. Thanks, you probably deserve 1000 points. Thanks for all the help.
 
EMB01Commented:
I'll have to mess with this either a little later tonight or tomorrow if the following doesn't work:

- Changed the fread calls to read the total file size

P.S. This isn't my script; I just grabbed it from the internet and modified it accordingly.

I think this will work...
<?php
// check permissions
if (is_readable('ill_Instructions.pdf')) {
print "The file is readable.";
} else {
print "The file is NOT readable.";
}
function _countLinesInPDF($file) {
        if(file_exists($file)) { 
                        //open the file for reading 
            if($handle = @fopen($file, "rb")) { 
                $count = 0; 
                $i=0; 
                while (!feof($handle)) { 
                    if($i > 0) { 
                        $contents .= fread($handle,filesize($file)); 
                    } 
                    else { 
                          $contents = fread($handle, filesize($file)); 
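                        //In some pdf files, there is an N tag containing the number 
                        //of pages. This doesn't seem to be a result of the PDF version. 
                        //Caution: with the whole file now in $contents, this can also 
                        //match an /N entry that is not a page count (e.g. in an ICC 
                        //profile dictionary). 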
                        if(preg_match("/\/N\s+([0-9]+)/", $contents, $found)) { 
                            return $found[1]; 
                        } 
                    } 
                    $i++; 
                } 
                fclose($handle); 
  
                //get all the trees with 'pages' and 'count'. the biggest number 
                //is the total number of pages, if we couldn't find the /N switch above.                 
                if(preg_match_all("/\/Type\s*\/Pages\s*.*\s*\/Count\s+([0-9]+)/", $contents, $capture, PREG_SET_ORDER)) { 
                    foreach($capture as $c) { 
                        if($c[1] > $count) 
                            $count = $c[1]; 
                    } 
                    return $count;             
                } 
            } 
        } 
        return 0; 
}
// conditional: call the function once and echo the result
$pages = _countLinesInPDF('ill_Instructions.pdf');
if ($pages) {
	echo $pages;
} else {
	echo "No Lines in .PDF.";
}
?>

 
EMB01Commented:
Actually, the files may be too big, since you mentioned that the files you're using are graphics-heavy. I just read that fread can return after at most 8192 bytes in some cases (for example, after opening a userspace stream).

Ref. http://us.php.net/fread

We may have to try to find another function to use or something...
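
For what it's worth, the PHP manual describes the 8192-byte behavior for buffered streams that are not plain files, so a local PDF may not be affected at all. Either way, `file_get_contents()` reads a whole file in one call and removes the need to manage the read loop by hand. A minimal sketch, where `read_whole_file()` is a hypothetical helper name and the temporary file exists only for the demonstration:

```php
<?php
// Hypothetical helper: return the full contents of a local file,
// or false if it cannot be read.
function read_whole_file($file) {
    if (!is_readable($file)) {
        return false;
    }
    return file_get_contents($file); // reads the entire file in one call
}

// Demonstration with a temporary 20,000-byte file, larger than 8192 bytes.
$tmp = tempnam(sys_get_temp_dir(), 'pdf');
file_put_contents($tmp, str_repeat('x', 20000));
echo strlen(read_whole_file($tmp)); // prints the number of bytes read
unlink($tmp);
```

The returned string could then be passed to the same regular expressions the function already uses.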
 
EMB01Commented:
It works now! That's great. I'm glad I could help.
 
atgdesignAuthor Commented:
Thanks a million