Scott Fell

Bates Stamp: Programmatically Add Unique Numbers To Footer Of A PDF

I want to generate a unique number for every page of a PDF file.

In pseudocode it would look something like

next_number = 393456
job_code = "abc"
file = "some_document.pdf"
total_pages = pageCount(file)

For each page in file
     next_number = next_number +1
     addFooter(job_code + " "+next_number)
next
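
As a rough, runnable version of the pseudocode above, the label sequence could be generated like this in Python. The function name `bates_labels` and its parameters are made up for illustration, and actually drawing each label on its page is left to whichever PDF tool ends up being used.

```python
def bates_labels(job_code, last_number, page_count):
    """Return one footer label per page, continuing after last_number.

    Mirrors the pseudocode: the counter is incremented before each page
    is labeled, so the first page gets last_number + 1.
    """
    return [f"{job_code} {last_number + i}" for i in range(1, page_count + 1)]


# Three-page document, continuing from 393456 as in the question.
labels = bates_labels("abc", 393456, 3)
print(labels)
```

The last number used (`last_number + page_count`) would need to be stored somewhere so the next document continues the sequence.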

Most likely this would run from the command line (Windows). So far, what I have found is pdftk: https://www.pdflabs.com/docs/pdftk-man-page/. Reading the docs, it looks like I need to first find out how many pages are in the PDF, then generate a blank PDF of the same length that contains only a footer, with the unique "next_number" on each page, and finally merge that document with the one I want to add the footer to.
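
A minimal sketch of how those pdftk steps might be driven from a script. `dump_data` and `multistamp` are operations listed in the pdftk man page. The function name and all file names here are placeholders, and the overlay PDF is assumed to have been created by some other tool.

```python
def pdftk_commands(source_pdf, overlay_pdf, output_pdf, report_txt="data.txt"):
    """Build the two pdftk invocations as argument lists (no shell quoting needed)."""
    # Step 1: dump metadata, including the page count, to a text file.
    dump = ["pdftk", source_pdf, "dump_data", "output", report_txt]
    # Step 2: stamp page N of the overlay onto page N of the source.
    stamp = ["pdftk", source_pdf, "multistamp", overlay_pdf, "output", output_pdf]
    return dump, stamp


dump_cmd, stamp_cmd = pdftk_commands("some_document.pdf", "overlay.pdf", "stamped.pdf")
print(" ".join(dump_cmd))
print(" ".join(stamp_cmd))
```

With pdftk installed, each list can be passed to `subprocess.run(cmd, check=True)`.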

What I am looking for are other command line tools, hopefully free, that let me add something unique to every page programmatically. Are there better options than pdftk?
Bill Prew

I can't personally speak to it, but a tool I have heard Joe Winograd mention in prior posts is below. Might be worth a look...

https://www.tracker-software.com/product/pdf-tools


»bp
Scott Fell

ASKER

Thanks Bill. I need to be able to use this via code. I looked at the site and the SDKs they offer are in the $1000+ range. I know there are options that cost less or are even free.

I found some more info on https://www.pdflabs.com (pdftk) and it may fit the bill. I am reviewing that now. As an example, I found some info here https://www.autohotkey.com/boards/viewtopic.php?t=62393 and http://hildstrom.com/projects/bates-number-a-pdf/. Both do something similar. One happens to use AutoHotkey and the other is a C program, and both use pdftk as the engine.
Hi Scott,

I have a proposal for you. I've written many programs in the PDF space. I have a developer license (btw, expensive!) for an excellent, commercial PDF library that my programs call. Most of my programs in the PDF space are for automation, typically operating on an entire folder of PDF files, usually with an option to recurse into subfolders to an unlimited depth, although there's often an option to operate on an individual file (or several selected files). Here's an EE thread with an example of one such program:
https://www.experts-exchange.com/questions/29052626/Is-there-a-way-to-count-pages-in-a-group-of-PDF's-and-create-a-report-in-a-separate-PDF.html

I'm building a portfolio of programs that I plan to sell in the future. I often get ideas for these programs from posts here at EE, such as the one shown above and this one from you. So, I'm thinking that a program that is able to place a Bates number in the header or footer of every page in a PDF file would be a nice addition to my portfolio.

Now to the details of my proposal:

(1) The program will work only in Windows, probably just W7, W8.1, and W10 (32-bit and 64-bit), but maybe XP and/or Vista.

(2) No Adobe software, or any third-party software, will be required, i.e., it will be a stand-alone, self-contained program.

(3) It will have a Graphical User Interface (GUI) and a Command Line Interface (CLI). The latter will be suitable for calling from a batch file, command prompt, program, script, etc.

(4) The program will come with a standard Windows installer (a Setup.exe file) to install the program. It will also have a standard uninstaller that will cleanly uninstall it via Control Panel>Programs and Features.

(5) It will come with a Quick Start Guide along the lines of the one posted at the EE thread mentioned above.

(6) I will not provide source code. The only deliverables will be the installer (a Setup.exe file) and the Quick Start Guide (a PDF file).

(7) I will provide you with a free license for the program. It will be for your personal use only. Among other provisions in its EULA that you'll have to agree to when installing, you may not redistribute it. In return for the free license, you agree to (a) help in design by suggesting specifications for its features and functions; (b) test the program during its development; and (c) provide feedback until it is debugged...so that I may add it to my ready-for-sale portfolio. :)

Does this proposal interest you? Regards, Joe
Joe, thank you for the offer and I will always be happy to help guide you on end user needs.

I am looking for an option to program this myself, which is why I gave some pseudocode that explains my needs. There are already some open source and paid options that I have found. With pdftk, I couldn't wrap my head around how to use it at first. Namely, you have to create a second PDF file to overlay on top of the original, and pdftk itself cannot create a PDF from scratch. The workaround is to create a single-page PDF and then insert pages as needed for the overlay. Finding out how many pages you need is as simple as
pdftk out.pdf dump_data output data.txt

You end up with a text file of easy-to-parse information that shows the number of pages and the size and orientation of each page.
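
A sketch of parsing that text file in Python. The key names (`NumberOfPages`, `PageMediaBegin`, `PageMediaNumber`, `PageMediaRotation`, `PageMediaDimensions`) are what pdftk 2.x is expected to emit, and the sample text is made up, so check both against your own data.txt. The function name `parse_dump_data` is hypothetical.

```python
SAMPLE = """\
InfoBegin
InfoKey: Producer
InfoValue: Example Scanner
NumberOfPages: 2
PageMediaBegin
PageMediaNumber: 1
PageMediaRotation: 0
PageMediaRect: 0 0 612 792
PageMediaDimensions: 612 792
PageMediaBegin
PageMediaNumber: 2
PageMediaRotation: 270
PageMediaRect: 0 0 612 792
PageMediaDimensions: 612 792
"""


def parse_dump_data(text):
    """Return (page_count, pages); pages holds one dict per PageMedia record."""
    page_count = None
    pages = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if not sep:
            # Marker lines such as PageMediaBegin carry no value.
            if key.strip() == "PageMediaBegin":
                pages.append({})
            continue
        if key == "NumberOfPages":
            page_count = int(value)
        elif pages and key == "PageMediaNumber":
            pages[-1]["number"] = int(value)
        elif pages and key == "PageMediaRotation":
            pages[-1]["rotation"] = int(value)
        elif pages and key == "PageMediaDimensions":
            width, height = value.split()
            pages[-1]["size"] = (float(width), float(height))
    return page_count, pages


count, pages = parse_dump_data(SAMPLE)
print(count, pages)
```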

In the end, this question only touches on part of what I am actually working on. Linking and updating to a data source is the other part.

I may just piece some questions together as I get stuck and use pdftk.  

My question here is, what other alternatives are available that have a Windows command line interface, API, or DLL?
ASKER CERTIFIED SOLUTION
Bill Prew

This solution is only available to members.
> I will always be happy to help guide you on end user needs.

Thanks...much appreciated!

> pdftk

I used pdftk a lot in the past...a search here at EE in My Personal KnowledgeBase for pdftk gets 89 hits. :) One of the most common uses is dump_data and it does work well, although for page count, I prefer the Xpdf tool called PDFinfo.

> Namely you have to create a second pdf file to overlay on top of the original.

Yes, I presume you're talking about the background/multibackground and stamp/multistamp operations. Problem is, you would have to create a gazillion PDFs, each with the unique page number to use as the overlay (well, not a gazillion, but the maximum page number that you'll need). That's why a "real" Bates feature is the way to go.

> what other alternatives are available that have at a windows command line api or dll.

Here are two ideas for you:

(1) Coherent PDF Community Release
http://community.coherentpdf.com/

This is the free edition. Note this bullet point there:
• Stamp logos, text, dates, page numbers
It looks promising, but I don't know if the Community version will do what you want, because this is the commercial library that I mentioned above...the one that I now use in my programs (and I've never used the free one):
https://www.coherentpdf.com/

Here's the price list:
https://www.coherentpdf.com/prices-2020.pdf

Note this in there:
Developer License

A developer license allows you to build our software royalty-free into the back end of your own software which will be accessed by an unlimited number of people. It also includes a free site license for your own use. The cost is $8500.

(2) Sejda:
http://sejda.org/

Note this comment:
Adds header or footer text with page numbers, text labels or bates numbering to PDF documents
As you can see there, the Open Source version is free.

Also, scroll down to see the download for this:
Try sejda-console, our open source command line interface
I've played with Sejda over the years but never used it for real, so I don't know if it can do what you want...worth a spin, though. Regards, Joe
I was playing with C code and C#. It looks like I am going to go with PDFsharp.
As an update, I have since played with about 13 different libraries, creating a simple test app with each one to open a PDF, add content, and save it as a new file. There are a lot of options in this space, and some are easier to code against than others. Another factor in the choice is licensing. The paid products range in price from $300 to $3000+. Most have an open source option with either limiting restrictions or a GPL/LGPL/AGPL license that I was not keen on. PDFsharp was very enticing because it has an MIT license, and the functionality I need does not require many bells and whistles as far as PDF options go.

I did run into a glitch with PDFsharp when adding content to a page that is rotated 270 degrees. This happens with a multifunction device that allows you to scan portrait or landscape. Scanning an 8.5 x 11 page in portrait or landscape outputs the same landscape page. Visually they look the same, but underneath, the PDF page is marked as rotated 270. With PDFsharp, the added content comes out rotated by what appears to be -90, whereas other libraries added the new content as expected. It took some trial and error to get the rotation back where it should be in that scenario.

With all that said, PDFsharp is what I will be using for this app.
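
For anyone hitting the same rotated-page glitch, here is a small sketch of the geometry involved. It is generic PDF coordinate math, not PDFsharp's API, and the function name `to_page_space` is made up. A page's /Rotate value is a clockwise display rotation, so a footer position chosen on the page as displayed has to be mapped back into the unrotated page's coordinates, and the text has to be counter-rotated to read upright.

```python
def to_page_space(width, height, rotate, u, v):
    """Map a point (u, v) on the page as displayed to unrotated page coordinates.

    width and height are the unrotated page size, rotate is the page's /Rotate
    value (clockwise degrees), and (u, v) is measured from the bottom-left of
    the page as displayed. Returns (x, y, text_angle), where text_angle is the
    counter-clockwise rotation that makes text read upright on screen.
    """
    rotate %= 360
    if rotate == 0:
        return u, v, 0
    if rotate == 90:
        return width - v, u, 90
    if rotate == 180:
        return width - u, height - v, 180
    if rotate == 270:
        return v, height - u, 270
    raise ValueError("/Rotate must be a multiple of 90")


# A 612 x 792 page marked /Rotate 270 displays as 792 x 612 landscape.
# Footer centered horizontally, 20 points above the displayed bottom edge:
print(to_page_space(612, 792, 270, 396, 20))
```

How the angle and position get applied depends on the library, so the returned values are only the inputs to whatever transform call it offers.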

I was really looking for some code to build on what I have in my question, or at least a start of something. Over the last couple of weeks of playing with all the different libraries, I was able to figure out the coding on my own in several different languages.

A quick comment on the libraries suggested in this thread.

https://itextpdf.com/en/products/itext-7 - The open source license is AGPL, which is too restrictive for what I want to do. As for the commercial license, when you go to the site and they do not show any pricing, that is a good warning sign. They wanted something like $1400 for a developer license plus another $2800 for the end user license. There are other licensing options. If you are going to develop an actual open source project, it is a good option. If you are developing for enterprise, I am sure the amount of money they are asking is small. But for in between, this does not fit.

https://www.pdflabs.com/ - pdftk started out promising. The free version is GPL and the commercial version is $1000. That seems a bit high for limited functionality. I would have liked to see this in the $200 to $300 range.

https://www.tracker-software.com the commercial license for server was about $5k - no.

https://www.coherentpdf.com - Open source is LGPL and the commercial license at $500 is ok, but that is for each server and the license seemed limiting.

http://www.pdfsharp.net/ - You can't beat an MIT license. The cost, however, is that the documentation is not clear and takes a lot of hunting around the code or forums. Once you get past the learning curve, for my needs it was worth the little extra effort.