Nirvana

copy text in pdf and add to filename

I have some PDF files from which I have to copy the name of the city and add it to the file name.

In the attached file, "adelaide" has to be added to the file name.
Invoice_Template.pdf
Joe Winograd

I would do this by writing a program that calls the PDF Toolkit (PDFtk), an excellent (free) product that has numerous features to manipulate PDFs. It comes in both command line and GUI versions. The command line version is called PDFtk Server and may be downloaded here:
http://www.pdflabs.com/tools/pdftk-server/

Don't be misled by "Server" in the name. I don't know why they called it that, but it's just an executable (pdftk.exe, with a supporting DLL, libiconv2.dll) that runs on "regular" Windows, i.e., it does not have to run on a "server" Windows.

It has a command line operation called generate_fdf that generates an FDF file from a PDF. It is very easy to pull data from an FDF file because it is plain text with a simple structure. Here's a simplified line of code (from a program that I wrote for a client) that shows how to call it:

pdftk.exe "%PDFform%" generate_fdf output "%FDFfile%"

The variable %PDFform% contains the file name of your PDF form and %FDFfile% contains the file name of the generated FDF file. Then the program would get the value in the City field and do a file rename (although in the file you posted, "adelaide" is in the Address Line 2 field — I presume that's simply an error in filling out the form). I attached the FDF file (as a TXT file) that PDFtk generated from the file you posted.
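
As a rough illustration of the steps above (generate the FDF, read a field, rename the file), here is a minimal Python sketch. It is not the AutoHotkey program mentioned in this thread. The helper names (parse_fdf_fields, name_with_suffix, rename_pdf_by_field) and the default field name "City" are made up for the example, and the parser assumes each field appears in the FDF as a flat << ... >> block with /T (name) and /V (value) written as literal strings, so check it against your own FDF output before relying on it.

```python
import re
import subprocess
from pathlib import Path

# Matches an innermost << ... >> dictionary, assumed to be one form field.
FIELD_BLOCK = re.compile(r"<<([^<>]*)>>")
# Matches "/T (text)" or "/V (text)", allowing backslash-escaped characters.
KEY_STRING = r"/{key}\s*\(((?:\\.|[^\\()])*)\)"


def parse_fdf_fields(fdf_text):
    """Return {field name: value} for fields stored as literal strings."""
    fields = {}
    for block in FIELD_BLOCK.findall(fdf_text):
        name = re.search(KEY_STRING.format(key="T"), block)
        value = re.search(KEY_STRING.format(key="V"), block)
        if name and value:
            fields[name.group(1)] = value.group(1).strip()
    return fields


def name_with_suffix(pdf_path, suffix):
    """Build 'stem_suffix.pdf' in the same folder as the original file."""
    pdf_path = Path(pdf_path)
    # Replace anything that is not a letter, digit, underscore or hyphen.
    safe = re.sub(r"[^\w-]+", "_", suffix.strip()).strip("_")
    return pdf_path.with_name(f"{pdf_path.stem}_{safe}{pdf_path.suffix}")


def rename_pdf_by_field(pdf_path, field_name="City", pdftk="pdftk"):
    """Run pdftk generate_fdf, read one field, and rename the PDF."""
    pdf_path = Path(pdf_path)
    fdf_path = pdf_path.with_suffix(".fdf")
    subprocess.run(
        [pdftk, str(pdf_path), "generate_fdf", "output", str(fdf_path)],
        check=True,
    )
    # FDF files may contain non-UTF-8 bytes, so decode byte-for-byte.
    fields = parse_fdf_fields(fdf_path.read_text(encoding="latin-1"))
    value = fields.get(field_name, "")
    if not value:
        raise ValueError(f"field {field_name!r} is empty or missing")
    target = name_with_suffix(pdf_path, value)
    pdf_path.rename(target)
    return target
```

With PDFtk on the PATH, calling rename_pdf_by_field("Invoice_Template.pdf", field_name="Address Line 2") would, under these assumptions, rename the posted file to Invoice_Template_adelaide.pdf.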

If you'd like to see the full syntax for the PDFtk command line and some usage examples, here are the links:
http://www.pdflabs.com/docs/pdftk-man-page/
http://www.pdflabs.com/docs/pdftk-cli-examples/

You can write the program in any language that is able to make a command line call. The one that I mentioned above (that I wrote for a client) was coded in AutoHotkey, an excellent (free) macro/scripting/programming language. Regards, Joe
invoice_template_FDF.txt
Nirvana

ASKER

Hi Joe, thank you very much for the solution. The original file actually looks different from the file I shared earlier. I'm attaching the original, with some sensitive data blurred. In the attached file, the description contains "Melbourne"; that is what I have to pick up and add to the file name. Because these files are not in form format, I'm not sure how to pick it up.

BR
Uday
ASKER CERTIFIED SOLUTION
Joe Winograd
This solution is only available to members.
Nirvana

ASKER

Hi Joe, thank you for the solution. I have converted the image-only file, but the result is still in PDF format. I will download Xpdf and see if I can rename the file based on its content.

Joe Winograd

Hi Uday,
You're welcome. Yes, the conversion from an image-only PDF to a searchable PDF still leaves you with a file in PDF format. That's where Xpdf's PDFtoText comes in — it will create a plain text file with just the textual content from the PDF that you'll be able to search easily with a program and then use the text in the program to do a file rename. Regards, Joe
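
To make the last step concrete, here is a minimal Python sketch of the idea described above: run Xpdf's pdftotext on the searchable PDF, look for a city name in the resulting text, and rename the file. The helper names and the list of candidate cities are assumptions made up for this example. The thread does not say how the city appears in the description, so the sketch simply takes the known city that occurs earliest in the text.

```python
import re
import subprocess
from pathlib import Path


def find_city(text, cities):
    """Return the known city that appears earliest in text, or None."""
    best = None
    for city in cities:
        # Whole-word, case-insensitive match for this candidate city.
        match = re.search(rf"\b{re.escape(city)}\b", text, flags=re.IGNORECASE)
        if match and (best is None or match.start() < best[0]):
            best = (match.start(), city)
    return best[1] if best else None


def extract_text(pdf_path, pdftotext="pdftotext"):
    """Run pdftotext on a searchable PDF and return the extracted text."""
    pdf_path = Path(pdf_path)
    txt_path = pdf_path.with_suffix(".txt")
    subprocess.run([pdftotext, str(pdf_path), str(txt_path)], check=True)
    # latin-1 decoding never fails; adjust if pdftotext writes another encoding.
    return txt_path.read_text(encoding="latin-1")


def rename_pdf_by_city(pdf_path, cities, pdftotext="pdftotext"):
    """Append the city found in the PDF text to the PDF's file name."""
    pdf_path = Path(pdf_path)
    city = find_city(extract_text(pdf_path, pdftotext), cities)
    if city is None:
        raise ValueError(f"no known city found in {pdf_path.name}")
    suffix = city.replace(" ", "_")
    target = pdf_path.with_name(f"{pdf_path.stem}_{suffix}{pdf_path.suffix}")
    pdf_path.rename(target)
    return target
```

For example, rename_pdf_by_city("invoice.pdf", ["Adelaide", "Melbourne", "Sydney"]) would rename a file whose text mentions Melbourne to invoice_Melbourne.pdf, assuming pdftotext is on the PATH and the PDF already has a text layer. The file name invoice.pdf is only a placeholder.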