Solved

Collating PDF form data

Posted on 2013-06-05
Last Modified: 2013-06-17
Hi All,

Please help!
I have created a PDF sign-off sheet for subcontractors to sign off the walls of each room in a building as they complete them. This all works fine.
The problem is that for one job there may be 30-40 sign-off sheets, and I need to be able to extract the date that each wall was signed off and collate it, preferably in an Excel sheet, but I would be willing to settle for another PDF or even MS Access. I just need it to work easily without too much user intervention.
The fields in the PDF are all labelled, e.g. SignA1, DateA1, SignA2, DateA2 and so on.
I do not need to extract anything except the dates, as everything else can be set up as a template; the subcontractors and walls do not change throughout a job.
So just to be clear, I need to extract the dates out of each of the 30-40 PDFs and have them on one sheet. Each PDF is named after a room or area, e.g. A1.16, A1.17.
This is quite an urgent matter, as the sign-off sheets are now being used and I need a quick and easy method to check what has and what hasn't been signed off.

Thought it might be worth mentioning that this was created in Bluebeam, not Adobe. I have found VBA that can extract data from a PDF, but a full version of Adobe Acrobat is required for it to work and the user may not have it.

Thank you in advance for your help
Janine
Question by:RobJanine

Accepted Solution

by:
Karl Heinz Kremer earned 500 total points
ID: 39225549
Before I came to the last paragraph of your question, I would have suggested a solution based on Acrobat. So, let's start from scratch and see if I can come up with something that works outside of Acrobat. The free pdftk tool (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) allows you to extract form data from one document at a time. It's a command line tool, so you can write a batch script that processes all of your files in one run. The problem is that it outputs the data in the FDF file format. Here is the command line you would use:

pdftk ./test.pdf generate_fdf output test.fdf
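
To run that command over a whole folder of sign-off sheets, something along these lines might do. It is only a sketch: it assumes pdftk is installed and on the PATH, that all the PDFs sit in one folder, and the function names (fdf_command, export_all) are made up for illustration.

```python
import subprocess
from pathlib import Path


def fdf_command(pdf_path, out_dir):
    """Build the pdftk command that writes one PDF's form data to an FDF file."""
    # A1.16.pdf becomes A1.16.fdf, so the room name survives in the file name.
    fdf_path = Path(out_dir) / (Path(pdf_path).stem + ".fdf")
    return ["pdftk", str(pdf_path), "generate_fdf", "output", str(fdf_path)]


def export_all(pdf_dir, out_dir):
    """Run pdftk once per PDF in pdf_dir and return the FDF paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        cmd = fdf_command(pdf_path, out_dir)
        # check=True raises an exception if pdftk exits with an error.
        subprocess.run(cmd, check=True)
        written.append(Path(cmd[-1]))
    return written
```

Calling export_all("sheets", "fdf") would then leave one .fdf file per sign-off sheet in the fdf folder (both folder names are placeholders).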

You would then have to parse the FDF file to get the field contents (it's a pretty simple file format).
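
As a rough idea of what that parsing could look like, here is a minimal sketch. It assumes each field appears in the FDF as a small "<< /V (value) /T (name) >>" entry with plain, flat field names, that the values do not contain "<<" or ">>", and that the date fields all start with "Date" as described in the question. Escaped characters in values are left as they are. The function names and CSV columns are my own invention, not anything pdftk defines.

```python
import csv
import re
from pathlib import Path

# One PDF literal string: "(...)" with backslash escapes allowed inside.
_STRING = r"\(((?:\\.|[^\\()])*)\)"
# Innermost "<< ... >>" dictionaries, which are the individual field entries.
_DICT = re.compile(r"<<((?:(?!<<|>>).)*)>>", re.S)
_NAME = re.compile(r"/T\s*" + _STRING, re.S)
_VALUE = re.compile(r"/V\s*" + _STRING, re.S)


def parse_fdf(text):
    """Return {field name: value} for the text fields found in FDF text."""
    fields = {}
    for entry in _DICT.finditer(text):
        name = _NAME.search(entry.group(1))
        value = _VALUE.search(entry.group(1))
        if name and value:
            fields[name.group(1)] = value.group(1)
    return fields


def collate_dates(fdf_dir, csv_path, prefix="Date"):
    """Write one CSV row per date field: room (from the file name), field, date."""
    rows = []
    for fdf_path in sorted(Path(fdf_dir).glob("*.fdf")):
        # FDF is a byte format; latin-1 decodes any byte without raising.
        fields = parse_fdf(fdf_path.read_text(encoding="latin-1"))
        for name in sorted(fields):
            if name.startswith(prefix):
                rows.append([fdf_path.stem, name, fields[name]])
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["room", "field", "date"])
        writer.writerows(rows)
    return rows
```

Running collate_dates("fdf", "signoff.csv") (placeholder names again) gives a CSV that opens directly in Excel. A row with an empty date column would be a wall that has not been signed off yet.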

If you want a more integrated solution, and you can program in Java (or any of the .NET languages), you could write your own application using the iText or iTextSharp PDF library:

http://itextpdf.com

Unfortunately, there is nothing that would do what you need out of the box. You need to do some programming to get the information out of the files.

I've done projects like this for some of my customers, and if this is a one-time job, it may be easier (and faster) to just extract the data manually.

Hope this helps.

Author Comment

by:RobJanine
ID: 39233721
Hi, thanks for your help. As this will be used on every building project we do going forward, I think it is best to get it right, so I will look into programming something.
Thanks
