Solved

Collating PDF form data

Posted on 2013-06-05
Last Modified: 2013-06-17
Hi All,

Please help!
I have created a PDF sign-off sheet that subcontractors use to sign off the walls of each room in a building as they complete them. This all works fine.
The problem is that one job may have 30-40 sign-off sheets, and I need to extract the date each wall was signed off and collate the dates, preferably in an Excel sheet, although I would settle for another PDF or even MS Access. It just needs to work easily, without much user intervention.
The fields in the PDF are all labelled, e.g. SignA1, DateA1, Sign A2, Date A2 and so on.
I do not need to extract anything except the dates; everything else can be set up as a template, because the subcontractors and walls do not change during a job.
So, to be clear: I need to extract the dates from each of the 30-40 PDFs and have them on one sheet. Each PDF is named after a room or area, e.g. A1.16, A1.17.
This is quite urgent, as the sign-off sheets are already in use and I need a quick and easy way to check what has and hasn't been signed off.

It might be worth mentioning that the form was created in Bluebeam, not Adobe. I have found VBA that can extract data from a PDF, but it requires a full version of Adobe Acrobat, which the user may not have.

Thank you in advance for your help
Janine
Question by:RobJanine
2 Comments
 
Accepted Solution

by:
Karl Heinz Kremer earned 2000 total points
ID: 39225549
Before I came to the last paragraph of your question, I would have suggested a solution based on Acrobat... So, let's start from scratch and see if I can come up with something that works outside of Acrobat. The free pdftk tool (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) allows you to extract form data from one document at a time. It's a command line tool, so you can write a batch script that processes all of your files in one run. The catch is that it outputs the data in the FDF file format. Here is the command line you would use:

pdftk ./test.pdf generate_fdf output test.fdf
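
To run that command over a whole folder of sign-off sheets, a small script can loop over the PDFs. The following is only a sketch in Python rather than a batch file. It assumes pdftk is installed and on the PATH, and the helper names fdf_command and export_all are made up for this example.

```python
import subprocess
from pathlib import Path


def fdf_command(pdf_path):
    """Build the pdftk call that writes <name>.fdf next to <name>.pdf."""
    pdf_path = Path(pdf_path)
    fdf_path = pdf_path.with_suffix(".fdf")
    return ["pdftk", str(pdf_path), "generate_fdf", "output", str(fdf_path)]


def export_all(folder):
    """Run pdftk once for every PDF in the folder (assumes pdftk is on the PATH)."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(fdf_command(pdf), check=True)
```

Calling export_all on the job folder should leave one .fdf file beside each sign-off sheet, e.g. A1.16.fdf next to A1.16.pdf.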

You would then have to parse the FDF file to get the field contents (it's a pretty simple file format).
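
As a rough illustration of that parsing step, here is a minimal Python sketch that pulls field names (/T) and values (/V) out of an FDF file and writes the date fields to a CSV file that Excel can open. It rests on several assumptions: the form fields are flat (not nested), the values are plain literal strings rather than hex or UTF-16 strings, the date fields all start with "Date", and each .fdf file is named after its room. The function names are invented for this example.

```python
import csv
import re
from pathlib import Path

# One innermost "<< ... >>" dictionary (no nested dictionaries inside it).
_DICT = re.compile(r"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
# A PDF literal string "(...)": backslash escapes allowed, nested parentheses not.
_LITERAL = r"\(((?:\\.|[^\\()])*)\)"
_NAME = re.compile(r"/T\s*" + _LITERAL, re.DOTALL)
_VALUE = re.compile(r"/V\s*" + _LITERAL, re.DOTALL)


def parse_fdf_fields(fdf_text):
    """Return {field name: value} for every flat field in the FDF text."""
    fields = {}
    for body in _DICT.findall(fdf_text):
        name = _NAME.search(body)
        if name is None:
            continue  # not a field dictionary (e.g. the trailer)
        value = _VALUE.search(body)
        fields[name.group(1)] = value.group(1) if value else ""
    return fields


def date_fields(fields, prefix="Date"):
    """Keep only the fields whose name starts with the assumed prefix."""
    return {name: value for name, value in fields.items() if name.startswith(prefix)}


def collate(fdf_folder, out_csv):
    """Write one CSV row per date field: room (file name), field, date."""
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Room", "Field", "Date"])
        for fdf in sorted(Path(fdf_folder).glob("*.fdf")):
            text = fdf.read_text(encoding="latin-1")
            dates = date_fields(parse_fdf_fields(text))
            for field, value in sorted(dates.items()):
                writer.writerow([fdf.stem, field, value])
```

An empty Date column in the resulting sheet then means that wall has not been signed off yet. Forms with nested fields or non-ASCII values would need a proper FDF parser instead of regular expressions.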

If you want a more integrated solution, and you can program in Java (or any of the .NET languages), you could write your own application using the iText or iTextSharp PDF library:

http://itextpdf.com

Unfortunately, there is nothing that would do what you need out of the box. You need to do some programming to get the information out of the files.

I've done projects like this for some of my customers, and if this is a one time job, it may be easier (and faster) to just extract the data manually.

Hope this helps.
 

Author Comment

by:RobJanine
ID: 39233721
Hi, thanks for your help. As this will be used on every building project we do going forward, I think it is best to get it right, so I will look into programming something.
Thanks