Solved

Collating PDF form data

Posted on 2013-06-05
Last Modified: 2013-06-17
Hi All,

Please help!
I have created a PDF sign-off sheet for subcontractors to sign off the walls of each room in a building as they complete them. This all works fine.
The problem is that for one job there may be 30-40 sign-off sheets, and I need to be able to extract the date that each wall was signed off and collate it, preferably in an Excel sheet, but I would be willing to settle for another PDF or even MS Access. I just need it to work easily without too much user intervention.
The fields in the PDF are all labelled, e.g. there is SignA1, DateA1, Sign A2, Date A2 and so on.
I do not need to extract anything except the dates, as everything else can be set up as a template; the subcontractors and walls do not change throughout a job.
So just to be clear, I would need to extract the dates out of each of the 30-40 PDFs and have them on one sheet. Each PDF is named after a room or area, e.g. A1.16, A1.17.
This is quite an urgent matter, as the sign-off sheets are now being used and I need a quick and easy method to check what has and what hasn't been signed off.

Thought it might be worth mentioning that this was created in Bluebeam, not Adobe. I have found VBA that can extract data from PDF, but a full version of Adobe Acrobat is required for it to work and the user may not have it.

Thank you in advance for your help
Janine
Question by:RobJanine
2 Comments
 

Accepted Solution

by:
Karl Heinz Kremer earned 500 total points
ID: 39225549
Before I came to the last paragraph of your question, I would have suggested a solution based on Acrobat. So, let's start from scratch and see if I can come up with something that works outside of Acrobat. The free pdftk tool (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) allows you to extract form data from one document at a time. It's a command line tool, so you can write a batch script that will process all of your files in one run. The problem is that this will output the data in the FDF file format. Here is the command line you would use:

pdftk ./test.pdf generate_fdf output test.fdf
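The batch step could be sketched in Python roughly as follows. This is only an illustration: it assumes pdftk is installed and on the PATH, and the helper names `fdf_command` and `export_all` and the folder layout are made up for the example. It runs the same `generate_fdf` command once per PDF in a folder and names each FDF file after its PDF.

```python
from pathlib import Path
import subprocess


def fdf_command(pdf_path, out_dir):
    """Build the pdftk command line for one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # A1.16.pdf -> A1.16.fdf, so the room name survives in the file name.
    fdf_path = Path(out_dir) / (pdf_path.stem + ".fdf")
    return ["pdftk", str(pdf_path), "generate_fdf", "output", str(fdf_path)]


def export_all(pdf_dir, out_dir):
    """Run pdftk once for every *.pdf file found in pdf_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises an error if pdftk fails on a file.
        subprocess.run(fdf_command(pdf_path, out_dir), check=True)
```

Calling `export_all("sheets", "fdf")` would then leave one `.fdf` file per sign-off sheet in the `fdf` folder.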

You would then have to parse the FDF file to get the field contents (it's a pretty simple file format).
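As a rough illustration of that parsing step, here is a minimal Python sketch. It is not a full FDF parser: it assumes flat fields whose names (`/T`) and values (`/V`) are plain parenthesised strings, and it will not handle hierarchical fields, checkbox values or UTF-16 encoded strings. The function name `parse_fdf_fields` is made up for the example.

```python
import re

# Innermost "<< ... >>" dictionaries, i.e. ones with no dictionary nested inside.
_DICT = re.compile(r"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
# A literal string in parentheses, allowing backslash escapes.
_NAME = re.compile(r"/T\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)
_VALUE = re.compile(r"/V\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def parse_fdf_fields(fdf_text):
    """Return {field name: value} for the simple fields found in fdf_text."""
    fields = {}
    for body in _DICT.findall(fdf_text):
        name = _NAME.search(body)
        if name is None:
            continue  # not a field dictionary
        value = _VALUE.search(body)
        # A field that has not been filled in yet ends up as an empty string.
        fields[name.group(1)] = value.group(1) if value else ""
    return fields
```

FDF files can contain non-text bytes near the top, so reading them with something like `open(path, encoding="latin-1").read()` before passing the text to the function avoids decoding errors.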

If you want a more integrated solution, and you can program in Java (or any of the .NET languages), you could write your own application using the iText or iTextSharp PDF library:

http://itextpdf.com

Unfortunately, there is nothing that would do what you need out of the box. You need to do some programming to get the information out of the files.
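To give a sense of how small that programming can be for the collation step, here is a hedged Python sketch. It assumes the field values have already been extracted into one dictionary per room, and that the date fields all start with "Date" as in the question. The function name `write_date_sheet` is made up. It writes a CSV file, which Excel can open directly.

```python
import csv


def write_date_sheet(rooms, out_file, prefix="Date"):
    """Write one row per room and one column per date field.

    rooms maps a room name (e.g. "A1.16") to a {field name: value} dict.
    """
    # Every date field seen in any room, sorted as plain text.
    columns = sorted({name for fields in rooms.values()
                      for name in fields if name.startswith(prefix)})
    writer = csv.writer(out_file)
    writer.writerow(["Room"] + columns)
    for room in sorted(rooms):
        fields = rooms[room]
        # A missing or empty date shows up as a blank cell (not signed off yet).
        writer.writerow([room] + [fields.get(col, "") for col in columns])
```

A typical call would be `write_date_sheet(rooms, open("signoff.csv", "w", newline=""))`. Blank cells in the resulting sheet then show at a glance which walls have not been signed off.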

I've done projects like this for some of my customers, and if this is a one-time job, it may be easier (and faster) to just extract the data manually.

Hope this helps.
 

Author Comment

by:RobJanine
ID: 39233721
Hi, thanks for your help. As this will be used on every building project we do going forward, I think it best to get it right, so I will look into programming something.
Thanks