?
Solved

Adobe PDF Data to MS Access

Posted on 2013-12-05
2
Medium Priority
?
335 Views
Last Modified: 2013-12-25
This pdf file has two sections I would like to inquire about a solution to place the data into a database.

In section 2 it has product and ingredient info.  Because of this I am thinking this would make three tables in access.  

Table 1. Header info
Table 2. Product Info
Table 3. Product ingredient Info


Please advise with any ideas.  What I am trying to accomplish is importing hundreds of pdf files in this format into a database to do analysis rather than typing them in.

Any ideas will be appreciated.

Thanks
Test-Doc.pdf
0
Comment
Question by:MPDenver
2 Comments
 
LVL 85

Assisted Solution

by:Scott McDaniel (Microsoft Access MVP - EE MVE )
Scott McDaniel (Microsoft Access MVP - EE MVE ) earned 800 total points
ID: 39700618
You can do this manually by opening the PDF in Adobe Reader and Saving as Text, then import that Text file into Access. If you have hundreds of PDFs, however, that would be a time consuming task.

You cannot do this natively in Access VBA, unfortunately. You must use a 3rd party tool, like the Adobe SDK:

http://www.adobe.com/devnet/acrobat.html

There are also some other third party tools that might work:

http://www.foolabs.com/xpdf/download.html

xPDF converts a PDF to Text, so you could loop through all the files in a folder, convert the .PDF to .TXT and then use TransferText to move the data into your database.

You'd first create a Batch file that calls the xPDF utility, and then "run" that batch file for each PDF. The batch file would look like this:

@echo off
pdftotext.exe -layout %1

You'd then loop through the files, and call that batch file (let's name it ConvertPDF.bat):

Dim sFile As String
sFile = Dir("YOur Folder Directory here", "*.PDF")

Do Until Len(sFile) = 0
  ConvertPDF sFile
  Dir
Loop

Open in new window


After running this, you'll have a directory full of .txt files, so do the same basic thing:

Dim sTxt As String
sTxt = Dir("Your Folder Directory, "*.txt")

Do Until len(sTxt) = 0
  DoCmd.TransferText acImport, , "YourTable", sTxt
  Dir
Loop

See here for more info on TransferText: http://msdn.microsoft.com/en-us/library/office/ff835958.aspx

One caveat - xPDF can take a few moments to convert a PDF to Text, so you might want to put a "wait" function in. To do that:

Public Function Wait(NumSeconds As Integer)
  On Error Resume Next
  Dim dStop as Date
  dStop = DateAdd("s", NumSeconds, Now)

  Do While dStop < Now
     DoEvents
  Loop
End Function

Open in new window


You'd then call it like this:

Do Until Len(sFile) = 0
  ConvertPDF sFile
  Wait 5 '/ wait for 5 seconds
  Dir
Loop

Finally, you could combine the xPDF and TransferText, if you'd like, but that's another exercise ...
0
 
LVL 44

Accepted Solution

by:
Karl Heinz Kremer earned 1200 total points
ID: 39707458
Do you have Adobe Acrobat? If so, you can automate it via Access VBA. I wrote a few tutorials that illustrate how you would integrate Acrobat via VBA (and JavaScript):

http://khkonsulting.com/2009/03/adobe-acrobat-and-vba-an-introduction/
http://khkonsulting.com/2009/03/acrobat-javascript-and-vb-walk-into-a-bar/
http://khkonsulting.com/2010/09/reading-pdf-form-fields-with-vba/

You would need the Acrobat SDK (which in this case is just documentation), which you can download for free from here: http://www.adobe.com/devnet/acrobat.html
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Microsoft Access has a limit of 255 columns in a single table; SQL Server allows tables with over 255 columns, but reading that data is not necessarily simple.  The final solution for this task involved creating a custom text parser and then reading…
Windows Explorer lets you open cabinet (cab) files like any other folder. In VBA you can easily handle normal files and folders, but opening and indeed creating cabinet files takes a lot more - and that's you'll find here.
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…

571 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question