PDF Fields

I want to read the fields in a PDF file. Is there an Adobe component, or other component, that I can reference in Access/VBA and/or in VB.NET to reliably read the PDF document fields, with a view to extracting the data and updating a database?

I appreciate that Adobe components may require purchase of their products, but I need to know which one. I would prefer not to use third-party software, as this would require updates/licensing, and I will be supplying to a company that already has Adobe PDF Writer installed.


Asked by: kokane

Karl Heinz Kremer commented:
Here is an example I just did with Word: I created a button, and in the button handler I read two values from a PDF file. The program uses the JSO (the JavaScript Object; you can read more about it in one of my blog posts: http://www.khk.net/wordpress/2009/03/11/acrobat-javascript-and-vb-walk-into-a-bar/).


Private Sub CommandButton1_Click()
    Dim AcroApp As Acrobat.CAcroApp
    Dim theForm As Acrobat.CAcroPDDoc
    Dim jso As Object
    ' Give each variable its own type; "Dim text1, text2 As String" would leave text1 a Variant
    Dim text1 As String, text2 As String

    Set AcroApp = CreateObject("AcroExch.App")
    Set theForm = CreateObject("AcroExch.PDDoc")
    theForm.Open "C:\temp\sampleForm.pdf"
    Set jso = theForm.GetJSObject

    ' Get the information from the form fields Text1 and Text2
    text1 = jso.getField("Text1").Value
    text2 = jso.getField("Text2").Value

    MsgBox "Values read from PDF: " & text1 & " " & text2
    theForm.Close

    ' Shut down Acrobat and release the object references
    AcroApp.Exit
    Set AcroApp = Nothing
    Set theForm = Nothing

    MsgBox "Done"
End Sub

Karl Heinz Kremer commented:
You can use the Acrobat IAC (Interapplication Communication) interface to read form fields in your VB/VBA or .NET application. Take a look at http://www.adobe.com/devnet/acrobat for more information. If you want a free third-party component, try the .NET version of iText: http://sourceforge.net/projects/itextsharp/

kokane (author) commented:
Thanks khkremer, that gives me a healthy nudge in the right direction.

However, I would like more specific information regarding the Adobe component that would allow me to read PDF fields in VBA/VB.NET. Specifically, I need to know which products it ships with, so I can check whether the customer I want to supply the software to already has it installed (he already has a PDF writer), and also because I want to be able to download it and trial it with Access VBA and/or VB.NET. The component mentioned in the answer was the Acrobat IAC interface, and I was directed to the Acrobat website for further information rather than given the specific information I require. There seems to be a real shortage of anyone saying this method is successful when I look on Google, which leads me to believe that it may work in theory rather than in practice. I am also hoping to use VB.NET Express 2008 or VBA, and most references seem to be about problems within Visual Studio and to be out of date.

I find it very surprising that I can't find a single working example on the web, which leads me to think the third-party route is the only viable one, and iTextSharp, for example, seems to have many posts suggesting compatibility problems.

Karl Heinz Kremer commented:
It's part of Adobe Acrobat. You can find the API here: http://livedocs.adobe.com/acrobat_sdk/9.1/Acrobat9_1_HTMLHelp/IAC_API_OLE_Objects.103.1.html (click on the little button in the upper left corner to show the navigation pane).

Let me see if I can whip up an example for you. So, just to recap, you want to read a form field in VBA. I don't use Access, but I can give you a Word or Excel example.

kokane (author) commented:
khkremer,

That is excellent, thanks. It looks very doable. Presumably it is helpful if the field names are meaningful, or are they just named Text1-n sequentially?

My only remaining problem is to work out which Acrobat product I have to buy. I asked Adobe technical support about this some time ago, and the guy first insisted that there was no such thing and then directed me to the help pages. Ideally I would like to download a trial version and test it rather than buying it and finding there is a problem. I don't like to be greedy, but any ideas? I'm off now to have a look.

Karl Heinz Kremer commented:
You can name the fields whatever you want - of course, having meaningful names is better than just using GUIDs for example.

You need Adobe Acrobat - the full version (either Standard, Pro or Pro Extended). The free Reader does not provide these APIs.

There is a 30-day free trial for Adobe Acrobat Pro and Pro Extended at http://www.adobe.com/products/acrobat/ - just click on the "Try" button below the product image.

kokane (author) commented:
khkremer,

Thanks again. I was just returning to say I was downloading the Pro version, having read your previous article on the topic. I'm developing a warm feeling that this fecking thing is going to work.

Karl Heinz Kremer commented:
Let me know if you run into problems. As I said, I am not familiar with Access, but I can help you with anything else.

kokane (author) commented:
I have downloaded it, and the code now compiles in Access and Excel (2003), but I get an error when reading the field. I checked the PDF document to get the name of one field, in this case 'WB1.2.Title', and I am using: Field1 = jso.getField("WMB1.2.Title").Value. I have the references Adobe Acrobat 9.0 Type Library and Access 3.0 Type Library selected.

The error is: Run-time error '91': Object variable or With block variable not set. It is the same in both Excel and Access. The help text describes the cause as "You attempted to use an object variable that isn't yet referencing a valid object."

It does not complain about the line "Set jso = theForm.GetJSObject", which seems strange given the error message.

Is there some reference to the JavaScript that I am missing in the VBA? Alternatively, is there a simple command other than getField that will tell me whether it can do anything? Perhaps it is just this type of command that is not working.

I will try tomorrow on my other machine, which has Office 2007, but someone on your website said they had got it (or something similar) working with 2003.

Karl Heinz Kremer commented:
Is this a Designer form?

kokane (author) commented:
"Is this a Designer form?"

Sorry, I am not sure what that is. A type of PDF document? I know they use it to collect company information, and they have a PDF writer.

Karl Heinz Kremer commented:
There are two different form systems built into Acrobat: AcroForms and XFA (Designer) forms. Bring up the document information (Ctrl-D) and see what application created the document.

kokane (author) commented:
PDF Producer: Acrobat Distiller 9.0.0 (Macintosh)
PDF Version: 1.6 (Acrobat 7.x)

Karl Heinz Kremer commented:
Would you be able to share the file with me? It looks like an AcroForm, so it should work with the code you have.

kokane (author) commented:
I would prefer it not to be publicly viewable. Is there any other means of sending it? Perhaps I can amend it.

I just tried it with another PDF (template1, which comes with version 9) and it worked first time; then I updated the field and tried it again, and it failed with the same error as before. I will retry that in case I did something silly.

kokane (author) commented:
Yes, exactly the same error when I try to read an Adobe file (template1) that has been updated by me, but fine if not updated. I will try another field in the updated document to see whether it is the field or the fact that the document has been updated.

kokane (author) commented:
Apologies, I just tried that again and am successfully reading the updated file. It is half 2 here in the UK and I am beginning to flag.

Is there any way I can send you the file privately?

kokane (author) commented:
I have now realised that it is a folder problem: I was pointing it to the wrong place. Apologies, and thanks for your help.
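
A wrong path is consistent with the run-time error '91' reported earlier: if the document never opens, the first call on the JavaScript object has nothing to work on. The following is a minimal sketch of a more defensive read, not a definitive implementation. ReadPdfField is a hypothetical helper name, the custom error numbers are arbitrary, and the sketch assumes that GetJSObject and getField come back to VBA as Nothing when they fail. It needs a reference to the Adobe Acrobat type library and a full Acrobat installation.

```vba
' Hypothetical helper (not part of the Acrobat API): reads one field value
' and raises a descriptive error instead of a bare run-time error 91.
Public Function ReadPdfField(ByVal pdfPath As String, ByVal fieldName As String) As String
    Dim pdDoc As Acrobat.CAcroPDDoc
    Dim jso As Object
    Dim fld As Object

    ' Fail early if the path is wrong, rather than at the first field access
    If Len(Dir$(pdfPath)) = 0 Then
        Err.Raise vbObjectError + 1, "ReadPdfField", "File not found: " & pdfPath
    End If

    Set pdDoc = CreateObject("AcroExch.PDDoc")

    ' Open returns True on success and False otherwise
    If Not pdDoc.Open(pdfPath) Then
        Err.Raise vbObjectError + 2, "ReadPdfField", "Acrobat could not open: " & pdfPath
    End If

    ' Assumption: a failed GetJSObject yields Nothing
    Set jso = pdDoc.GetJSObject
    If jso Is Nothing Then
        pdDoc.Close
        Err.Raise vbObjectError + 3, "ReadPdfField", "No JavaScript object for: " & pdfPath
    End If

    ' Assumption: an unknown field name yields Nothing (JavaScript null)
    Set fld = jso.getField(fieldName)
    If fld Is Nothing Then
        pdDoc.Close
        Err.Raise vbObjectError + 4, "ReadPdfField", "No such field: " & fieldName
    End If

    ReadPdfField = CStr(fld.Value)
    pdDoc.Close
End Function
```

A call such as ReadPdfField("C:\temp\sampleForm.pdf", "Text1") would then either return the value or report which step failed, which also helps to catch a mistyped field name.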

kokane (author) commented:
Please reply to this so I can award you the points - and thanks again.

Karl Heinz Kremer commented:
See my profile page, there is an email address you can use to send the file to me.

kokane (author) commented:
khkremer:

As a supplementary question (greedy type, me): any idea how you can extract ALL the field names in a PDF document? This would remove the requirement to identify them first, and you could simply move the values to corresponding database fields, which you could create dynamically.
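
For reference, the Acrobat JavaScript document object exposes numFields and getNthFieldName, and both can be reached through the same JSO used in the accepted code. The following is a rough sketch under that approach, not something tested against this particular form. ListPdfFields is a hypothetical name, and which names are reported for hierarchical fields such as 'WB1.2.Title' is not confirmed here.

```vba
' Hypothetical helper: prints every field name in the PDF to the Immediate window.
' Requires a reference to the Adobe Acrobat type library and a full Acrobat install.
Public Sub ListPdfFields(ByVal pdfPath As String)
    Dim pdDoc As Acrobat.CAcroPDDoc
    Dim jso As Object
    Dim i As Long

    Set pdDoc = CreateObject("AcroExch.PDDoc")
    If Not pdDoc.Open(pdfPath) Then
        MsgBox "Could not open " & pdfPath
        Exit Sub
    End If

    Set jso = pdDoc.GetJSObject

    ' Field indexes are zero-based
    For i = 0 To jso.numFields - 1
        Debug.Print jso.getNthFieldName(i)
    Next i

    pdDoc.Close
End Sub
```

Each name could then be passed to getField, as in the accepted code, to read its value.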

kokane (author) commented:
Moderator,

I did not email to resolve the issue. The confusion has arisen because I mis-implemented a previous answer I was given and only realised after a number of other messages were posted. Is it acceptable that I now seek additional information on my original question AFTER I have selected the answer from above?

Karl Heinz Kremer commented:
SouthMod - I don't think we've met before. As you can see, I've been doing this for quite some time, and I think I know how EE works. Sometimes it is necessary to have access to a file that the asker does not want to make available publicly (for a number of reasons), and in such cases I offer to receive such a file in a private email to me. That does not mean that the question would be resolved in private email - it's just used to provide information that is required to resolve the issue, and the remainder of the resolution would still be posted on EE (e.g. as in "you've used the wrong field name when trying to extract data from the document, replace xyz with abc and it will work"). Also, I don't understand why you are under the impression that kokane answered his/her own question. As a moderator stepping into a long exchange of information, I expect you to read the whole exchange and understand how we ended up here. I did clearly provide a sample script that the asker used to validate that Acrobat can actually be controlled via VBA (in http:#a33743786).

Karl Heinz Kremer commented:
kokane - I think you should accept the comment that has the code snippet as the answer to this question. The discussion we had afterwards was about getting Acrobat and does not really provide any more information.

As far as your related question goes, I would post a second question, and reference this question in it. This way, if somebody else wants to participate, they will have all the information.

Hope that helps.

Karl Heinz Kremer commented:
Now that the question is closed, there is a link to "ask a related question" in the comment box, use that to create your new question.
Thanks.

kokane (author) commented:
SouthMod,

"Which seemed to indicate that the Author had resolved their issue outside of EE assistance. Of course, if I'm wrong, then the author is free to choose whatever method is appropriate to close the matter."

I obviously should have made that clearer. I had not correctly implemented the solution given earlier by khkremer. As soon as I realised this, I sent the message explaining that I was looking in the wrong folder.

On another point: I was advised via the online help to make a request for assistance. I now understand I should simply have added another comment to my question. Can you close the request down, please, or tell me how to do it?

Question has a verified solution.