How do I get PDF document (metadata) information into a database table?

maheflin25
We are looking at using ColdFusion to read the metadata from over 5,000 PDF files with the cfpdf tag and populate a database table with it. Is that doable, or is there a better solution?
From the docs:

getInfo action
    Use the getInfo action to extract information associated with the PDF document, such as the author, title, and creation date. You specify the name of the structure variable that contains the relevant data associated with the file, as the following code shows:
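A minimal sketch of that call, assuming a hypothetical file at c:\temp\example.pdf. Dumping the returned structure is a quick way to see which keys are available before deciding on table columns:

```cfml
<!--- c:\temp\example.pdf is a hypothetical path used only for illustration --->
<cfpdf action="getInfo" source="c:\temp\example.pdf" name="PDFInfo">

<!--- Show every key and value in the returned structure --->
<cfdump var="#PDFInfo#">
```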

So you just need to get a list of your PDFs using cfdirectory, then loop over it and call cfpdf for each file:

<cfdirectory action="list" directory="c:\temp\" filter="*.pdf" name="mypdfs">

<cfloop query="mypdfs">

<!--- read the metadata of the current file into the PDFInfo structure --->
<cfpdf action="getInfo" source="c:\temp\#mypdfs.name#" name="PDFInfo">
<cfoutput>
<p>#PDFInfo.title#</p>
<p>#PDFInfo.author#</p>
<p>#PDFInfo.keywords#</p>
<p>#PDFInfo.created#</p>
</cfoutput>

<!---
or do your insert query in the loop
<cfquery...>
insert into mytbl (title, author, keywords, created)
values (
<cfqueryparam value="#PDFInfo.title#" cfsqltype="cf_sql_varchar">,
<cfqueryparam value="#PDFInfo.author#" cfsqltype="cf_sql_varchar">,
<cfqueryparam value="#PDFInfo.keywords#" cfsqltype="cf_sql_varchar">,
<cfqueryparam value="#PDFInfo.created#" cfsqltype="cf_sql_varchar">
)
</cfquery>
 --->

</cfloop>
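With several thousand files, it may help to keep one unreadable PDF from stopping the whole run. The following is only a sketch of the insert variant with basic error handling. The datasource name mydsn, the table pdf_metadata and its columns, and the folder c:\temp\ are all hypothetical placeholders, and the created value is stored as text exactly as cfpdf returns it:

```cfml
<!--- Hypothetical names: datasource "mydsn", table "pdf_metadata", folder c:\temp\ --->
<cfset pdfDir = "c:\temp\">
<cfdirectory action="list" directory="#pdfDir#" filter="*.pdf" name="mypdfs">

<!--- Collects one message per file that could not be read or inserted --->
<cfset failed = arrayNew(1)>

<cfloop query="mypdfs">
    <cftry>
        <cfpdf action="getInfo" source="#pdfDir##mypdfs.name#" name="PDFInfo">

        <!--- cfqueryparam binds the values instead of concatenating them into the SQL --->
        <cfquery datasource="mydsn">
            INSERT INTO pdf_metadata (filename, title, author, keywords, created)
            VALUES (
                <cfqueryparam value="#mypdfs.name#" cfsqltype="cf_sql_varchar">,
                <cfqueryparam value="#PDFInfo.title#" cfsqltype="cf_sql_varchar">,
                <cfqueryparam value="#PDFInfo.author#" cfsqltype="cf_sql_varchar">,
                <cfqueryparam value="#PDFInfo.keywords#" cfsqltype="cf_sql_varchar">,
                <cfqueryparam value="#PDFInfo.created#" cfsqltype="cf_sql_varchar">
            )
        </cfquery>

        <cfcatch type="any">
            <!--- Record the failure and continue with the next file --->
            <cfset arrayAppend(failed, mypdfs.name & ": " & cfcatch.message)>
        </cfcatch>
    </cftry>
</cfloop>

<cfoutput>
<p>#mypdfs.recordCount - arrayLen(failed)# of #mypdfs.recordCount# files imported.</p>
</cfoutput>
<cfdump var="#failed#" label="Files that could not be processed">
```

Storing the file name alongside the metadata makes it possible to trace each row back to its PDF, and the failed array gives a list of files to inspect by hand afterwards.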



