Extracting text from a PDF document?

I'm building a small program that can extract all the text and metadata from a PDF document.

Speed is important, so preferably no OLE automation solutions.

What library or SDK do I need for this? Is this a big task?
kbb2 Asked:
 
jhance Commented:
Adobe's Acrobat can do it:

http://www.adobe.com/support/techdocs/1c356.htm

There are free/open source solutions:

http://research.compaq.com/SRC/virtualpaper/pstotext.html

Here is a company with a library you can link into your app:

http://www.totalint.com/products/developer/PDFextractor.asp
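
For the free/open source route, here is a minimal sketch of driving pstotext from a script. It assumes the pstotext command-line tool (and the Ghostscript it relies on) is installed and on the PATH, and that it writes the extracted text to standard output when no output file is given. The helper names build_command and extract_text are made up for illustration. This only covers the text, not the metadata.

```python
import shutil
import subprocess


def build_command(pdf_path, tool="pstotext"):
    """Return the argument list for the external converter.

    Assumption: the tool takes the input file as its only required argument
    and prints the extracted text to standard output.
    """
    return [tool, pdf_path]


def extract_text(pdf_path, tool="pstotext"):
    """Run the converter as a child process and return its stdout as a string."""
    if shutil.which(tool) is None:
        # Fail early with a clear message if the tool is not installed.
        raise FileNotFoundError(f"{tool} not found on PATH")
    result = subprocess.run(
        build_command(pdf_path, tool),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling extract_text("report.pdf") would then return the text of that (hypothetical) file as one string. Spawning a process per document has some overhead, so if speed really matters, a library you can link into the app directly is probably the better fit.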
 
kbb2 (Author) Commented:
Thanks! Just what I needed!
 
Axter Commented:
kbb2,
Could you please close this question by awarding jhance the points?
Question has a verified solution.
