Solved

Extracting text from a PDF document?

Posted on 2002-04-09
Medium Priority
240 Views
Last Modified: 2010-04-02
I'm building a small program that can extract all the text and metadata from a PDF document.

Speed is important, so preferably no OLE automation solutions.

What library or SDK do I need for this? Is this a big task?
Question by:kbb2
3 Comments
 
LVL 32

Accepted Solution

by:
jhance earned 400 total points
ID: 6927832
Adobe's Acrobat can do it:

http://www.adobe.com/support/techdocs/1c356.htm

There are free/open source solutions:

http://research.compaq.com/SRC/virtualpaper/pstotext.html

Here is a company with a library you can link into your app:

http://www.totalint.com/products/developer/PDFextractor.asp
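For the command-line route, one option is to launch the tool as a child process and read its standard output. The following is only a minimal sketch: `run_and_capture` and `extract_pdf_text` are hypothetical helper names, and it assumes a `pstotext` binary is on the PATH and prints the extracted text to stdout when given a file name. It uses POSIX `popen`; on Windows the corresponding calls would be `_popen` and `_pclose`.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Runs a shell command and returns everything it wrote to stdout.
// POSIX only as written; Windows would use _popen/_pclose instead.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("could not start: " + command);
    }
    std::string output;
    char buffer[4096];
    size_t n;
    // Read the child's stdout in chunks until end of stream.
    while ((n = fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        output.append(buffer, n);
    }
    pclose(pipe);
    return output;
}

// Hypothetical wrapper: assumes `pstotext <file>` prints the text to stdout.
// The quoting here is naive and does not escape quotes inside pdf_path.
std::string extract_pdf_text(const std::string& pdf_path) {
    return run_and_capture("pstotext \"" + pdf_path + "\"");
}
```

Starting a new process for every document adds overhead, so if speed matters a library linked directly into the application is likely the better fit.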

Author Comment

by:kbb2
ID: 6927852
Thanks! Just what I needed!
LVL 30

Expert Comment

by:Axter
ID: 6927877
kbb2,
Could you please close this question by awarding jhance the points?