Solved

Parsing the .pdf file

Posted on 2005-03-29
Last Modified: 2010-04-17
How can I parse a .pdf file and extract its text using C++?
Are there any libraries or components available?
Question by:kanthikumarp
4 Comments
 
LVL 10

Accepted Solution

by:
ADSaunders earned 500 total points
ID: 13652309
Hi,
I once had to index the text of a PDF file. I no longer have the source, but I hacked it together from the sources in Xpdf: http://www.foolabs.com/xpdf/
.. Alan
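
If linking against the Xpdf sources is more than is needed, one lighter option is to call Xpdf's `pdftotext` command-line tool from C++ and read its output. The following is only a sketch under some assumptions: `pdftotext` is installed and on the PATH, the platform provides POSIX `popen`, and the helper names (`shell_quote`, `pdftotext_command`, `run_and_capture`, `extract_pdf_text`) are made up for illustration and are not part of Xpdf.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Wrap a string in single quotes for a POSIX shell, escaping embedded quotes.
std::string shell_quote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'') out += "'\\''";  // close quote, escaped quote, reopen
        else out += c;
    }
    out += "'";
    return out;
}

// Build the command line; passing "-" as the text file sends the text to stdout.
std::string pdftotext_command(const std::string& pdf_path) {
    return "pdftotext " + shell_quote(pdf_path) + " -";
}

// Run a shell command and return everything it wrote to stdout.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");  // POSIX, not standard C++
    if (!pipe) throw std::runtime_error("popen failed");
    std::string output;
    std::array<char, 4096> buf;
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), pipe)) > 0) {
        output.append(buf.data(), n);
    }
    if (pclose(pipe) != 0) {
        throw std::runtime_error("command failed: " + command);
    }
    return output;
}

// Hypothetical convenience wrapper: extract the text of one PDF file.
std::string extract_pdf_text(const std::string& pdf_path) {
    return run_and_capture(pdftotext_command(pdf_path));
}
```

Calling `extract_pdf_text("report.pdf")` (a placeholder file name) would return the extracted text as one string, or throw if the tool is missing or reports an error. Shelling out is slower than using a library in-process, but it avoids building the Xpdf sources into your own project.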
 
LVL 6

Assisted Solution

by:guitaristx
guitaristx earned 500 total points
ID: 13652325
Adobe's specification for the PDF file format is available here:
http://partners.adobe.com/public/developer/pdf/index_reference.html

There are plenty of commercial toolkits available to view and manipulate PDF files; however, your question might be a bit over-simplified. There's a LOT to the PDF file format, and the complexity of the commercial toolkits reflects that. Can you be more specific about what you need?
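
To give a feel for where hand-parsing would start: a PDF file begins with a header line of the form `%PDF-1.4`. The sketch below only reads that version string from the first bytes of a file already loaded into memory; the function name `pdf_header_version` is made up for illustration. Everything past the header (the object table, compressed streams, font encodings) is where the real work lies, which is why a toolkit is usually the better route for text extraction.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Return the version from a leading "%PDF-x.y" header,
// or an empty string if the data does not start with one.
std::string pdf_header_version(const std::string& data) {
    const std::string magic = "%PDF-";
    if (data.compare(0, magic.size(), magic) != 0) return "";
    std::size_t end = magic.size();
    // The version is a short run of digits and dots, e.g. "1.4".
    while (end < data.size() &&
           (std::isdigit(static_cast<unsigned char>(data[end])) || data[end] == '.')) {
        ++end;
    }
    return data.substr(magic.size(), end - magic.size());
}
```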

Question has a verified solution.
