Solved

Dump PDF to Text ?

Posted on 2010-11-26
Last Modified: 2012-05-10
I'm looking for a free or open-source component to dump a PDF to text.

Please do not post unless you have a free or opensource method of accomplishing this.

Thanks
Question by:Looking_4_Answers
13 Comments
 

Author Comment

by:Looking_4_Answers
ID: 34220496
Sorry, Delphi 2010
0
 

Author Comment

by:Looking_4_Answers
ID: 34220498
Actually, I would be more interested in plain ole code than in a component.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 34223685
@Looking

What criteria do you have for the resulting text file?

How was the PDF created? The possible solutions differ depending on that.
0
 
LVL 26

Expert Comment

by:EddieShipman
ID: 34226186
http://www.swissdelphicenter.ch/en/showcode.php?id=2169

Here's the German->English translation of the comments (via Google Translate, tidied up):

I've finally found a solution for reading the entire text from a PDF file (it also works with multiple pages).
I apologize in advance for my messy programming, but I hope you can still make some use of it! The form contains a TMemo, 5 TLabels, 1 TButton, and an open dialog.

Oh yes, you first have to import a type library. Open the type library import dialog (under the Project menu), click Add, and select the Adobe Acrobat folder. There you should find a file named Acrobat.tlb; if not, just search for it.
Then install it and include the unit, and you're done.
Have fun.

I, personally, have not tried this.
0
 

Author Comment

by:Looking_4_Answers
ID: 34227495
@aikimark:

No criteria, just dump the entire PDF to text.

@EddieShipman:

Thanks, I will give that a try.
0
 
LVL 32

Expert Comment

by:ewangoya
ID: 34227743
0
 
LVL 45

Expert Comment

by:aikimark
ID: 34227881
@Looking

If the PDF only has an image layer, then there is an extra step required to recognize the characters/words in the image and add a text layer to the PDF.  People make this mistake when they convert TIFF images to PDFs and then wonder why they don't find anything when they do an Adobe search for words.

Did you look at the PDFText utility?
http://www.glyphandcog.com/textext.html
0
 

Author Comment

by:Looking_4_Answers
ID: 34230927
@aikimark:

Yes, I looked at it. It is not free, at least not the portion (the command-line utility) that would be useful to me.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 34231095
For some reason, I'm only seeing XPDFText at that site.  I'll look for the free PDFText utility.
0
 

Author Comment

by:Looking_4_Answers
ID: 34231142
Thanks. Right now, I am looking at these two free options:

http://www.foolabs.com/xpdf

http://mupdf.com 
0
 
LVL 45

Expert Comment

by:aikimark
ID: 34235669
I'm pretty sure that I got my utility from foolabs. The README documentation states that it is open source.
http://www.foolabs.com/xpdf/download.html
0
 

Author Comment

by:Looking_4_Answers
ID: 34239902
So, which file do I download for Windows XP and Delphi 2010?

Also, can you tell me which file to call and how to pass the params?
0
 
LVL 45

Accepted Solution

by:
aikimark earned 500 total points
ID: 34241143
1. Download
ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl5-win32.zip

2. Unzip the contents.

3. Open a command prompt window and navigate to the unzipped directory.

4. Issue the following command:
pdftotext -?

5. Read the displayed text.

6. Play with the different command-line switches using your PDF files until you are satisfied with the result.

7. Use the ShellExecute() function in your application.
References:
http://delphi.about.com/od/windowsshellapi/a/executeprogram.htm
http://www.tek-tips.com/faqs.cfm?fid=5462

=========
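ShellExecute() returns as soon as the program has been launched, so you will probably need to wait/sleep for a moment for the shelled process to finish before you read the output file.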
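
For step 7 and the waiting problem, here is a minimal sketch rather than a definitive implementation. ShellExecute() does not hand back a process handle, so this sketch swaps in ShellExecuteEx() with SEE_MASK_NOCLOSEPROCESS and waits on the handle instead of sleeping for a guessed interval. The function name PdfToText and all three paths are hypothetical, and it assumes the extractor in the xpdf zip is pdftotext.exe, called as "pdftotext input.pdf output.txt" with no extra switches.

```delphi
program PdfDump;

{$APPTYPE CONSOLE}

uses
  Windows, ShellAPI, SysUtils;

// Hypothetical helper: runs pdftotext.exe on PdfFile, writing TxtFile,
// and blocks until the process exits.
// Returns True only if the process started and exited with code 0.
function PdfToText(const ExePath, PdfFile, TxtFile: string): Boolean;
var
  Info: TShellExecuteInfo;
  Params: string;
  ExitCode: DWORD;
begin
  Result := False;

  // Quote both paths so that spaces in file names survive.
  Params := Format('"%s" "%s"', [PdfFile, TxtFile]);

  FillChar(Info, SizeOf(Info), 0);
  Info.cbSize := SizeOf(Info);
  Info.fMask := SEE_MASK_NOCLOSEPROCESS; // request a process handle to wait on
  Info.lpFile := PChar(ExePath);
  Info.lpParameters := PChar(Params);
  Info.nShow := SW_HIDE;                 // no console window flashing up

  if not ShellExecuteEx(@Info) then
    Exit;
  if Info.hProcess = 0 then
    Exit;

  try
    // Wait for the shelled process to finish instead of sleeping.
    WaitForSingleObject(Info.hProcess, INFINITE);
    if GetExitCodeProcess(Info.hProcess, ExitCode) then
      Result := (ExitCode = 0);
  finally
    CloseHandle(Info.hProcess);
  end;
end;

begin
  // Hypothetical paths; adjust them to wherever you unzipped xpdf.
  if PdfToText('C:\xpdf\pdftotext.exe', 'C:\docs\input.pdf',
    'C:\docs\output.txt') then
    Writeln('Text written.')
  else
    Writeln('Conversion failed.');
end.
```

Waiting with INFINITE on the main thread freezes a GUI application until the conversion finishes, so for large PDFs you may prefer to call the helper from a worker thread or to wait with a timeout.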
0
