  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 979

PDF parser

Hi, Experts! I want to parse PDF files (each file contains only one A4 page) and extract the text values and image objects. Is there a suitable component for this task?
Alexander_Savenko
1 Solution
 
Tomas Helgi Johannsson Commented:
Hi!

Here is an example of how to get the text from a PDF file: http://www.swissdelphicenter.ch/torry/showcode.php?id=2169
There are also several components on Torry (www.torry.net); just type "PDF" into the Quick Search.
You could also import the Adobe Reader OCX file to gain access to the functions you need to extract
the text and images (similar to the example above).

Regards,
   Tomas Helgi
 
EddieShipman Commented:
You can take the free .NET iText library (iTextSharp), wrap it as a DLL or ActiveX component, and use it in your Delphi application.
http://itextsharp.sourceforge.net/
 
den4b Commented:
Use the freeware package Xpdf: http://www.foolabs.com/xpdf/download.html

It includes several command-line tools compiled for Windows:
 * pdftotext.exe  - extracts text
 * pdfimages.exe  - extracts images
 * pdfinfo.exe  - extracts the PDF info tags (title, author, and so on)
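
As a rough illustration of how these three tools could be driven from a program, here is a minimal Python sketch. It assumes the Xpdf executables are on the PATH; the helper names (pdftotext_cmd, pdfimages_cmd, pdfinfo_cmd, parse_pdfinfo, extract) are made up for this example, and the -layout and -j options are optional choices, not requirements.

```python
import subprocess
from pathlib import Path


def pdftotext_cmd(pdf, txt):
    # Usage: pdftotext [options] <PDF-file> [<text-file>]
    # -layout tries to preserve the physical layout of the page.
    return ["pdftotext", "-layout", str(pdf), str(txt)]


def pdfimages_cmd(pdf, image_root):
    # Usage: pdfimages [options] <PDF-file> <image-root>
    # -j writes JPEG (DCT) images as .jpg files instead of converting them.
    return ["pdfimages", "-j", str(pdf), str(image_root)]


def pdfinfo_cmd(pdf):
    # Usage: pdfinfo [options] <PDF-file>
    return ["pdfinfo", str(pdf)]


def parse_pdfinfo(output):
    """Turn pdfinfo's 'Key:   value' lines into a dict of strings."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, since values such as dates contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def extract(pdf, out_dir):
    """Run all three tools on one PDF (hypothetical helper, assumes tools on PATH)."""
    pdf = Path(pdf)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    txt = out_dir / (pdf.stem + ".txt")
    subprocess.run(pdftotext_cmd(pdf, txt), check=True)
    subprocess.run(pdfimages_cmd(pdf, out_dir / pdf.stem), check=True)
    result = subprocess.run(
        pdfinfo_cmd(pdf), check=True, capture_output=True, text=True
    )
    return txt, parse_pdfinfo(result.stdout)
```

The same command lines can be launched from a Delphi application, for example through the Windows CreateProcess API, and the resulting text file and image files read back afterwards.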
