Solved

Converting pdf to searchable text format

Posted on 2004-08-11
Last Modified: 2006-11-17
Hi all,

I have a PDF file that seems to consist only of scanned page images (I can't search for specific words). I would like to convert it to a PDF in which I can search the text. Is there some kind of OCR package that would do this?

Thanks,
Freerider.
Question by:Freerider
 
LVL 11

Expert Comment

by:lbertacco
ID: 11770899
If you have Office 2003, you can print the file to the "Microsoft Office Document Image Writer" printer, then open the result with "Microsoft Office Document Imaging" and click Tools -> Send Text to Word.
 
LVL 44

Accepted Solution

by:Karl Heinz Kremer (earned 100 total points)
ID: 11770962
You can use Adobe Acrobat (the full version): it comes with "Paper Capture", which is an OCR engine. If you don't have Acrobat, other options are ScanSoft's OmniPage Pro (http://www.scansoft.com/omnipage/) or ABBYY FineReader (http://www.abbyy.com/finereader/).
You have several options when you convert your image-only PDF. You can convert everything to "real" text and graphics, but that may not be your best solution, because you will very likely end up with a mix of recognized text and unrecognized text, and the unrecognized text stays as a scanned image. The characters in your text then switch back and forth between real characters and scanned images, and this is visible even to the untrained eye. You can avoid this by selecting "image with hidden text": the original scanned image is used for display and printing, and the recognized text is stored behind the image (in the correct location). This means that you can index and search the document. When you find a term, the correct section of the document is highlighted, but you still see the high-quality scan that you started with when you view or print the document.
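To make the "image with hidden text" idea concrete, here is a rough sketch in Python. It is only a toy model, not how Acrobat or FineReader store things internally: the Word and Page classes, the file names, and the coordinates are all made up for illustration. The point is that each recognized word carries its position on the page, so a search can return the rectangle to highlight while the scan itself stays untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    text: str
    box: tuple  # (x0, y0, x1, y1) position of the word on the page (hypothetical units)


@dataclass
class Page:
    image_name: str  # the scan that is shown and printed (hypothetical file name)
    hidden_words: list = field(default_factory=list)  # OCR result stored "behind" the image


def search(pages, term):
    """Return (page_number, box) for every hidden word that matches term."""
    term = term.lower()
    hits = []
    for number, page in enumerate(pages, start=1):
        for word in page.hidden_words:
            if word.text.lower() == term:
                hits.append((number, word.box))
    return hits


# Two scanned pages with made-up OCR results.
pages = [
    Page("scan-001.png", [Word("Annual", (72, 700, 130, 715)),
                          Word("Report", (135, 700, 190, 715))]),
    Page("scan-002.png", [Word("report", (72, 650, 120, 665))]),
]

hits = search(pages, "report")
for page_number, box in hits:
    print(f"highlight {box} on page {page_number}")
```

The viewer would draw the highlight at the returned rectangle on top of the scanned image, which is why the quality of what you see and print does not depend on how well the OCR did.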
 

Author Comment

by:Freerider
ID: 11863285
Thanks khkremer,
FineReader does the job. The only problem I have now is that the bookmarks from the original document have been removed. Any idea how to get them back? I've downloaded a few trial programs, but nothing seems to work.

Freerider.
 
LVL 44

Expert Comment

by:Karl Heinz Kremer
ID: 11863758
Try this: open the original file (with the bookmarks) in Acrobat, then select Document > Pages > Replace and choose to replace all pages with the pages from your OCR'ed document. Replacing pages swaps only the page content, so the bookmarks in the original file should stay in place.
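Why this keeps the bookmarks can be sketched with a small Python model. Again, this is only an illustration under simplifying assumptions, not Acrobat's API: Bookmark, Document, and replace_all_pages are hypothetical names. The assumption is that bookmarks live at the document level and refer to pages by position, so swapping in new page content leaves them pointing at the same page numbers.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    title: str
    page_index: int  # position of the target page, 0-based


@dataclass
class Document:
    pages: list      # page content, here just strings
    bookmarks: list  # document-level outline


def replace_all_pages(original, ocr_pages):
    """Return a document with the OCR'ed pages and the original bookmarks."""
    if len(ocr_pages) != len(original.pages):
        # With a different page count the bookmarks would point at the wrong pages.
        raise ValueError("page counts differ")
    return Document(pages=list(ocr_pages), bookmarks=list(original.bookmarks))


original = Document(
    pages=["scan of page 1", "scan of page 2"],
    bookmarks=[Bookmark("Introduction", 0), Bookmark("Chapter 1", 1)],
)
ocr_pages = ["scan of page 1 + hidden text", "scan of page 2 + hidden text"]

merged = replace_all_pages(original, ocr_pages)
print([b.title for b in merged.bookmarks])
```

The page-count check reflects the one thing to watch for in practice: the OCR'ed file needs the same pages in the same order as the original, otherwise the bookmarks end up on the wrong pages.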