mikecox_ (United States of America) asked:

make a PDF file searchable

How can I convert an unsearchable PDF file into a searchable one?
John (Canada)

Rescan it and then during the scan, select the scan option to make the document searchable. I do this and it works.

Once the PDF has been scanned as an image only, it cannot be converted to searchable. So re-scan it.
SOLUTION
Joe Winograd (United States of America)

This solution is only available to members.
I tried on my simple minded scanner and that did not seem to work. Good luck if your scanner supports what Joe says and works.
> Good luck if your scanner supports what Joe says and works.

John,
It has nothing whatsoever to do with the scanner. It is OCR software that runs on the file after it has already been scanned. There are many, many software packages that can OCR existing (already scanned-in) files, such as ABBYY FineReader, PaperPort, Power PDF, and OmniPage, and the list goes on and on.

As I recollect, you have Adobe Acrobat. Try this: open an unsearchable/image-only PDF in it that your "simple minded scanner" created, then select Tools, then Text Recognition or Recognize Text, depending on which version of Acrobat you have. It will create text with its OCR process right in the PDF, which you'll be able to search, as well as copy/paste into Notepad, Word, etc.

This is completely unrelated to scanners/scanning (although, of course, lots of scanning software, such as the products mentioned above, can also OCR at scan time).

Regards, Joe
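For anyone who would rather script this step than click through a GUI, here is a minimal sketch of running OCR on an already-scanned PDF. It assumes OCRmyPDF, an open-source command-line tool that is not mentioned anywhere in this thread, is installed and on the PATH. The function names and file names are made up for illustration.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path, language: str = "eng") -> list[str]:
    """Build an OCRmyPDF command line that adds a text layer to a scanned PDF."""
    return [
        "ocrmypdf",
        "--skip-text",   # leave pages that already contain text untouched
        "-l", language,  # Tesseract language code, e.g. "eng"
        str(src),
        str(dst),
    ]


def make_searchable(src: Path, dst: Path) -> bool:
    """Run the OCR command if OCRmyPDF is installed; return True on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # assumption: the tool must be installed separately
    result = subprocess.run(build_ocr_command(src, dst), check=False)
    return result.returncode == 0
```

Calling `make_searchable(Path("scanned.pdf"), Path("searchable.pdf"))` (hypothetical file names) would leave the original file alone and write a new copy with a searchable text layer, which is the same idea as Acrobat's Recognize Text: no rescanning involved.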
I just scan to Adobe PDF and do not have a bunch of tools. I had a client with ABBYY FineReader. Cheaper and faster just to re-scan (for me at any rate).

So I was posting from a very simple minded approach. I am sure you are correct, but I only use very simple minded approaches.
I have Adobe open. No Recognize Text. I think that must be Adobe Pro.
Our posts just crossed...but you do have Acrobat...right? Not just Reader...but full Acrobat? If so, try what I suggested above on an image-only PDF — Tools>Text Recognition (or Recognize Text).
Our posts crossed again. It doesn't have to be Acrobat Pro. It can be Acrobat Standard. But it cannot be Adobe Reader.
I am trying, but no such thing in regular Adobe Acrobat.
OK, it is well hidden under Enhanced Scans. I will try it later.

I curse the day Microsoft "categorized" things and all vendors followed like lemmings. Nothing can be found anymore.
Mike - I am done now, over my head, and so over to Joe.
Maybe these screenshots will help:

Acrobat X Standard: [screenshot]
Acrobat XI Pro: [screenshot]
mikecox_ (ASKER)

This is a rather large PDF file; it's the CC&Rs of my condo association and it's the document our attorney provided. I have the OCR software but I can't imagine having to print, then scan all the pages from the PDF file into it. I was hoping that there was a program that would simply convert the file into a digital document that is searchable. It seems to me that I should be able to load that file into my OCR program and let it make the conversion.
> It seems to me that I should be able to load that file into my OCR program and let it make the conversion.

Yes, you should be able to do that. If you can't, the problem is with your OCR software. What OCR software do you have? If you can't get it to work with your OCR software, then read my earlier post — it explains exactly how to do what you want with free software. Regards, Joe
SOLUTION
This solution is only available to members.
ASKER CERTIFIED SOLUTION
This solution is only available to members.
I don't know if this is Kosher but since I think I had the best answer I'm selecting that as the best one.  Joe's was the next best.  I thank you all for your effort.
> As suggested I tried to highlight some text and cannot

We knew from your first post that you would not be able to highlight text and copy/paste it, because you said it is "an unsearchable PDF file", meaning there's no text in it to search...or highlight! So it was clear from your opening comment that it is an image-only (probably scanned-in) PDF.
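If you want to check this without trying to highlight text by hand, here is a rough sketch. It assumes the `pdftotext` utility from Poppler is installed (it is not mentioned in this thread), and the function names are hypothetical. The idea is simply that an image-only PDF yields no extractable text.

```python
import shutil
import subprocess


def has_text_layer(extracted: str) -> bool:
    """Return True if the extracted text contains anything besides whitespace.

    pdftotext separates pages with form feeds, which strip() also removes.
    """
    return bool(extracted.strip())


def extract_text(pdf_path: str) -> str:
    """Extract text with Poppler's pdftotext, writing to stdout ('-')."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (Poppler) is not installed")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With these, `has_text_layer(extract_text("ccrs.pdf"))` (hypothetical file name) returning False would suggest an image-only PDF that needs OCR. This is only a heuristic: a file that mixes scanned pages with normal pages would still report True.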

> the .doc file he scanned

Scanning a DOC file to PDF is unnecessary. In most versions of Word, you can Save As to a PDF file. And if that's not available, there are many free PDF print drivers out there, such as Bullzip, CutePDF Writer, and doPDF. And a big advantage of these methods (Save As and Print to a PDF print driver from Word) is that they create a PDF Normal file, which has the text that may be copied/pasted/searched.

> pay a subscription fee

Yes, true for Acrobat, but that was why I gave you the link to my 5-minute EE video Micro Tutorial, How to OCR pages in a PDF with free software:
https://www.experts-exchange.com/videos/1618/

> Finally, as I suggested above, it is possible to load a PDF file into an OCR program.

Yes, as I mentioned in my first post.

> I don't know if this is Kosher but since I think I had the best answer I'm selecting that as the best one. Joe's was the next best. I thank you all for your effort.

Yes, it's Kosher to select your own post. Here's a member article that discusses it:
https://www.experts-exchange.com/articles/27139

And here's an EE support article that discusses it:
http://support.experts-exchange.com/customer/portal/articles/626862

Regards, Joe
> asked why couldn't I just load the entire PDF file into it, but the question didn't appear to get noticed

Mike, that question did get noticed, and I replied with this (in post #a41945230):
> It seems to me that I should be able to load that file into my OCR program and let it make the conversion.

Yes, you should be able to do that. If you can't, the problem is with your OCR software. What OCR software do you have? If you can't get it to work with your OCR software, then read my earlier post — it explains exactly how to do what you want with free software. Regards, Joe
I have an OCR program and in a follow-up question asked why couldn't I just load the entire PDF file into it, but the question didn't appear to get noticed, so I tried it and it worked.
Mike,
Glad to hear that you tried it and it worked. As I mentioned earlier, your question did get noticed, and I replied in post #a41945230. In any case, great news that it's working! Regards, Joe
Thanks for the follow-up comments. I appreciate them and your efforts to help resolve this issue for me.
You're welcome, Mike — happy to help. I'm really glad to hear that the issue is resolved.