Searchable Adobe PDF Documents

Posted on 2014-08-28
Last Modified: 2014-09-11
Hey guys,

We received about 50 PDF documents and want to search for text within them. I tried a search, but nothing comes up. These look like scanned PDFs. How can I search for text across all of the documents?
Question by:Cobra25

    Expert Comment

    If the PDFs are scanned, you can't search them without first running OCR (optical character recognition) on them. A scanned PDF is nothing more than a set of page images, so there is no text for a content search to find.

    You can Google "PDF OCR" for free or paid solutions; the quality of OCR products varies greatly.

    Expert Comment

    by:Joe Winograd, EE MVE
    If you have Adobe Acrobat (not Adobe Reader), then you already have OCR. Acrobat calls it Recognize Text in Version X (10) and Text Recognition in Version XI (11). You'll find it in the Tools section. Here's what it looks like in Acrobat XI Pro:

    [Screenshot: Acrobat XI Pro OCR]
    If you don't have Acrobat, I recommend the excellent (and free!) PDF-XChange Editor:

    They also have a PRO (non-free) version, but I think you'll find that the free version does everything you need — including OCR! Regards, Joe

    Author Comment


    Accepted Solution

    One other thought. Doing 50 documents manually would be painful, so you may want to consider a batch processing solution. Here's an EE article that discusses a batch conversion approach using Nuance's Power PDF Advanced:

    It is not free, but is reasonably priced, and as the article shows, they offer a 30-day free trial. Also, if these documents are coming in regularly, you may want to consider a Watched Folder approach, which is available in the same Power PDF Advanced product. Regards, Joe
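
For anyone who would rather script the batch step, here is a minimal sketch. It does not use Power PDF or Acrobat; it assumes the separate open-source `ocrmypdf` command-line tool is installed and on the PATH. The folder names `scanned` and `searchable` and the helper function names are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst):
    """Return the ocrmypdf command line for one file (assumes ocrmypdf is installed)."""
    # --skip-text leaves pages that already contain text untouched
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def plan_batch(in_dir, out_dir):
    """Pair each *.pdf in in_dir with its output path, skipping files already converted."""
    jobs = []
    for src in sorted(Path(in_dir).glob("*.pdf")):
        dst = Path(out_dir) / src.name
        if not dst.exists():
            jobs.append((src, dst))
    return jobs


def run_batch(in_dir, out_dir):
    """OCR every pending PDF and return the list of inputs that failed."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for src, dst in plan_batch(in_dir, out_dir):
        result = subprocess.run(build_ocr_command(src, dst))
        if result.returncode != 0:
            failed.append(src)
    return failed


if __name__ == "__main__":
    # Hypothetical folder names; adjust to wherever the 50 documents live.
    failures = run_batch("scanned", "searchable")
    for path in failures:
        print("OCR failed:", path)
```

Because files that already exist in the output folder are skipped, running the same script on a schedule is a rough stand-in for a watched folder, though it is not equivalent to the Watched Folder feature in Power PDF Advanced. Note that the `*.pdf` pattern is case-sensitive on some systems, so files named `.PDF` may need a second pattern.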

    Expert Comment

    by:Joe Winograd, EE MVE
    Our messages just crossed. Yes, that's exactly what I was talking about in my first post: Adobe Acrobat's Text Recognition. I showed the screenshot from Acrobat XI Professional in that one. Here's the Recognize Text screen from Acrobat X Standard:

    [Screenshot: Acrobat X Std OCR]
    Regards, Joe
