• Status: Solved

Batch convert PDF to searchable format

I have many files that were scanned to PDF format. I have a copy of Adobe Acrobat 7.0 that came with my scanner. I can use the menu option Recognize Text Using OCR to make my files searchable within Acrobat Reader and with Google Desktop.

I have hundreds of files. Is there any way to automate or script the conversion from image PDF to searchable PDF so that I can do an entire folder at once?

2 Solutions
Choose Advanced>Batch Processing
Click the New Sequence button.
Give the sequence a name.
Click Select Commands
Select all the commands you want the batch sequence to perform
Save it, then run the batch sequence as follows:

Place all the files you wish to process in a single folder on your hard drive.
Choose Advanced>Batch Processing
Select the sequence to run (the one you just saved)
Click OK
Select the folder to process
Select the Output Folder
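
If the scans are spread over several folders, the "single folder" step above could be prepared with a small script. This is only a sketch, not part of Acrobat: the function name gather_pdfs, the "__" separator and both folder arguments are assumptions made for illustration. It copies rather than moves, so the originals stay where they are.

```python
import shutil
from pathlib import Path


def gather_pdfs(source_root, staging_dir):
    """Copy every PDF under source_root into one flat staging folder.

    Returns a dict mapping each staged copy back to its original path.
    staging_dir should live outside source_root so staged copies are
    not picked up again on a later run.
    """
    source_root = Path(source_root)
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)

    mapping = {}
    for pdf in sorted(source_root.rglob("*")):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        # Build a flat name from the relative path so files with the same
        # name in different subfolders do not overwrite each other.
        relative = pdf.relative_to(source_root)
        flat_name = "__".join(relative.parts)
        target = staging_dir / flat_name
        shutil.copy2(pdf, target)
        mapping[target] = pdf
    return mapping
```

The returned mapping records where each staged file came from, which would make it possible to put the processed files back into their original folders afterwards.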


We're using OmniPage Pro for this function. We originally tried a batch convert feature from Adobe. It was expensive and required us to purchase "dongles" regularly.

OmniPage lets us do all our files without limit (we've done over 40,000 documents to date, mostly long lease documents of perhaps 60 pages each).

--Jim Christmas
Forced accept.

Community Support Moderator

Mr Xmas
I've recently been charged with converting thousands of PDF files to searchable PDFs on a client network. Many of these documents are larger than 20 MB. We have purchased OmniPage Pro and I am using the Batch Manager to process the files, but it is taking a crazy amount of time. I suspect I am doing something wrong. I had a single batch of 8 files, about 23 MB each, running over the entire weekend; not even one file is complete, and it looks like it was skipping some. This morning I picked one of the files to process on its own and it appears to be working, but at about 1 page every 15-20 seconds for a document of a few hundred pages. At this rate I will die an old man before the files get completed. What would you recommend?
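
The rate quoted above can be turned into a rough time estimate. The sketch below assumes 300 pages per file, which is only a stand-in for "a few hundred"; the 15 and 20 seconds per page and the batch of 8 files come from the post.

```python
def estimate_hours(pages, seconds_per_page, files=1):
    """Return the estimated OCR wall-clock time in hours."""
    return pages * seconds_per_page * files / 3600


# Assumed page count: "a few hundred" is taken as 300 for illustration.
PAGES_PER_FILE = 300

for rate in (15, 20):
    one_file = estimate_hours(PAGES_PER_FILE, rate)
    batch = estimate_hours(PAGES_PER_FILE, rate, files=8)
    print(f"{rate} s/page: {one_file:.2f} h per file, {batch:.1f} h for 8 files")

# 15 s/page: 1.25 h per file, 10.0 h for 8 files
# 20 s/page: 1.67 h per file, 13.3 h for 8 files
```

Under those assumptions an 8-file batch would need roughly 10 to 13 hours, so a weekend-long run with no completed file may point to a stalled batch rather than slow OCR alone.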

The OCR is what takes a long time, especially if your files are scanned at a fairly high resolution.
One thing to try is copying the files down to the local machine before processing, and having OmniPage save the output files to the local machine as well.
For our process I found this makes a huge difference. In many cases, someone in our department will scan a lease (a 60-page document on average) and the file will be converted a minute or two after they return to their desk.
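
Copying the files down first could be scripted along these lines. This is a minimal sketch, not something OmniPage provides: stage_locally and both folder arguments are hypothetical names, and the size comparison is just a cheap way to avoid recopying files on a second run.

```python
import shutil
from pathlib import Path


def stage_locally(network_root, local_root):
    """Copy PDFs from network_root to local_root, keeping subfolders."""
    network_root = Path(network_root)
    local_root = Path(local_root)
    copied = []
    for pdf in sorted(network_root.rglob("*")):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        target = local_root / pdf.relative_to(network_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Skip files already staged with the same size, so reruns are cheap.
        if target.exists() and target.stat().st_size == pdf.stat().st_size:
            continue
        shutil.copy2(pdf, target)
        copied.append(target)
    return copied
```

The batch job would then be pointed at the local copy for both input and output, and the results copied back to the network share once processing has finished.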

Hope it helps,

Jim Christmas
Is there a way to create a batch so that the original directory structure is kept? We don't want to lose that, and it's incredibly cumbersome to do the batching one little directory at a time. The client is a law firm and we are trying to get all of the old PDFs made searchable, and there are hundreds of directories within directories.

I'm not sure what version of OmniPage you're using. We've got v 15 Pro. When I go into the batch manager and create a new job, I can tell the system to include subfolders by picking a box, then clicking a checkbox right next to the folder. The checkbox itself doesn't say "include subfolders", but the text at the top of the screen tells you that the checkbox is for including subfolders.

Then, in the save options, we use TEXT - PDF with Image on Text. I believe you can also choose to save the files with the original file names into subfolders. I've never actually tried this part myself, but the option does appear to be there on the screen.
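
Since that save option is untried, it may be worth checking afterwards that every input file has a counterpart in the output tree, which would also catch files the batch skipped. The sketch below assumes the output keeps the original file names and subfolders; find_missing_outputs and both folder arguments are hypothetical names.

```python
from pathlib import Path


def find_missing_outputs(input_root, output_root):
    """List input PDFs that have no same-named file under output_root."""
    input_root = Path(input_root)
    output_root = Path(output_root)
    missing = []
    for pdf in sorted(input_root.rglob("*")):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        # The output is expected at the same relative path as the input.
        expected = output_root / pdf.relative_to(input_root)
        if not expected.is_file():
            missing.append(pdf.relative_to(input_root))
    return missing
```

An empty list would mean every PDF under the input folder has a matching output file; anything listed could be queued for another batch run.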

Good luck,

Jim Christmas
