

I need to OCR a batch of PDFs and rename each file according to a keyword

Posted on 2011-10-10
Medium Priority
Last Modified: 2013-11-10
Here is the scenario:
I have a batch of PDF files that need to be scanned and saved to a folder. They also need to be OCR'd and then renamed to a particular number that is found in the file itself, so that our internal software can access these scanned files by that number.

Is there all-in-one software for this job?
Question by:vvajjhala

Accepted Solution

Paul Sauvé earned 336 total points
ID: 36950678
"I have a batch of PDF files that need to be scanned and saved to folder."
Either you have PAPER documents that you want to scan (digitize) as PDF files OR you have PDF files (already scanned) that you want to sort and rename! Why would you want to "scan" a PDF document?
"They also need to be OCR'd"
Perhaps the documents have been scanned as "images" of text and you want to transform them into text PDFs?
"then renamed to particular number that is found in the file itself"
Is the "particular number" in the same place in each file?
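
If the number always follows the same label in the OCR text, a pattern match may be enough to pull it out. Here is a minimal Python sketch of that idea. The "Invoice No" label, the minimum of four digits, and the function name are all hypothetical; nothing in the question says what the documents actually look like, so adjust the pattern to your own files.

```python
import re
from typing import Optional

# Hypothetical label: replace "Invoice" with whatever precedes the number
# in your documents. Accepts "Invoice No: 123456", "invoice # 123456", etc.
NUMBER_PATTERN = re.compile(
    r"Invoice\s*(?:No|Number|#)\.?\s*[:\-]?\s*(\d{4,})",
    re.IGNORECASE,
)


def find_document_number(ocr_text: str) -> Optional[str]:
    """Return the first number that follows the label, or None if absent."""
    match = NUMBER_PATTERN.search(ocr_text)
    return match.group(1) if match else None
```

The number is returned as a string so that leading zeros survive. A None result is a signal to set the file aside for manual review rather than guess at a name.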

Assisted Solution

gimosuby earned 332 total points
ID: 37106947
Kofax has software that can do that; we use it here at work. It's called Kofax Capture. However, you have to be willing to spend the money and time to set it up.

Alternatively, if you know some programming, you could use Tesseract (http://code.google.com/p/tesseract-ocr/) as the OCR engine and Pdftk (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) to split up the PDFs at the right pages. This solution won't cost you a penny, but it'll definitely cost you time.
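
To give an idea of what the do-it-yourself route could look like, here is a rough Python sketch, not a finished tool. It makes several assumptions that are not part of the suggestion above: it rasterizes the first page with pdftoppm from Poppler (a swapped-in tool, since Tesseract reads images rather than PDFs), it expects both pdftoppm and tesseract on the PATH, it assumes the number appears on page 1, and the function names and the regex you pass in are hypothetical.

```python
import re
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Optional


def ocr_first_page(pdf_path: Path) -> str:
    """OCR page 1 of a PDF. Assumes `pdftoppm` and `tesseract` are on PATH."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        # Rasterize only the first page at 300 dpi as a PNG image.
        subprocess.run(
            ["pdftoppm", "-r", "300", "-f", "1", "-l", "1", "-png",
             str(pdf_path), str(prefix)],
            check=True,
        )
        image = next(Path(tmp).glob("page*.png"))
        # "stdout" as the output base makes tesseract print the text.
        result = subprocess.run(
            ["tesseract", str(image), "stdout"],
            check=True, capture_output=True, text=True,
        )
        return result.stdout


def rename_by_number(
    pdf_path: Path,
    pattern: str,
    ocr: Callable[[Path], str] = ocr_first_page,
) -> Optional[Path]:
    """Rename pdf_path to <number>.pdf; return the new path, or None if skipped.

    `pattern` must contain one capture group around the number.
    """
    match = re.search(pattern, ocr(pdf_path))
    if match is None:
        return None  # no number found: leave the file for manual review
    target = pdf_path.with_name(match.group(1) + ".pdf")
    if target.exists():
        return None  # never overwrite an existing file
    pdf_path.rename(target)
    return target
```

To process a folder, you would loop over its *.pdf files and call rename_by_number on each with your own pattern. Files that come back as None are the ones to check by hand, either because OCR found no number or because the target name was already taken.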

If you want a free GUI for OCR, there is FreeOCR (http://www.freeocr.net/). It uses Tesseract as its OCR engine.

Hope this helps.

Assisted Solution

by:Bryan Butler
Bryan Butler earned 332 total points
ID: 37154228
If you need to convert to text to grab the number for the name, this post talks about converting PDF to text:



Expert Comment

by:James Murrell
ID: 37283973
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
