Solved

Poll a directory for PDF then OCR it

Posted on 2010-08-17
Last Modified: 2012-05-10
Is there any software available that would poll a directory and, when a non-PDF file is found, OCR the file and then move it to another folder?
Question by:QGolden
2 Comments
 
LVL 2

Expert Comment

by:tsaikkala
ID: 33453096
You didn't include enough information about the system you are operating in, so I will make some assumptions, such as that you're using Windows. My answer will cover your question, but will probably lead you to ask more about the specifics in another area of Experts Exchange.

You can create a VBScript (.vbs) or batch (.bat) file to do the "file checking". Either of these methods is preferred because it is then a simple process to send each detected file to a command-line OCR program such as SimpleOCR (www.simpleocr.com), and then to send the raw text output of the OCR to a command-line PDF program such as Text to PDF (verypdf.com/txt2pdf/index.htm).
Both of these programs have documentation that gives you the format of the input, for easy parsing.
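
As a rough illustration of the polling idea described above, here is a minimal sketch written in Python rather than VBScript or batch. It is not the script suggested in this thread, only one possible shape for it. The names `process_once`, `poll_forever`, and `ocr_command` are hypothetical, and the sketch assumes your OCR tool accepts an input path and an output PDF path as its last two arguments. Check the documentation of whichever tool you use, since its real command-line syntax may differ.

```python
import shutil
import subprocess
import time
from pathlib import Path


def find_non_pdf_files(watch_dir):
    """Return the regular files in watch_dir whose extension is not .pdf."""
    return sorted(
        p for p in Path(watch_dir).iterdir()
        if p.is_file() and p.suffix.lower() != ".pdf"
    )


def process_once(watch_dir, done_dir, ocr_command):
    """Run ocr_command on each non-PDF file, then move the file to done_dir.

    ocr_command is a list such as ["my-ocr-tool"] (a hypothetical name).
    The input path and the output PDF path are appended as the last two
    arguments; this argument order is an assumption, not a real tool's API.
    """
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    processed = []
    for src in find_non_pdf_files(watch_dir):
        out_pdf = done / (src.stem + ".pdf")
        result = subprocess.run(ocr_command + [str(src), str(out_pdf)])
        if result.returncode != 0:
            continue  # leave the file in place so the next pass retries it
        shutil.move(str(src), str(done / src.name))
        processed.append(src.name)
    return processed


def poll_forever(watch_dir, done_dir, ocr_command, interval_seconds=10):
    """Check the watched folder every interval_seconds until interrupted."""
    while True:
        process_once(watch_dir, done_dir, ocr_command)
        time.sleep(interval_seconds)
```

You would start it with something like `poll_forever("C:/scans/in", "C:/scans/done", ["my-ocr-tool"])`, where both folder paths and the tool name are placeholders. One thing this sketch does not handle is a file that is still being copied into the folder when a pass runs, so a real script may want to wait until a file's size stops changing before processing it.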

I feel this covers the question you asked here, but for the specifics of the script coding, you may want to ask a separate question in the Windows Batch section of Experts Exchange:
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Batch/
 

Accepted Solution

by:
QGolden earned 0 total points
ID: 33453221
Thanks, I will look into that, but I was hoping for a single piece of software that would do the job. I found that ABBYY FineReader Corporate has a hot folder option, which monitors a folder and creates an OCR'd PDF once a file is moved into that monitored folder. But it costs $500, and I was hoping for something a little less pricey.
