  • Status: Solved

Looking for OCR software suggestions

Hello,
Here's my scenario. I have a few thousand PDFs of scanned hand-written documents. I want to capture the data in these documents using OCR software. Any suggestions on OCR software for this type of thing? Being that everything is hand-written, I definitely want something that's going to handle handwriting pretty well. I've not done a whole lot with OCR in the past, so any other suggestions or tips are welcome.

Thanks!!
Asked by: Haze0830
1 Solution
 
Joe Winograd, EE MVE 2015 & 2016, Developer, commented:
To be clear on terminology, the term OCR (Optical Character Recognition) is typically used when the source doc contains typewritten material, while the term ICR (Intelligent Character Recognition) is typically used when the source doc contains handwritten material.

ICR that I've used works reasonably well on carefully written, block letter (upper case) printing, is mediocre on upper and lower case printing, and performs badly on cursive writing. In fact, cursive is a distinct enough issue that some folks are using a relatively new term to describe it – Intelligent Word Recognition (IWR).

So the first question for you is this: What is the handwriting like on your few thousand PDFs? Given the nascent state of IWR, if there's a lot of cursive writing on your docs, you're likely to be very disappointed in the recognition accuracy. Regards, Joe
 
Haze0830 (Author) commented:
Ah, very good to know. Thank you.

In my case it would be upper and lower case letters. More numerical entries than words though.

No cursive.
 
Joe Winograd, EE MVE 2015 & 2016, Developer, commented:
In recent years, the bulk of my OCR/ICR has been with ABBYY FineReader and Nuance OmniPage:

http://www.abbyy.com/
http://nuance.com/for-individuals/by-product/omnipage/index.htm

My guess is that they'll both do a mediocre job on your docs. Last time I looked, ABBYY had a free trial, while Nuance did not, but Nuance does provide a 30-day, money-back guarantee (which I know for a fact that they honor), so you may try them both risk-free to see how they do. I already have both installed and I'd be happy to run a test for you on a few docs to compare their recognition accuracies (be careful not to post any docs with sensitive/private info - redact them if necessary).
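
For anyone who wants to put a number on that kind of comparison, here is a minimal sketch in Python. It assumes you hand-type a small sample page as ground truth and export each program's recognized text as plain text; it is not a feature of FineReader or OmniPage. The names `edit_distance`, `character_accuracy`, `engine_a`, `engine_b`, and the sample strings are all hypothetical. The score is one minus the Levenshtein edit distance divided by the length of the ground truth, which is one common way to express character accuracy.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts,
    deletes, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, recognized: str) -> float:
    """1 - (edit distance / ground-truth length), floored at 0."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(truth, recognized) / len(truth))


# Hypothetical sample: one hand-typed line and two made-up recognition results.
truth = "Qty 120 Price 4.75"
engine_outputs = {
    "engine_a": "Qty 12O Price 4.75",
    "engine_b": "Oty 1Z0 Prlce 4.7S",
}

for name, text in engine_outputs.items():
    print(f"{name}: {character_accuracy(truth, text):.1%}")
```

Since the docs in question are mostly numeric, scoring a handful of pages this way would show whether digit confusions (0 vs. O, 5 vs. S) dominate the errors before committing to either product.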

I'm heading to a meeting now, so won't be able to reply for a few hours, but will check for a message from you as soon as I return. Regards, Joe
 
Haze0830 (Author) commented:
Well, after having spent the remainder of yesterday demoing FineReader and the first half of today playing with OmniPage 18, my thoughts are as follows:

This will not be easy, or, dare I say, even possible.

FineReader did a much better job than OmniPage, but that didn't take much, being that OmniPage crashes every single time I try to make a correction during the Proofing process. EVERY...SINGLE....TIME. So I really can't even say how well OmniPage works - because apparently the software just doesn't work...

FineReader did OK once I went through and proofed an entire document, but only for that particular document. If I scanned another document written by the same person, it still wasn't very accurate.

I'm going to see if I can figure out what the deal is with OmniPage...
 
Joe Winograd, EE MVE 2015 & 2016, Developer, commented:
I was expecting the low accuracy with both FR and OP, but was not expecting OP to crash. I don't recollect having many, if any, crashes with OP18 (or with earlier versions of OP, and I've had several). As you attempt to figure out what the deal is with OP, you may want to take advantage of the new Nuance OP forum (it's free to join):
http://nuance-community.custhelp.com/hives/dfe2a72c54/summary

It went live just last month, so there's not a lot of activity yet, but I know there are some knowledgeable folks involved with it. Worth a shot. Regards, Joe
