Then you need some OCR software, as CodedUK has pointed out above. Note that some scanners come with OCR software - you may already have it, without realising it - either on the original CDs that came with the scanner, or possibly as an unused and unrecognised program on your PC.
If it turns out you need to buy OCR software, then I would recommend ABBYY FineReader. You can buy it new (see http://www.abbyy.com/); the current release is 8.1, though I use v5.0, which I find excellent (and it runs fine on Windows XP). My copy came free with a PC magazine, but you will easily find it at a reasonable price on eBay. Any of the recent versions will do the job.
The other thing you will need to do, once you have OCR software up and running, is make sure your input JPEG files are suitable. For OCR to work well, the following requirements are generally essential:
a) the text has to be clear - nice, well formed black letters on a clean white background is what you are aiming for. Smudges, paper tears, tea stains will all cause you problems.
b) the text needs to be straight. In other words, if the lines are at an angle with respect to the straight edges of the paper, you will get "read errors". Some OCR software has a "deskew" function built in (FineReader does).
c) the language of the text needs to be one that your OCR program recognises. It isn't just reading individual letters; it is trying to recognise whole words, and so it has a dictionary built in. Naturally you need to tell it which language the input text is in.
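To give a feel for what a "deskew" function has to work out (point b above), here is a minimal sketch of one common approach, the projection-profile method. This is an illustration only, not how FineReader or any particular product does it. The function name `estimate_skew_degrees`, its parameters, and the idea of passing the page as a list of (x, y) coordinates of black pixels are all assumptions made for the example. It tries a range of candidate angles and keeps the one at which the black pixels pile up into the sharpest horizontal rows.

```python
import math


def estimate_skew_degrees(black_pixels, max_angle=5.0, step=0.5):
    """Estimate the skew of text lines from (x, y) black-pixel coordinates.

    Hypothetical helper for illustration. y grows downwards, as in most
    image formats, so a positive result means lines slope down to the right.
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(max_angle / step))
    for i in range(-steps, steps + 1):
        angle = i * step
        slope = math.tan(math.radians(angle))
        # Shear every pixel back by the candidate angle and count
        # how many pixels land in each row.
        rows = {}
        for x, y in black_pixels:
            r = round(y - x * slope)
            rows[r] = rows.get(r, 0) + 1
        # Straight text concentrates pixels in few rows, which
        # makes the sum of squared row counts large.
        score = sum(count * count for count in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


if __name__ == "__main__":
    # Synthetic "page": three thin lines of text, tilted by 2 degrees.
    tilt = math.tan(math.radians(2.0))
    page = [(x, round(base + x * tilt))
            for base in (20, 40, 60) for x in range(200)]
    print(estimate_skew_degrees(page))
```

Once the angle is known, the page would be rotated by the opposite amount before being handed to the OCR engine. Real scans are noisier than this synthetic page, so a practical deskew step would normally binarise the image first and try finer angle steps.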
One thing to be wary of when using JPEG files is the so-called artefacts that JPEG compression adds to images. These may also cause errors when OCR'ing. They can be reduced by setting the JPEG compression to "high quality" or "low compression" when you are taking the scan.
Hope that helps
Richard
by: CodedK, posted on 2007-02-22 at 11:45:59, ID: 18590761
Hi sal1150.
I think you want to build an OCR application.
Try the following components:
http://www.imagelib.com
http://www.leadtools.com
http://www.pegasustools.co
http://www.simpleocr.com
Hope this helps :)