IT79637
asked on
OCR Data Acquisition
Hi Experts,
This is mostly a Linux question, but at the bottom there is also a Windows question.
Does anyone know of non-commercial OCR software that can be configured to look for specific words (data recognition) in a scanned document? For example, on a scanned document I would have the software search for "Purchase Order" or "PO" and other abbreviations of "Purchase Order." The software might return to my program "Found/Not Found", the actual string found, or even the location on the image where the string was found (e.g. [x1,y1] upper left and [x2,y2] lower right).
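To make the request concrete, here is a minimal sketch of the keyword lookup in Python. It assumes an OCR engine has already produced a list of recognized words with bounding boxes in reading order (Tesseract's hOCR output is one example of that kind of data). The `Word` type, the `KEYWORDS` list, and `find_keyword` are hypothetical names made up for illustration, not part of any OCR package.

```python
from typing import NamedTuple

class Word(NamedTuple):
    """One recognized word plus its box (hypothetical format, in pixels)."""
    text: str
    x1: int  # upper-left x
    y1: int  # upper-left y
    x2: int  # lower-right x
    y2: int  # lower-right y

# Assumed keyword variants, each given as a sequence of lowercase tokens.
KEYWORDS = [["purchase", "order"], ["p.o."], ["po"]]

def normalize(text):
    # Lowercase and drop leading/trailing ":" or "#" so "Order:" matches "order".
    return text.lower().strip(":#")

def find_keyword(words, keywords=KEYWORDS):
    """Return (matched_text, (x1, y1, x2, y2)) for the first match, else None.

    Assumes `words` is in reading order, so a multi-word keyword
    appears as consecutive entries.
    """
    tokens = [normalize(w.text) for w in words]
    for i in range(len(words)):
        for kw in keywords:
            if tokens[i:i + len(kw)] == kw:
                span = words[i:i + len(kw)]
                # The keyword's box is the union of the boxes of its words.
                box = (min(w.x1 for w in span), min(w.y1 for w in span),
                       max(w.x2 for w in span), max(w.y2 for w in span))
                return " ".join(w.text for w in span), box
    return None

# Made-up sample data standing in for real OCR output.
sample = [
    Word("Invoice", 10, 10, 80, 30),
    Word("Purchase", 10, 50, 90, 70),
    Word("Order:", 95, 50, 150, 70),
    Word("48213", 160, 50, 220, 70),
]
match = find_keyword(sample)
print("Found" if match else "Not Found", match)
```

A return value of `None` corresponds to "Not Found"; otherwise the caller gets both the string as it appeared on the page and its upper-left and lower-right coordinates.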
Furthermore, can the OCR software scan the immediate area where the "Purchase Order" string was found and return the string it finds there (e.g. the purchase order number)? The "immediate area" could be defined by the program as upper-left & lower-right coordinates.
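As a rough illustration of the "immediate area" idea, the sketch below collects whatever words fall inside a rectangle given by upper-left and lower-right coordinates. It again assumes the OCR step has already yielded words with bounding boxes. The `Word` type and the functions `region_right_of` and `words_in_region` are hypothetical helpers, and the choice to look to the right of the keyword is only an assumption about the form layout.

```python
from typing import NamedTuple

class Word(NamedTuple):
    """One recognized word plus its box (hypothetical format, in pixels)."""
    text: str
    x1: int  # upper-left x
    y1: int  # upper-left y
    x2: int  # lower-right x
    y2: int  # lower-right y

def region_right_of(box, width, pad=5):
    """Build a search region just to the right of a keyword box.

    Assumes the value sits on the same line as the keyword; a form with
    the value below the keyword would need a different region.
    """
    x1, y1, x2, y2 = box
    return (x2, y1 - pad, x2 + width, y2 + pad)

def words_in_region(words, region):
    """Return the text of all words lying entirely inside `region`."""
    rx1, ry1, rx2, ry2 = region
    inside = [w for w in words
              if w.x1 >= rx1 and w.y1 >= ry1 and w.x2 <= rx2 and w.y2 <= ry2]
    # Rough reading order: top to bottom, then left to right.
    inside.sort(key=lambda w: (w.y1, w.x1))
    return " ".join(w.text for w in inside)

# Made-up sample data standing in for real OCR output.
sample = [
    Word("Invoice", 10, 10, 80, 30),
    Word("Purchase", 10, 50, 90, 70),
    Word("Order:", 95, 50, 150, 70),
    Word("48213", 160, 50, 220, 70),
]
keyword_box = (10, 50, 150, 70)  # box of "Purchase Order:" in the sample
region = region_right_of(keyword_box, width=200)
print(region, words_in_region(sample, region))
```

The same `words_in_region` call works for any rectangle the program chooses, so a fixed form layout could simply hard-code one region per field.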
Another Purchase Order example is to find the name & address information of purchaser, delivery date, delivery location, item number, quantity, etc. Very ambitious.
I'm running Ubuntu 7.10 (Gutsy Gibbon), which gives me immediate access to Debian packages. Are there also packages for Fedora, SuSE, Slackware, Mandriva, Gentoo, Xandros, etc., as RPMs or whatever format each uses to manage application software?
Are there any C libraries that aid in (1) OCR and looking for specific strings and (2) scanning a specific area of an image and returning any data found?
For Windows Experts, are there OCX controls to do OCR and data acquisition?
Thanks much!!!
ASKER
The data acquisition part of the question is a very difficult one. I'm looking for a keyword, such as "Purchase Order", on an image, and then want to find the purchase order data around that keyword. That type of intelligence is significantly more difficult than vanilla OCR. The experts' responses pointed me to several Linux-based packages.
Thank you all very much!!!