curiouswebster

asked on

Need an OCR tool

I need a tool I can use to digitize a report, like the one attached here...

User generated image
Will this kind of report get a 100% successful conversion rate?

Eventually, I need the tool to be part of my website, but I have not yet chosen my back-end technology. For now, a simple Mac-based tool is fine, just so I can convert a report by hand and start using it while programming the back-end.

Windows is okay if the FREE options for Mac are limited.

I do have Office 365 (Mac) if there is a tool in there which I can use.

I am also interested in hearing what "plug-ins" can work when I deploy this to my website, for online OCR conversions.
David Favor

A simple way to do this is to convert the .pdf files (most statements come in .pdf format) to text so you can parse them, using something like this:

pdftotext -enc ASCII7 -nopgbrk -layout "$file" - >"$outfile" 2>/dev/null


pdftotext comes with the poppler-utils package, which every sensible Linux distro provides, as does MacPorts.
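As a rough sketch of how the conversion and a first parsing pass might be scripted, here is a Python example. It assumes pdftotext is on the PATH. The function names (build_command, pdf_to_text, split_columns, parse_layout_text) and the rule of splitting columns on runs of two or more spaces are illustrative assumptions, not something a particular report format guarantees.

```python
import re
import subprocess


def build_command(pdf_path, txt_path):
    """Return the pdftotext argument list, using the same flags as the command above."""
    return ["pdftotext", "-enc", "ASCII7", "-nopgbrk", "-layout", pdf_path, txt_path]


def pdf_to_text(pdf_path, txt_path):
    """Run pdftotext; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_command(pdf_path, txt_path), check=True,
                   stderr=subprocess.DEVNULL)


def split_columns(line):
    """Split one -layout line into fields on runs of 2+ spaces (a heuristic)."""
    return [field for field in re.split(r"\s{2,}", line.strip()) if field]


def parse_layout_text(text):
    """Turn extracted text into a list of rows, skipping blank lines."""
    return [split_columns(line) for line in text.splitlines() if line.strip()]
```

Usage might look like calling pdf_to_text("report.pdf", "report.txt") and then passing the contents of report.txt to parse_layout_text (both file names are hypothetical). Words separated by a single space stay in the same field, so the two-space rule may need tuning for a given report layout.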

If you're scanning paper documents, consider a scanner like a ScanSnap, which will do a double-sided scan in under 1 second per page and generate a .pdf file with an embedded text version, which can be parsed too.
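Since pdftotext extracts text that is already embedded in a PDF rather than performing OCR on page images, it can help to check whether a given scan actually carries a text layer before trying to parse it. The following is a minimal sketch under that assumption; the function names and the 20-character threshold are hypothetical choices for illustration.

```python
import subprocess


def extract_text(pdf_path):
    """Return the text pdftotext finds; the '-' argument sends output to stdout."""
    result = subprocess.run(["pdftotext", "-layout", pdf_path, "-"],
                            check=True, capture_output=True, text=True)
    return result.stdout


def looks_like_real_text(text, min_chars=20):
    """Heuristic: True if the text has at least min_chars letters or digits."""
    return sum(ch.isalnum() for ch in text) >= min_chars
```

If looks_like_real_text(extract_text("scan.pdf")) comes back False (the file name is hypothetical), the PDF is probably image-only and would need a separate OCR step before it can be parsed.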
curiouswebster

I am not looking for a new printer at the moment. I tried to install ScanSnap, but it required the printer. What is your second choice for a Mac?

Your pdftotext conversion utility looks interesting. It would be most helpful for me to use the same OCR tool during my early research for this project as I will use when I deploy the website.

1) Is this free to use?
2) How do I install it on my Mac?
Joe Winograd

You're welcome. Good luck on the project!