Need an OCR tool

I need a tool I can use to digitize a report, like the one attached here...

Will this kind of report get a 100% successful conversion rate?

Eventually, I need the tool to be part of my website, but I have not yet chosen my back-end technology. For now, a simple Mac-based tool is fine, just so I can hand-convert a report and start using it while I program the back end.

Windows is okay if the free Mac options are limited.

I do have Office 365 (Mac) if there is a tool in there which I can use.

I am also interested in hearing what "plug-ins" can work when I deploy this to my website, for online OCR conversions.

David Favor

A simple way to do this is to convert the .pdf files (most statements come in .pdf format) to text so you can parse them, with something like this...

pdftotext -enc ASCII7 -nopgbrk -layout "$file" "$outfile" 2>/dev/null

pdftotext is part of the poppler-utils package on most Linux distributions; on a Mac, MacPorts provides it through its poppler port.
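As a rough sketch of how that command could be wrapped for a whole folder of statements, the function below converts every .pdf in one directory to a matching .txt file in another. The name `convert_statements` and the directory arguments are hypothetical. It assumes pdftotext is on the PATH and that the PDFs already contain a text layer, since pdftotext extracts embedded text and does not perform OCR itself.

```shell
#!/usr/bin/env bash
# Sketch: convert every PDF in a directory to layout-preserving text.
# convert_statements is a hypothetical helper name, not part of poppler.

convert_statements() {
    local src_dir="$1"
    local out_dir="$2"
    local file outfile
    local count=0

    # Fail early if pdftotext is not installed.
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; install poppler first" >&2
        return 1
    fi

    mkdir -p "$out_dir" || return 1

    for file in "$src_dir"/*.pdf; do
        # Skip the unexpanded glob when the directory has no PDFs.
        [ -e "$file" ] || continue
        outfile="$out_dir/$(basename "$file" .pdf).txt"
        # Double quotes let the shell expand the variables; the second
        # file argument tells pdftotext where to write the text.
        if pdftotext -enc ASCII7 -nopgbrk -layout "$file" "$outfile" 2>/dev/null; then
            count=$((count + 1))
        else
            echo "failed: $file" >&2
        fi
    done

    # Print how many files were converted.
    echo "$count"
}
```

Calling `convert_statements ./statements ./text` would write one .txt per .pdf into ./text and print the number of files converted; both directory names are placeholders.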

If you're scanning paper documents, consider a scanner like a ScanSnap, which will do a double-sided scan in <1 second/page and generate a .pdf file with an embedded text version, which can be parsed too.
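Whether a given .pdf already carries an embedded text version can be checked before trying to parse it. A minimal sketch, assuming pdftotext is installed: the hypothetical helper `has_text_layer` sends the extracted text to standard output (the `-` output argument) and succeeds only if at least one non-whitespace character comes back. An image-only scan would come back empty and would need an OCR step first.

```shell
#!/usr/bin/env bash
# Sketch: report whether a PDF yields any extractable text.
# has_text_layer is a hypothetical helper name.

has_text_layer() {
    local file="$1"
    # "-" sends the extracted text to stdout instead of a file;
    # grep -q succeeds as soon as it sees a non-whitespace character.
    pdftotext -nopgbrk "$file" - 2>/dev/null | grep -q '[^[:space:]]'
}

# Example use (statement.pdf is a placeholder file name):
#   if has_text_layer statement.pdf; then
#       echo "parse directly"
#   else
#       echo "needs OCR first"
#   fi
```

The function's exit status is that of grep, so it can be used directly in an `if` test as shown in the comment.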

curiouswebster

ASKER

I am not looking for a new scanner at the moment. I tried to install ScanSnap, but it required the scanner. What is your second choice for a Mac?

Your pdftotext conversion utility looks interesting. It would be most helpful for me to use the same OCR tool during my early research for this project as I would use when I deploy the website.

1) Is this free to use?
2) How do I install it on my Mac?

ASKER CERTIFIED SOLUTION

Joe Winograd

curiouswebster

ASKER

thanks

Joe Winograd

You're welcome. Good luck on the project!