OCR Printing of Old Typewritten Documents

How can I scan 60 to 90 year old typewritten manuscripts into my Windows 8.1 computer, using ABBYY FineReader 12 and an Epson 4630 printer/scanner, and get decent character recognition, which I am not getting now? Perhaps I am using the wrong term in "character recognition" because the scanned document is a duplication of the original, but the final product is awful. I am using 300 dpi, PDF, and grayscale as recommended. I have not used an OCR program before; I recently purchased the ABBYY software and the Epson 4630 specifically for the purpose of converting multiple unpublished manuscripts into printable documents. After many, many hours, I am getting nowhere!
johncurry Asked:

Joe Winograd, Fellow & MVE, Developer Commented:
Hi John,

You are using almost the correct term: instead of "character recognition", it is "optical character recognition". This is what you have in the title (OCR). But "OCR Printing" is not correct; you want something like "OCR when scanning".

OK, with terminology out of the way, let's talk about scanning 60-90 year old docs. Regardless of age, the accuracy of OCR depends heavily on the quality of the document. I would venture to guess that your 60-90 year old typewritten manuscripts are not of good quality.

ABBYY FineReader is an excellent OCR package – you have the latest V12 (I have V11). Your setting of 300DPI/grayscale (8-bit) is fine, although I do almost all scanning at 300DPI/black&white (monochrome/1-bit), which for most typical docs leads to accurate OCR. Occasionally, I'll use 300DPI/grayscale, and even less frequently, 600DPI/black&white.
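
To put those three settings in perspective, here is a rough sketch of how much raw image data each one produces per page. It assumes a US Letter page (8.5 x 11 inches), which is not stated anywhere in this thread, and it ignores whatever compression the PDF applies. `raster_size` is just an illustrative helper name.

```python
def raster_size(width_in, height_in, dpi, bits_per_pixel):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded up to a whole number of bytes, as most bitmap formats do.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return width_px, height_px, row_bytes * height_px


# Assumed page size in inches (US Letter); change it for other paper sizes.
PAGE = (8.5, 11.0)

SETTINGS = {
    "300DPI/grayscale (8-bit)": (300, 8),
    "300DPI/black&white (1-bit)": (300, 1),
    "600DPI/black&white (1-bit)": (600, 1),
}

for name, (dpi, bits) in SETTINGS.items():
    w, h, size = raster_size(PAGE[0], PAGE[1], dpi, bits)
    print(f"{name}: {w} x {h} pixels, {size / 1_000_000:.1f} MB uncompressed")
```

Under those assumptions, 300DPI/grayscale carries eight times the raw data of 300DPI/black&white, and doubling the resolution to 600DPI quadruples the pixel count.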

My guess (and it's just a guess) is that your old docs are too faded/light for good accuracy, although the problem may be the quality of your scanner. It is almost surely not a problem with ABBYY FineReader.

I suspect your manuscripts have private/sensitive information, but if you can find one innocuous page and post it as a PDF, I'll try to OCR it with many software packages that I have. It would be good to post three scanned versions: 300DPI/black&white, 300DPI/grayscale, 600DPI/black&white. Regards, Joe
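
If you want to compare the three versions yourself, one rough way is to hand-type a short passage from the page and score each OCR result against it. The sketch below uses Python's standard `difflib` module. The function names and the sample strings are made up for illustration; in practice you would paste in the text that FineReader exports for each scan.

```python
import difflib


def normalize(text):
    """Collapse runs of whitespace so line wrapping does not affect the score."""
    return " ".join(text.split())


def similarity(reference, ocr_text):
    """Return a 0.0 to 1.0 similarity between hand-typed text and OCR output."""
    matcher = difflib.SequenceMatcher(
        None, normalize(reference), normalize(ocr_text), autojunk=False
    )
    return matcher.ratio()


def rank_scans(reference, ocr_outputs):
    """Sort scan settings from best to worst match against the reference."""
    scores = {name: similarity(reference, text) for name, text in ocr_outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented sample text, not taken from any real manuscript or OCR run.
    reference = "The quick brown fox jumps over the lazy dog."
    ocr_outputs = {
        "300DPI/black&white": "The quick brown fox jumps over the lazy dog.",
        "300DPI/grayscale": "The qu1ck brovvn f0x jumps ovor tbe lazy d0g.",
    }
    for name, score in rank_scans(reference, ocr_outputs):
        print(f"{name}: {score:.3f}")
```

This is only a character-level similarity, not a formal OCR accuracy metric, but it is enough to see which scan setting gives the cleanest text for a given page.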

Update:
One other point. You said, "for the purpose of converting multiple unpublished manuscripts into printable documents." To be clear, the real purpose of OCR is to create text so that documents are searchable. If all you want is to be able to print them, you don't need OCR. You could scan to an image-only PDF (with just a raster image/bitmap/graphic) and print that. Regards, Joe

nobus Commented:
To add to the above: try scanning one document on different scanners, put the scans on a USB stick, and then OCR them on your PC to see if there are differences.
Some scanner brands include software enhancement of the image; you could search for one that has it.
Contact the main sellers, like HP.

You can also use software like this: http://www.stoik.com/products/business-solutions/sdk/document-image-enhancement-sdk.php
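
To give an idea of what image enhancement can mean for a faded page, here is a minimal sketch of one standard technique, Otsu's method, which picks the black/white cutoff from the page's own brightness histogram instead of using a fixed value. This is not what FineReader or the product linked above actually does internally. The image is assumed to be a plain list of rows of 0-255 gray values, and the tiny sample page is invented for illustration.

```python
def otsu_threshold(pixels):
    """Return the gray level that best separates dark ink from light paper."""
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor); larger is better.
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 0 (ink) if it is at or below the threshold, else 255."""
    return [[0 if value <= threshold else 255 for value in row] for row in pixels]


if __name__ == "__main__":
    # Invented 2 x 3 "page": faded ink at gray level 150 on paper at 220.
    faded = [[220, 220, 150],
             [150, 220, 220]]
    t = otsu_threshold(faded)
    print("threshold:", t)
    print(binarize(faded, t))
```

On a page like this, a fixed mid-gray cutoff of 128 would turn every pixel white and erase the text, while the histogram-based cutoff keeps the faded ink. That is one reason grayscale scans followed by careful thresholding can help with light originals.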
Joe Winograd, Fellow & MVE, Developer Commented:
Hi John,

I just realized that in my last post I didn't explain how to do an image-only scan in ABBYY FineReader. When you do the Save as PDF Document after scanning, click the Options button and you'll see this:

[Screenshot: ABBYY Save image only]
The default is Text under the page image, so you'll need to select Page image only to avoid the OCR step (all three Text... choices will result in OCR being done). That screenshot is from FR11, but I suspect that it's the same, or very similar, in FR12. Regards, Joe

nobus Commented:
You can also have them treated in a print shop, and ask for a warranty on the printing! (Ask for examples.)
hdhondt Commented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
Joe Winograd, Fellow & MVE, Developer Commented:
Hi hdhondt,
Thanks for the cleanup and the awarded solution — much appreciated! Regards, Joe