Solved

Best scanner software settings to get a good quality single-color (black) PDF rendering

Posted on 2014-04-17
Last Modified: 2014-04-17
Hello,

When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I am using the following:

        • Canon CanoScan LiDE 110 flatbed scanner
        • Canon MP Navigator EX 4.0 software

I set things up and then did a test scan using a single sheet of white paper containing black text. It scanned the paper but it seems to have taken longer than I would think it should take. Also, although the resulting file name is PDF (IMG.pdf), it seems more like a picture than a typical PDF doc that I'm used to (if that makes sense). For example, the white areas on the sheet (ie those with no type) have all of these gray-looking smudges which don't really exist on the original paper.

In case it's any help, the file size is 982 KB.

FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders. The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Thanks
Question by:WeThotUWasAToad

Accepted Solution

by: Joe Winograd, EE MVE
ID: 40007340
> When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I do virtually all of my scanning of typical business docs at 300 DPI, black and white (1-bit). This nearly always produces an image that is good enough for high-accuracy OCR (depending, of course, on the quality of the source doc and the capabilities of the OCR software).

> Also, although the resulting file name is PDF (IMG.pdf), it seems more like a picture than a typical PDF doc that I'm used to (if that makes sense).

It's almost surely an image-only PDF. Nothing is wrong with that; it is what one would expect of a scanned doc: a bitmap/graphic inside a PDF file. If you run OCR on it (either at scanning time or afterward), you can create a searchable image PDF, which contains both the scanned image (bitmap) and the searchable text created by the OCR process.

> For example, the white areas on the sheet (ie those with no type) have all of these gray-looking smudges which don't really exist on the original paper.

I don't have the Canon CanoScan LiDE 110, but the smudges may be due to a scanner that needs cleaning (for example, a dirty glass). They could also be due to a brightness setting that is too dark.
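
As a rough illustration of the brightness point (a simplified model I am assuming for the example, not how the Canon driver actually works): in a 1-bit scan, every pixel is forced to pure black or pure white by comparing its gray level against a cutoff. The Python sketch below uses a hypothetical `to_one_bit` function and made-up pixel values to show how a darker setting can turn a faint background smudge into a black speck, while a lighter one drops it.

```python
def to_one_bit(gray_row, cutoff):
    """Convert 8-bit gray values (0 = black, 255 = white) to 1-bit pixels.

    Pixels darker than the cutoff become black (0); the rest become white (1).
    In this toy model, a higher cutoff stands in for a darker scan setting.
    """
    return [0 if value < cutoff else 1 for value in gray_row]


# Made-up sample row: white paper (250), faint smudge (200), black text (30).
row = [250, 200, 30, 250]

normal = to_one_bit(row, cutoff=128)    # smudge is lighter than the cutoff
too_dark = to_one_bit(row, cutoff=220)  # smudge is darker than the cutoff

print(normal)    # [1, 1, 0, 1] -> smudge becomes white and disappears
print(too_dark)  # [1, 0, 0, 1] -> smudge becomes a black speck
```

If the scan is made in grayscale instead, no cutoff is applied and the smudge stays in the file as a gray area, which would match what you are seeing.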

> In case it's any help, the file size is 982 KB.

A typical page scanned at 300 DPI in B&W (1-bit) with typical compression should be about 50 KB. At 982 KB, your file was most likely scanned at more than 300 DPI, in grayscale or color rather than B&W, without compression, or some combination of these. I don't use the Canon MP Navigator EX 4.0 software, but check those settings in there.
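
To put rough numbers on this, here is a back-of-the-envelope sketch in Python (my own illustration, nothing taken from the Canon software). It assumes a US Letter page of 8.5 x 11 inches and computes the size of the raw, uncompressed bitmap at different bit depths; the function name `raw_scan_bytes` is made up for the example.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Size in bytes of an uncompressed scan, with each row padded to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed page size: US Letter, 8.5 x 11 inches, scanned at 300 DPI.
for label, bits in [("B&W (1-bit)", 1), ("grayscale (8-bit)", 8), ("color (24-bit)", 24)]:
    size = raw_scan_bytes(8.5, 11, 300, bits)
    print(f"{label}: {size:,} bytes uncompressed")

# B&W (1-bit): 1,052,700 bytes uncompressed
# grayscale (8-bit): 8,415,000 bytes uncompressed
# color (24-bit): 25,245,000 bytes uncompressed
```

Under those assumptions, a 1-bit page is about 1 MB before compression, so a file of roughly 50 KB implies compression on the order of 20:1. Grayscale and color pages start at about 8 MB and 25 MB, which is why they tend to stay large even after compression.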

> FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders.

Again, I don't use the Canon MP Navigator EX 4.0 software, but any decent imaging/scanning package will let you do that.

> The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Good-quality OCR software will (usually) have no problem with 300 DPI B&W images.

To learn more about scanning settings, I recommend Wayne Fulton's excellent site, "A few scanning tips":
http://www.scantips.com/

For OCR tips, look at this section of his site:
http://www.scantips.com/basics04.html

Regards, Joe
 

Author Closing Comment

by:WeThotUWasAToad
ID: 40007590
Great response. Thanks!

Expert Comment

by:Joe Winograd, EE MVE
ID: 40007626
You're welcome. Happy to help. Regards, Joe