Best scanner software settings to get a good quality single-color (black) PDF rendering

Hello,

When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I am using the following:

        • Canon CanoScan LiDE 110 flatbed scanner
        • Canon MP Navigator EX 4.0 software

I set things up and then did a test scan using a single sheet of white paper containing black text. It scanned the paper, but it seemed to take longer than I would expect. Also, although the resulting file is named as a PDF (IMG.pdf), it seems more like a picture than the typical PDF docs I'm used to (if that makes sense). For example, the white areas on the sheet (i.e., those with no type) have gray-looking smudges which don't exist on the original paper.

In case it's any help, the file size is 982 KB.

FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders. The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Thanks
Asked by: WeThotUWasAToad
 
Joe Winograd (Fellow & MVE, Developer) commented:
> When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I do virtually all of my scanning of typical business docs at 300DPI, Black&White (1-bit). This nearly always produces an image that is good enough to perform high-accuracy OCR (depends, of course, on the quality of the source doc and the capabilities of the OCR software).

> Also, although the resulting file name is PDF (IMG.pdf), it seems more like a picture than a typical PDF doc that I'm used to (if that makes sense).

It's almost surely a PDF image file. Nothing wrong with that, as that is what one would expect of a scanned doc — it is a bitmap/graphic inside a PDF file. If you run OCR on it (either at scanning time or after scanning), you could create a PDF searchable image file, which has both the scanned image (bitmap) and the (searchable) text created by the OCR process.

> For example, the white areas on the sheet (ie those with no type) have all of these gray-looking smudges which don't really exist on the original paper.

I don't have the Canon CanoScan LiDE 110, but the smudges may be due to a scanner that needs cleaning (for example, a dirty glass). They could also be due to a brightness setting that is too dark when scanning.
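
To illustrate the brightness point, here is a minimal sketch of how a 1-bit (black-and-white) conversion might treat faint gray smudges. It assumes a simple global threshold, which is only an approximation of what scanning software does; the function name `to_one_bit`, the threshold values, and the sample pixel values are all hypothetical.

```python
def to_one_bit(gray_row, threshold=128):
    """Map 8-bit gray values (0 = black, 255 = white) to 1-bit (0 = black, 1 = white)."""
    return [1 if value >= threshold else 0 for value in gray_row]

# Hypothetical sample row: clean paper (250), black text (20), faint smudges (200, 215)
row = [250, 20, 250, 200, 215, 250]

print(to_one_bit(row))                 # [1, 0, 1, 1, 1, 1] -> smudges become white
print(to_one_bit(row, threshold=220))  # [1, 0, 1, 0, 0, 1] -> a "darker" setting turns smudges black
```

In this simplified model, a lighter setting drops the smudges while keeping the text, which is one reason a 1-bit scan of a text page can look cleaner than a grayscale one.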

> In case it's any help, the file size is 982 KB.

A typical page scanned at 300DPI/B&W (1-bit) with typical compression should be about 50KB. At 982KB, your file was scanned at higher than 300DPI and/or in grayscale or color rather than B&W and/or without compression. I don't use the Canon MP Navigator EX 4.0 software, but check the resolution, color mode, and compression settings in there.
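
As a rough check on those numbers, the uncompressed size of a scanned page can be estimated from its dimensions, resolution, and bit depth. The sketch below assumes a US Letter page (8.5 x 11 inches) and 1 KB = 1024 bytes; the function name `uncompressed_kb` is just for illustration.

```python
def uncompressed_kb(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed image size in KB (1 KB = 1024 bytes)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits / 8 / 1024

# Page size (US Letter) is an assumption; adjust for A4 or other paper.
modes = [("B&W (1-bit)", 1), ("Grayscale (8-bit)", 8), ("Color (24-bit)", 24)]
for label, bits in modes:
    kb = uncompressed_kb(8.5, 11, 300, bits)
    print(f"{label}: about {kb:,.0f} KB uncompressed")
```

Under these assumptions, a 300DPI 1-bit page comes to roughly 1,000 KB before compression, so a 982KB file would be consistent with an uncompressed B&W scan, although a compressed grayscale or color scan could land in the same range.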

> FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders.

Again, I don't use the Canon MP Navigator EX 4.0 software, but any decent imaging/scanning package will let you do that.

> The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Good quality OCR will (usually) have no problem with 300DPI/B&W images.

To learn more about scanning settings, I recommend Wayne Fulton's excellent site, "A few scanning tips":
http://www.scantips.com/

For OCR tips, look at this section of his site:
http://www.scantips.com/basics04.html

Regards, Joe
 
WeThotUWasAToad (Author) commented:
Great response. Thanks!
 
Joe Winograd (Fellow & MVE, Developer) commented:
You're welcome. Happy to help. Regards, Joe
Question has a verified solution.
