Solved

Best scanner software settings to get a good quality single-color (black) PDF rendering

Posted on 2014-04-17
Last Modified: 2014-04-17
Hello,

When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I am using the following:

        • Canon CanoScan LiDE 110 flatbed scanner
        • Canon MP Navigator EX 4.0 software

I set things up and then did a test scan using a single sheet of white paper containing black text. It scanned the paper, but it seems to have taken longer than I would expect. Also, although the resulting file name is PDF (IMG.pdf), it seems more like a picture than a typical PDF doc that I'm used to (if that makes sense). For example, the white areas on the sheet (i.e., those with no type) have all of these gray-looking smudges which don't really exist on the original paper.

In case it's any help, the file size is 982 KB.

FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders. The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Thanks
Question by:WeThotUWasAToad
LVL 51

Accepted Solution

by: Joe Winograd, EE MVE
ID: 40007340
> When using scanning software, what are the best settings to get a good quality single-color (black) PDF rendering?

I do virtually all of my scanning of typical business docs at 300 DPI, Black & White (1-bit). This nearly always produces an image that is good enough for high-accuracy OCR (depending, of course, on the quality of the source doc and the capabilities of the OCR software).

> Also, although the resulting file name is PDF (IMG.pdf), it seems more like a picture than a typical PDF doc that I'm used to (if that makes sense).

It's almost surely a PDF image file. Nothing wrong with that; it is what one would expect of a scanned doc: a bitmap/graphic inside a PDF file. If you run OCR on it (either at scanning time or afterward), you can create a PDF searchable image file, which contains both the scanned image (bitmap) and the searchable text produced by the OCR process.
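As a rough way to tell the two kinds of PDF apart without opening them, the sketch below scans the raw bytes for a `/Font` marker. It assumes that a text layer added by OCR needs a font entry and that an image-only scan usually has none. The function name `has_font_resource` and the two byte strings are made up for illustration (they are fragments, not complete PDFs), and the check is only a heuristic.

```python
def has_font_resource(pdf_bytes: bytes) -> bool:
    """Rough check: True if the raw PDF bytes mention a /Font resource.

    Assumption: a text layer added by OCR needs a font entry, while an
    image-only scan usually has none. PDFs that keep their objects in
    compressed streams can hide the marker, so False is not conclusive.
    """
    return b"/Font" in pdf_bytes


# To test a real file (file name taken from the question):
# with open("IMG.pdf", "rb") as f:
#     print(has_font_resource(f.read()))

# Illustrative fragments only, not valid PDF files.
image_only = b"%PDF-1.4\n1 0 obj << /Type /XObject /Subtype /Image >> endobj"
with_text = b"%PDF-1.4\n1 0 obj << /Type /Font /Subtype /Type1 >> endobj"
print(has_font_resource(image_only), has_font_resource(with_text))
```

A simpler manual test is to open the PDF and try to select or search for a word; if nothing can be selected, the file is most likely image-only.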

> For example, the white areas on the sheet (i.e., those with no type) have all of these gray-looking smudges which don't really exist on the original paper.

I don't have the Canon CanoScan LiDE 110, but that may be due to a lack of proper maintenance/cleaning of the scanner. It could also be due to a brightness setting that is too dark when scanning.
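To illustrate why the brightness setting matters in B&W (1-bit) mode, here is a minimal sketch of thresholding: every pixel at or above a cutoff becomes white and everything below becomes black. The pixel values and both threshold numbers are invented for the example; the Canon software's actual brightness control may work differently.

```python
def to_black_and_white(gray_pixels, threshold=128):
    """Map 8-bit grayscale values (0 = black, 255 = white) to 1-bit (0 = black, 1 = white)."""
    return [1 if value >= threshold else 0 for value in gray_pixels]


# Hypothetical row of pixels: faint gray background noise around three dark text pixels.
row = [250, 231, 214, 240, 35, 20, 48, 228, 205, 252]

# Mid-range cutoff: the faint background noise all becomes white.
print(to_black_and_white(row))

# Cutoff set too high (scan too dark): some background noise turns into black specks.
print(to_black_and_white(row, threshold=220))
```

With the default cutoff only the three dark text pixels come out black. With the higher cutoff, the background values 214 and 205 also turn black, which is the kind of speckling a too-dark setting can produce.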

> In case it's any help, the file size is 982 KB.

A typical page scanned at 300 DPI, B&W (1-bit), with typical compression should be about 50 KB. A size of 982 KB suggests that the scan is at a higher resolution than 300 DPI and/or not B&W and/or not compressed. I don't use the Canon MP Navigator EX 4.0 software, but check the settings in there.
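For a sense of scale, the arithmetic behind these sizes can be sketched as follows. The page size (US Letter, 8.5 x 11 in) is an assumption, the helper name `raw_scan_bytes` is made up for the example, and the figures are for uncompressed pixel data only (no PDF overhead, no row padding).

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in bytes for a page scanned at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: US Letter (8.5 x 11 in) at 300 DPI.
for label, bits in [("B&W 1-bit", 1), ("grayscale 8-bit", 8), ("color 24-bit", 24)]:
    size = raw_scan_bytes(8.5, 11, 300, bits)
    print(f"{label}: {size / 1024:.0f} KB uncompressed")
```

Under these assumptions a 1-bit page is about 1,027 KB before compression, so a 50 KB file implies roughly 20:1 compression. A 982 KB file is in the range of an uncompressed 1-bit page, but it could just as well be a compressed grayscale or color scan; the arithmetic alone cannot tell which, so the scan mode and compression settings are the things to check.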

> FYI, my main objective is to have settings which enable an assistant to scan a variety of documents (like stuff that comes in the mail) quickly and in a way that they can be filed as simple digital items in individual folders.

Again, I don't use the Canon MP Navigator EX 4.0 software, but any decent imaging/scanning package will let you do that.

> The majority of the items will never need to be accessed but I would like them to be OCR-able in order to facilitate searches.

Good-quality OCR software will (usually) have no problem with 300 DPI/B&W images.
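One way to see why resolution matters for OCR is to work out how many pixels a line of type gets. The sketch below uses the standard definition of a point as 1/72 inch; the 10-point type size and the list of resolutions are illustrative assumptions, and the result is the nominal type size, not the height of any particular letter.

```python
def type_height_px(point_size, dpi):
    """Nominal type height in pixels: one point is 1/72 inch."""
    return point_size / 72 * dpi


# Assumed example: 10-point body text at a few scan resolutions.
for dpi in (75, 150, 300):
    print(f"{dpi} DPI: {type_height_px(10, dpi):.1f} px")
```

At 300 DPI, 10-point type spans about 42 pixels, compared with about 10 pixels at 75 DPI, so far more detail is available to the OCR engine. How much a given OCR package actually needs depends on the software.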

To learn more about scanning settings, I recommend Wayne Fulton's excellent site, "A few scanning tips":
http://www.scantips.com/

For OCR tips, look at this section of his site:
http://www.scantips.com/basics04.html

Regards, Joe

Author Closing Comment

by:WeThotUWasAToad
ID: 40007590
Great response. Thanks!
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 40007626
You're welcome. Happy to help. Regards, Joe