What DPI setting should be selected when scanning a document to a PDF file?

Hello Everyone,

When using HP Printer Assistant with the HP Deskjet 2543, what DPI setting should I choose when scanning a document to a PDF file? The default setting is 200 dpi. I wish to scan documents and save them as PDF files, copy the contents of each PDF file, and paste them into WordPress. Since these are text documents, I think I should select either Greyscale or Black and White as the Output type, as opposed to Color. Please feel free to correct me if I am wrong on that part.

Any feedback on this question will be appreciated.

Thank you

ste5an (Senior Developer) commented:
Choose whatever DPI setting suits you. I believe the usual setting is 300 DPI, which also gives good resolution for pictures.

Whether you use grayscale or color depends on your documents.

But read the manual carefully. Some scanners use an optimization that reuses image tiles, which may result in incorrectly digitized documents.
Joe Winograd (Developer) commented:
Hi George,

I worked in the high-end document management/imaging business for 20 years (million dollar solutions for Fortune Global 500 companies), and I can tell you that for most typical business documents, the customers scanned at 300DPI/black&white(1-bit). I've also been utilizing desktop document management/imaging for my personal use for more than 20 years, using the same 300DPI/B&W settings for most documents.

In the early years, I scanned to TIFF files (or files proprietary to the scanning software), while in more recent years, all my scanning is to PDF files, and nearly always to PDF Searchable Image files. Those are PDFs that have both a raster image and the text from OCR. The OCR makes them searchable, which is really important to me, and critical for your purpose of copying the text into WordPress.

OCR is usually accurate with 300DPI/B&W docs. Occasionally I'll scan at 600DPI, but, counter-intuitively, 600DPI often results in less accurate OCR than 300DPI. I'll also occasionally scan at 200-300DPI/grayscale(8-bit). You should experiment with your documents to see what settings create the most accurate OCR (see additional comment below about experimenting).

I suggest that you take a look at Wayne Fulton's excellent site, "A few scanning tips". In particular, read his OCR tips carefully. Note, especially, the paragraph in yellow (a portion of which is copied here under "Fair Use"):
"Most OCR software will want to scan at 300 dpi in line art mode, and line art is faster too."
He goes on to say, "Do experiment." As alluded to above, I strongly agree with that!
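One way to make that experimenting concrete is to type out a short passage from the document by hand, run OCR on scans made at different settings, and score each result against the typed reference. The sketch below is only an illustration: it uses Python's standard `difflib`, the helper name `ocr_similarity` is hypothetical, and the sample OCR strings are made up for demonstration, not real scanner output.

```python
from difflib import SequenceMatcher


def ocr_similarity(reference: str, ocr_text: str) -> float:
    """Return a 0..1 character-level similarity between two strings."""
    # Collapse runs of whitespace so line-wrap differences are not counted as errors.
    ref = " ".join(reference.split())
    got = " ".join(ocr_text.split())
    return SequenceMatcher(None, ref, got).ratio()


# Hand-typed reference text (hypothetical sample).
reference = "Scanning at 300 DPI in line art mode"

# Made-up OCR outputs keyed by scan setting; replace with your own OCR text.
results = {
    "200dpi_bw": "Scanning at 3OO DPl in line art rnode",
    "300dpi_bw": "Scanning at 300 DPI in line art mode",
}

scores = {name: ocr_similarity(reference, text) for name, text in results.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items()):
    print(f"{name}: {score:.3f}")
print("closest to reference:", best)
```

A higher score means the OCR text is closer to the reference. This is a rough similarity measure, not a formal OCR error rate, but it is usually enough to rank a handful of settings against each other.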

I hope this helps, and if you have any other questions about scanning your documents, I'll be happy to give you my thoughts on it. Regards, Joe

Dave Baldwin (Fixer of Problems) commented:
To reinforce Joe's comments, the original scan is an image, not text.  The image must be OCR'd to become text.
Mal Osborne (Alpha Geek) commented:
200 DPI is fine for documents with clear text of a reasonable size. A page of 10 or 12 point laser print will be perfectly legible. 200 DPI is the same resolution as a fax sent in "fine" mode.

If you have documents with fine print, small logos that need to be read, tables with numbers in 8 point, or other fine features, then 300 or 400 might be called for.
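To see why small print benefits from a higher setting, it can help to estimate how many pixels a line of type gets at each resolution. The sketch below assumes only that one typographic point is 1/72 inch; the function name `font_height_px` is hypothetical, and the figure is the nominal font size in pixels (actual letter shapes are somewhat smaller than that).

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def font_height_px(point_size: float, dpi: int) -> float:
    """Nominal font size converted from points to scanned pixels."""
    return point_size / POINTS_PER_INCH * dpi


for pt in (8, 10, 12):
    for dpi in (200, 300, 400):
        print(f"{pt} pt at {dpi} DPI: about {font_height_px(pt, dpi):.0f} px")
```

By this estimate, 8 point type at 200 DPI gets roughly 22 pixels of height, while the same type at 300 DPI gets roughly 33, which leaves more detail for fine strokes.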

Similarly, if the documents are black and white (or dark and light), then 1-bit colour is fine. If you need shades of grey or colour to produce a legible scan, then you will have to select that.
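The trade-off between bit depth and resolution also shows up in the amount of image data per page. The following is a rough sketch that assumes a US Letter page (8.5 x 11 inches) and computes the uncompressed size only; the function name `raw_scan_bytes` is hypothetical, row padding is ignored, and a real PDF will be much smaller because the image is compressed.

```python
def raw_scan_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed image size in bytes for a page scanned at the given DPI."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed page size: US Letter, 8.5 x 11 inches.
modes = {"black and white (1-bit)": 1, "greyscale (8-bit)": 8, "colour (24-bit)": 24}

for dpi in (200, 300):
    for label, bits in modes.items():
        size_mb = raw_scan_bytes(8.5, 11, dpi, bits) / 1_000_000
        print(f"{dpi} DPI, {label}: about {size_mb:.1f} MB uncompressed")
```

At 300 DPI, the 1-bit page works out to about 1 MB of raw data, against about 8.4 MB for greyscale and about 25 MB for colour, which is one reason 1-bit scanning is common for plain text.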
Joe Winograd (Developer) commented:
My opinion is not to use 200DPI/B&W for OCR. You will get more accurate results with 300DPI/B&W. Indeed, 200DPI is what a "fine" fax is and that's one reason for getting a lot of OCR errors when performing OCR on a fax (even a "fine" one; and, of course, it's worse on a "standard" fax). Regards, Joe
GMartin (Author) commented:
Hello Everyone,

Thank you for your suggestions. At this point, I have decided to upgrade my hardware by purchasing a printer/scanner/copier with an automatic document feeder, as indicated in one of my previously closed posts. In the meantime, I will experiment with my HP Deskjet 2543, which only has a flatbed, to get a better idea of the DPI settings and their impact on the resolution of a scanned text document.

Once again, thanks everyone for your help :-) I will create a new post if any further questions or concerns come up.

Joe Winograd (Developer) commented:
Hi George,
That is an excellent decision to buy a scanner with an ADF. And also a good idea to experiment with DPI (and other settings) on your 2543. Regards, Joe