Solved

PDF Document Scanner & Text Search Software

Posted on 2011-09-20
Last Modified: 2012-05-12
Hi, I would like to know what the cheapest software is for scanning up to 250 pages of a hard-copy textbook and converting them to a PDF document. I then need to be able to search electronically for any text once it has been converted to PDF.
Question by:FrankSasso
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 36570755
I've been using PaperPort for more than 15 years:

http://www.nuance.com/for-individuals/by-product/paperport/index.htm

The latest version is PP14, which just came out on 2-Aug. The main enhancement is cloud support, which you probably don't need. The new version is fairly expensive, but you can get the previous version, which is 12 (yes, they were superstitious and skipped 13), as a download at Newegg for $39.99:

http://www.newegg.com/Product/Product.aspx?Item=N82E168168677800SF

It can automatically make PDF Searchable Image files, meaning that it automatically invokes built-in OCR to create a layer of text (searchable!) which resides in the PDF file along with the scanned image. You can then search it with the All-in-One Search that is built into PaperPort or with any search engine that can index PDFs with text, such as dtSearch (not free), Google Desktop Search (free), X1 (not free), or Windows Search 4 (free).
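To make the "searchable" part concrete, here is a minimal sketch of how a search tool might index the OCR text layer once it has been extracted page by page. It is only an illustration: the `sample_pages` data and the `build_index` and `search` helpers are hypothetical names invented for this example, not part of PaperPort or any of the search engines mentioned above, and it assumes the OCR text is already available as plain strings.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the set of page numbers where it appears."""
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_number)
    return index


def search(index, query):
    """Return the sorted page numbers that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # Intersect the page sets so that only pages matching all words remain.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Hypothetical OCR output, keyed by page number (an assumption for illustration).
sample_pages = {
    1: "Chapter One: Introduction to Optics",
    2: "Lenses bend light; optics studies this behaviour.",
    3: "Chapter Two: Mirrors",
}

index = build_index(sample_pages)
print(search(index, "optics"))
print(search(index, "chapter mirrors"))
```

The search is case-insensitive and matches whole words only, so OCR misreads on a page will cause misses; real indexers are considerably more forgiving.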

Btw, the Newegg download is likely to be 12.0. Do not install that. Instead, read my EE article on how to upgrade to 12.1 (free!):

http://www.experts-exchange.com/Web_Development/Document_Imaging/A_6537-PaperPort-Upgrade-How-to-download-and-install-updated-versions-of-PaperPort-11-and-12.html

Regards, Joe
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 36570762
I should have added as a disclaimer that I have no affiliation with this company and no financial interest in it whatsoever. I am simply a happy user/customer. Regards, Joe
 

Author Comment

by:FrankSasso
ID: 36570853
Hi Joe, thanks for the information. However, I'm assuming that I need to buy some type of scanner to use this software?

 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 36570953
Hi Frank,
Based on your question, I assumed that you already have a scanner and are looking for software that is capable of creating a PDF with searchable text. Is that right? If so, what scanner do you have? If not, then you'll need both hardware (scanner) and software (which often comes bundled with scanners, but isn't always robust, and many times can't create searchable PDFs – hence the need for a third-party package). Regards, Joe
 

Author Comment

by:FrankSasso
ID: 36571061
Hi Joe, I have a Brother MFC-7340, which is fine when I'm scanning a small number of documents. But because I intend to scan bound hard-copy books rather than loose-leaf pages, I may need to look at one of those handheld scanners that you just pass over the page. What do you think?
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 36571153
Frank,

I have a Brother MFC-3820CN, MFC-7820N, and MFC-9840CDW, so I know exactly what you mean – doing a book on the flatbed is no picnic! I looked into book scanners a while ago. My favorite is the TREVENTUS ScanRobot:

http://www.treventus.com/bookscanner_pageturner.html

But I couldn't afford the $100,000 for it. :)  You must look at the videos for this thing – they will knock your socks off:

http://www.treventus.com/products/scanrobotr-20-mds/videos.html

The five videos total nine minutes – trust me, it's worth it!

Now, back to reality. After my wife decided that the ScanRobot was not the best way to spend our life savings, I looked into scanning services that specialize in books. Here are a couple I found (I'm sure there are plenty of others):

http://bookscanning.com/
http://www.blueleaf-book-scanning.com/index.html

Even the TREVENTUS folks have a "Scanservice":

http://www.treventus.com/scanservice.html

I also found a really interesting site called Do-It-Yourself Book Scanning:

http://www.diybookscanner.org/

In the end, my books are still sitting on the shelves, un-scanned, so I can't give you any wisdom on what worked and what didn't.

Of course, if you're willing to destroy the original book (not usually the case), you can remove the binding and then put the pages through an ADF. Great, but only if you don't care about the original book. As for your idea of passing a hand scanner over each page of the book, I think that may be as painful as using a flatbed.

Regards, Joe
 
LVL 53

Accepted Solution

by:
Joe Winograd, EE MVE earned 125 total points
ID: 36574849
Frank,

One other thing. If you really want to give a hand scanner a try, the VuPoint Solutions Magic Wand Portable Scanner (PDS-ST410-VP) looks interesting and is relatively inexpensive:

http://www.amazon.com/VuPoint-Solutions-Portable-Scanner-PDS-ST410-VP/dp/B002R0BFAA

But I'm still having a tough time wrapping my head around this technique for a several-hundred-page book. Regards, Joe
Question has a verified solution.