Solved

PDF mega scan and auto rename

Posted on 2012-03-23
Last Modified: 2012-04-02
I would like to know whether, and how, I can scan 4,000 completed hard-copy forms to PDF in one large batch scan. The forms look like this:

https://docs.google.com/file/d/0B9Ga3bzjO-rUVUVOZnhyanpTU213WWdyQThXTmZRZw/edit

The goal is to end up with 4,000 individual PDFs, each automatically named by its record number (e.g., record number 22 becomes 22.pdf).
Question by:K_Deutsch
 
LVL 53

Accepted Solution

by:
Joe Winograd, EE MVE earned 500 total points
ID: 37759008
OK, me again. :)   And IrfanView again. :)

File>Select Scan/TWAIN Source...

Pick your scanner/driver

File>Acquire/Batch scanning

Select the <Multiple images (Batch Mode)> button

Set <Output file name> to blank

Set starting counter to 1

Set increment to 1

Set number of digits to 4

Set <Destination directory> to whatever you want

Set <Save as> to PDF

It should look like this:
(Screenshot: IrfanView-multi-image-scan)

Click the Options button

Click General tab

For the <Preview of PDF during save operation> option, select <not needed>

Make sure <Save all pages from original image> is checked and <Open PDF after saving> is not checked

It should look like this:
(Screenshot: IrfanView-preview-not-needed)

Click OK, OK, and your TWAIN or WIA scanning dialog will appear

Perform the scan

IrfanView will create the 4,000 xxxx.pdf files in the folder you chose. If the scan gets interrupted and you need to restart, you can pick up where you left off by setting the <Starting counter> to whatever you need in the <Acquire/Batch Scanning> screen shown above. When you're done scanning, IrfanView will ask if you want to save the image changes:
(Screenshot: IrfanView-save-image-changes-say-NO)

Say NO! The xxxx.pdf files have already been saved. Regards, Joe
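If the scan does get interrupted, the value to enter for <Starting counter> is one more than the highest number already saved. Here is a minimal Python sketch for working that out, assuming the destination folder contains the four-digit xxxx.pdf files described above. The function name `next_starting_counter` is made up for illustration; it is not part of IrfanView.

```python
import re
from pathlib import Path


def next_starting_counter(folder, digits=4):
    """Return the counter to resume at, based on the NNNN.pdf files already in folder."""
    # Only names made of exactly `digits` digits plus .pdf count as saved scans.
    pattern = re.compile(rf"^\d{{{digits}}}\.pdf$", re.IGNORECASE)
    numbers = [
        int(path.stem)
        for path in Path(folder).iterdir()
        if path.is_file() and pattern.match(path.name)
    ]
    # An empty folder means nothing has been scanned yet, so start at 1.
    return max(numbers, default=0) + 1
```

For example, a folder holding 0001.pdf through 0137.pdf would give 138 as the counter to resume with.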
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759070
Btw, this assumes that the record numbers are in order. It is simply naming the records from <0001.pdf> to <4000.pdf> as the scanning occurs. In other words, this process is not reading the content, i.e., it is not OCR'ing the record number and putting it in the file name. That's a whole different level of difficulty! Regards, Joe
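Since the question asked for names like 22.pdf rather than 0022.pdf, the leading zeros can be stripped afterwards. This is only a sketch, and it only makes sense under the assumption stated above: the forms were scanned in record-number order with none missing, so the counter equals the record number. The helper names `plan_unpadded_names` and `apply_plan` are hypothetical.

```python
from pathlib import Path


def plan_unpadded_names(folder):
    """Pair each zero-padded NNNN.pdf with its unpadded name (0022.pdf -> 22.pdf)."""
    plan = []
    for source in sorted(Path(folder).glob("*.pdf")):
        if not source.stem.isdigit():
            continue  # ignore anything that is not a counter-named scan
        target = source.with_name(f"{int(source.stem)}.pdf")
        # Skip files that need no change and never overwrite an existing file.
        if target != source and not target.exists():
            plan.append((source, target))
    return plan


def apply_plan(plan):
    """Carry out the renames produced by plan_unpadded_names."""
    for source, target in plan:
        source.rename(target)
```

Reviewing the returned plan before calling `apply_plan` is a cheap safeguard, and working on a copy of the folder is safer still.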
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759081
My comment above said, "When you're done scanning, IrfanView will ask if you want to save the image changes:". That's not quite true. It will ask you that when you exit IrfanView. In any case, say NO!

 

Author Comment

by:K_Deutsch
ID: 37759082
I feel bad because my explanation has been poor and incomplete. What is really happening here is that we are sending out a total of 4,000 "response requested" type forms, each with a unique record number. We of course won't get all 4,000 back. After scanning in the returned forms, I want an automated process that recognizes the record number on each one and renames the PDF accordingly.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759345
Ah, as I indicated, whole different animal! :)   You're going to need a scanning/OCR package capable of OCR'ing that portion of the form and then saving the scanned document with the file name that was OCR'ed. Two excellent OCR packages which can probably do it (but I can't personally attest to it) are ABBYY FineReader and Nuance's OmniPage:

http://www.abbyy.com/
http://www.nuance.com/for-business/by-product/omnipage/index.htm

I'll give it some more thought, but start with those two. This is not going to be as simple as your first question. :)   Regards, Joe
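Whichever OCR package produces the text, the renaming half of the job is small. Below is a rough Python sketch of that half only. It assumes the OCR text for each form contains a label such as "Record Number: 22"; that is a guess about the form, so the pattern will need adjusting to the real layout. The names `record_number_from_text` and `target_name` are made up for illustration and do not come from FineReader or OmniPage.

```python
import re

# Assumed label wording; adjust to whatever actually appears on the form.
RECORD_PATTERN = re.compile(
    r"\brecord\s*(?:number|no\.?|#)?\s*[:\-]?\s*(\d+)", re.IGNORECASE
)


def record_number_from_text(ocr_text):
    """Return the record number found in the OCR text, or None if there is none."""
    match = RECORD_PATTERN.search(ocr_text)
    return int(match.group(1)) if match else None


def target_name(ocr_text, fallback):
    """Return '<record number>.pdf', or the fallback name when OCR found no number."""
    number = record_number_from_text(ocr_text)
    return f"{number}.pdf" if number is not None else fallback
```

Keeping the original counter name as the fallback means forms that OCR badly stay easy to find and rename by hand.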
 

Author Comment

by:K_Deutsch
ID: 37778655
I have not abandoned this. The software products you mentioned are too large in scope, I think, though I did get a trial of FineReader. I may poke around in that, but beyond that, we are currently using KnowledgeLake Capture. Onsite IT folks say it may be the answer, but I may not have time to wait on them.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 37778929
Two comments:

1. I don't know much about KnowledgeLake Capture, but I do know that it has some type of advanced OCR data extraction that may be able to do what you want with the current form.

2. Unless #1 turns out to be easy, or at least doable, my suggestion is to make a new form for future mailings. (If #1 is not doable, then you'll have to handle manually the forms that have already been mailed.) The new form should have a barcode on it that encodes the unique document number of each form. KnowledgeLake Capture supports barcode recognition, both as a document separator and in advanced capture. I can't be certain of this, having never used it or seen the manual, but it is very likely that KnowledgeLake Capture's barcode recognition capabilities can do what you need, i.e., recognize the unique number in each barcode and save that page as a separate PDF file, with the file name being the number in the barcode.
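To illustrate what "barcode as a document separator" means, here is a small Python sketch of the grouping logic only. It is not KnowledgeLake Capture's API, which I have not seen. It assumes some capture tool has already decoded each scanned page and reports either the barcode value or nothing for that page. The function name `group_pages_by_barcode` and the "unidentified.pdf" bucket are hypothetical.

```python
def group_pages_by_barcode(page_barcodes):
    """Map output file names to the page indexes that belong in each file.

    page_barcodes holds one entry per scanned page: the decoded barcode
    value as a string, or None when the page carries no barcode.
    """
    documents = {}
    current = None
    for index, value in enumerate(page_barcodes):
        if value is not None:
            current = f"{value}.pdf"  # a barcode starts a new document
        # Pages scanned before any barcode is seen go into a catch-all bucket.
        key = current if current is not None else "unidentified.pdf"
        documents.setdefault(key, []).append(index)
    return documents
```

With one-page forms that each carry a barcode, every page simply ends up in its own file named after the number in the barcode.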

Regards, Joe
 

Author Closing Comment

by:K_Deutsch
ID: 37799314
The answer I picked as the accepted solution is based on my original question, which was vague. The solutions to my clarified question are out of reach for what has to be either a quick hit-and-run project for me or nothing at all. Thanks, Joe!
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 37799380
You're welcome! I hope you can achieve what you want with KnowledgeLake Capture; it might be possible. Good luck! Regards, Joe