Solved

PDF mega scan and auto rename

Posted on 2012-03-23
Last Modified: 2012-04-02
I am interested to know if and how I could scan 4,000 completed hard copy forms to PDF (all in one mega-scan) that look like this...

https://docs.google.com/file/d/0B9Ga3bzjO-rUVUVOZnhyanpTU213WWdyQThXTmZRZw/edit

and end up with 4,000 individual PDFs auto renamed by record number (i.e. record number 22 = 22.pdf).
Question by:K_Deutsch
 
LVL 51

Accepted Solution

by:Joe Winograd, EE MVE (earned 500 total points)
ID: 37759008
OK, me again. :)   And IrfanView again. :)

File>Select Scan/TWAIN Source...

Pick your scanner/driver

File>Acquire/Batch scanning

Select the <Multiple images (Batch Mode)> button

Set <Output file name> to blank

Set starting counter to 1

Set increment to 1

Set number of digits to 4

Set <Destination directory> to whatever you want

Set <Save as> to PDF

It should look like this:
[Screenshot: IrfanView-multi-image-scan]

Click the Options button

Click General tab

For the <Preview of PDF during save operation> option, select <not needed>

Make sure <Save all pages from original image> is checked and <Open PDF after saving> is not checked

It should look like this:
[Screenshot: IrfanView-preview-not-needed]

Click OK, OK, and your TWAIN or WIA scanning dialog will appear

Perform the scan

IrfanView will create the 4,000 xxxx.pdf files in the folder you chose. If the scan gets interrupted and you need to restart, you can pick up where you left off by setting the <Starting counter> to whatever you need in the <Acquire/Batch Scanning> screen shown above. When you're done scanning, IrfanView will ask if you want to save the image changes:
[Screenshot: IrfanView-save-image-changes-say-NO]

Say NO! The xxxx.pdf files have already been saved. Regards, Joe
 
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759070
Btw, this assumes that the record numbers are in order. It is simply naming the records from <0001.pdf> to <4000.pdf> as the scanning occurs. In other words, this process is not reading the content, i.e., it is not OCR'ing the record number and putting it in the file name. That's a whole different level of difficulty! Regards, Joe
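
To get from the sequential names to the form the question asked for (record number 22 = 22.pdf), the leading zeros can be stripped after scanning. The following is a minimal sketch and not part of IrfanView. It assumes the forms were scanned in record-number order starting at 1, so that 0022.pdf really is record 22. The function names plan_renames and apply_renames are made up for illustration.

```python
import re
from pathlib import Path


def plan_renames(folder: Path) -> list[tuple[Path, Path]]:
    """Pair each zero-padded NNNN.pdf with its unpadded name (0022.pdf -> 22.pdf)."""
    plan = []
    for src in sorted(folder.glob("*.pdf")):
        # Only touch files whose name is purely digits; leave everything else alone.
        if not re.fullmatch(r"[0-9]+", src.stem):
            continue
        dst = src.with_name(f"{int(src.stem)}.pdf")
        if dst != src:  # e.g. 4000.pdf needs no change
            plan.append((src, dst))
    return plan


def apply_renames(plan: list[tuple[Path, Path]]) -> None:
    """Carry out the renames, refusing to overwrite an existing file."""
    for src, dst in plan:
        if dst.exists():
            raise FileExistsError(f"{dst} already exists; not overwriting")
        src.rename(dst)
```

Printing the result of plan_renames(Path("your-scan-folder")) before calling apply_renames gives a chance to check the mapping first. It is worth trying this on a copy of the folder.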
 
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759081
My comment above said, "When you're done scanning, IrfanView will ask if you want to save the image changes:". That's not quite true. It will ask you that when you exit IrfanView. In any case, say NO!
 

Author Comment

by:K_Deutsch
ID: 37759082
I feel bad because my explanation has been poor and incomplete. What is really happening here is that we are sending out a total of 4,000 "response requested" type forms, each with a unique record number. We of course won't get all 4,000 back. After I scan them in, I want an automated process that goes through, recognizes the record number somehow, and renames each PDF appropriately.
 
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 37759345
Ah, as I indicated, whole different animal! :)   You're going to need a scanning/OCR package capable of OCR'ing that portion of the form and then saving the scanned document with the file name that was OCR'ed. Two excellent OCR packages which can probably do it (but I can't personally attest to it) are ABBYY FineReader and Nuance's OmniPage:

http://www.abbyy.com/
http://www.nuance.com/for-business/by-product/omnipage/index.htm

I'll give it some more thought, but start with those two. This is not going to be as simple as your first question. :)   Regards, Joe
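
Whatever OCR package is used, the renaming step comes down to finding the record number in the recognized text. Here is a rough sketch of that step only. It assumes the OCR text of each scanned form is already available as a string and that the form prints a label such as "Record No." next to the number. The label wording, the 1 to 4000 range check, and the function names are all assumptions, since the form layout is not described in this thread.

```python
import re
from typing import Optional

# Assumption: the number follows a label like "Record No. 22",
# "Record Number: 22" or "Record # 22". Adjust to the real form.
RECORD_PATTERN = re.compile(
    r"Record\s*(?:No\.?|Number|#)\s*:?\s*(\d{1,4})\b",
    re.IGNORECASE,
)


def extract_record_number(ocr_text: str) -> Optional[int]:
    """Return the record number found in the OCR text, or None if not found."""
    match = RECORD_PATTERN.search(ocr_text)
    if match is None:
        return None
    number = int(match.group(1))
    # Assumption: records are numbered 1..4000; anything else is treated
    # as a misread and left for manual review.
    return number if 1 <= number <= 4000 else None


def target_name(ocr_text: str) -> Optional[str]:
    """Return the file name for a scanned form (e.g. '22.pdf'), or None."""
    number = extract_record_number(ocr_text)
    return None if number is None else f"{number}.pdf"
```

Forms for which target_name returns None would need to be renamed by hand. OCR misreads of digits are common, so spot-checking a sample of the renamed files is advisable.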
 

Author Comment

by:K_Deutsch
ID: 37778655
I have not abandoned this. The software products you mentioned are too large in scope, I think, though I did get a trial of FineReader. I may poke around in that, but beyond that, we are currently using KnowledgeLake Capture. Onsite IT folks say it may be the answer, but I may not have time to wait on them.
 
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 37778929
Two comments:

1. I don't know much about KnowledgeLake Capture, but I do know that it has some type of advanced OCR data extraction that may be able to do what you want with the current form.

2. Unless #1 turns out to be easy, or at least doable, my suggestion is to make a new form for future mailings. (If #1 is not doable, then you'll have to handle manually the forms that have already been mailed.) The new form should have a bar code on it that has the unique document number of each form. KnowledgeLake Capture supports barcode recognition, both as a document separator and in advanced capture. I can't be certain of this, having never used it or seen the manual, but it is very likely that KnowledgeLake Capture's barcode recognition capabilities can do what you need, i.e., recognize the unique number in each barcode and save that page as a separate PDF file, with the file name being the number in the barcode.

Regards, Joe
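
The separator idea in #2 can be pictured as follows. This only illustrates the logic and is not KnowledgeLake Capture's API, which is not documented in this thread. It assumes some barcode reader has already produced, for each scanned page in order, either the decoded record number or nothing. The function name split_by_barcode is hypothetical.

```python
from typing import Optional


def split_by_barcode(page_barcodes: list[Optional[str]]) -> dict[str, list[int]]:
    """Group page indexes into documents keyed by barcode value.

    A page that carries a barcode starts a new document; pages without
    one are appended to the document that is currently open.
    """
    documents: dict[str, list[int]] = {}
    current: Optional[str] = None
    for page_index, value in enumerate(page_barcodes):
        if value is not None:
            if value in documents:
                raise ValueError(f"duplicate barcode {value!r} on page {page_index}")
            current = value
            documents[current] = []
        if current is None:
            raise ValueError("first page has no barcode to name the document")
        documents[current].append(page_index)
    return documents
```

For example, split_by_barcode(["22", None, "7"]) groups pages 0 and 1 under "22" and page 2 under "7", which would correspond to 22.pdf and 7.pdf.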
 

Author Closing Comment

by:K_Deutsch
ID: 37799314
The answer I picked as the accepted solution is based on my original question, which was vague. The solutions for my more clarified question are out of reach for what has to be a hit and run project for me or nothing. Thanks, Joe!
 
LVL 51

Expert Comment

by:Joe Winograd, EE MVE
ID: 37799380
You're welcome! I hope you can achieve what you want with KnowledgeLake Capture; it might be possible. Good luck! Regards, Joe