Solved

PDF mega scan and auto rename

Posted on 2012-03-23
Last Modified: 2012-04-02
I am interested to know if and how I could scan 4,000 completed hard copy forms to PDF (all in one mega-scan) that look like this...

https://docs.google.com/file/d/0B9Ga3bzjO-rUVUVOZnhyanpTU213WWdyQThXTmZRZw/edit

and end up with 4,000 individual PDFs auto renamed by record number (i.e. record number 22 = 22.pdf).
Question by:K_Deutsch
LVL 55

Accepted Solution

by:
Joe Winograd, EE MVE 2015&2016 earned 2000 total points
ID: 37759008
OK, me again. :)   And IrfanView again. :)

File>Select Scan/TWAIN Source...

Pick your scanner/driver

File>Acquire/Batch scanning

Select the <Multiple images (Batch Mode)> button

Set <Output file name> to blank

Set starting counter to 1

Set increment to 1

Set number of digits to 4

Set <Destination directory> to whatever you want

Set <Save as> to PDF

It should look like this:
[Image: IrfanView-multi-image-scan]

Click the Options button

Click General tab

For the <Preview of PDF during save operation> option, select <not needed>

Make sure <Save all pages from original image> is checked and <Open PDF after saving> is not checked

It should look like this:
[Image: IrfanView-preview-not-needed]

Click OK, OK, and your TWAIN or WIA scanning dialog will appear

Perform the scan

IrfanView will create the 4,000 xxxx.pdf files in the folder you chose. If the scan gets interrupted and you need to restart, you can pick up where you left off by setting the <Starting counter> to whatever you need in the <Acquire/Batch Scanning> screen shown above. When you're done scanning, IrfanView will ask if you want to save the image changes:
[Image: IrfanView-save-image-changes-say-NO]

Say NO! The xxxx.pdf files have already been saved. Regards, Joe
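If a scan is interrupted, the value to enter as <Starting counter> is one more than the highest number already saved. The following is a minimal Python sketch of that check, not part of IrfanView itself. It assumes the settings above (blank <Output file name>, 4 digits, PDF output), and the function name next_starting_counter is made up for illustration.

```python
import re
from pathlib import Path

# Matches names produced by the settings above, e.g. 0042.pdf
# (assumption: blank output file name, 4-digit counter, PDF output).
SCAN_NAME = re.compile(r"^(\d{4})\.pdf$", re.IGNORECASE)


def next_starting_counter(folder):
    """Return the number to enter as <Starting counter> when resuming a batch.

    Looks at every NNNN.pdf file in `folder` and returns the highest
    number found plus one, or 1 if no such file exists yet.
    """
    highest = 0
    for path in Path(folder).iterdir():
        match = SCAN_NAME.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1
```

Calling next_starting_counter on the <Destination directory> gives the counter for the next page. It may be worth opening the last saved PDF first to confirm it was written completely before the interruption.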
 
LVL 55

Expert Comment

by:Joe Winograd, EE MVE 2015&2016
ID: 37759070
Btw, this assumes that the record numbers are in order. It is simply naming the records from <0001.pdf> to <4000.pdf> as the scanning occurs. In other words, this process is not reading the content, i.e., it is not OCR'ing the record number and putting it in the file name. That's a whole different level of difficulty! Regards, Joe
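If the forms really are scanned in record-number order with none missing, the zero-padded names can be turned into the names asked for in the question (record number 22 = 22.pdf) after the scan. The sketch below is one way to do that in Python under exactly that assumption. The function name strip_padding is hypothetical, and nothing here reads the page content.

```python
import re
from pathlib import Path

# An all-digit PDF name such as 0022.pdf or 4000.pdf.
NUMBERED = re.compile(r"^(\d+)\.pdf$", re.IGNORECASE)


def strip_padding(folder):
    """Rename 0022.pdf to 22.pdf and so on; return the (old, new) name pairs.

    Assumption: the sequence number equals the record number, i.e. the
    forms were scanned in order with none missing.
    """
    renamed = []
    for path in sorted(Path(folder).glob("*.pdf")):
        match = NUMBERED.match(path.name)
        if not match:
            continue  # not one of the numbered scan files
        target = path.with_name(f"{int(match.group(1))}.pdf")
        if target == path or target.exists():
            continue  # already unpadded, or the new name is taken
        path.rename(target)
        renamed.append((path.name, target.name))
    return renamed
```

Files whose new name already exists are left alone rather than overwritten, so the function can be run again safely after a partial run.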
 
LVL 55

Expert Comment

by:Joe Winograd, EE MVE 2015&2016
ID: 37759081
My comment above said, "When you're done scanning, IrfanView will ask if you want to save the image changes:". That's not quite true. It will ask you that when you exit IrfanView. In any case, say NO!

Author Comment

by:K_Deutsch
ID: 37759082
I feel bad because my explanation has been poor and incomplete. What is really happening here is that we are sending out a total of 4,000 "response requested" type forms, each with a unique record number. We of course won't get all 4,000 back. After I scan them in, I want an automated process that goes through, recognizes the record number somehow, and renames each PDF appropriately.
 
LVL 55

Expert Comment

by:Joe Winograd, EE MVE 2015&2016
ID: 37759345
Ah, as I indicated, whole different animal! :)   You're going to need a scanning/OCR package capable of OCR'ing that portion of the form and then saving the scanned document with the file name that was OCR'ed. Two excellent OCR packages which can probably do it (but I can't personally attest to it) are ABBYY FineReader and Nuance's OmniPage:

http://www.abbyy.com/
http://www.nuance.com/for-business/by-product/omnipage/index.htm

I'll give it some more thought, but start with those two. This is not going to be as simple as your first question. :)   Regards, Joe
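To give a sense of what happens after the OCR step, here is a rough Python sketch that pulls a record number out of text an OCR package has already produced for one page. Everything specific in it is an assumption: the label "Record Number" is a guess at what is printed on the form, the 1 to 4000 range is a guess at how the forms are numbered, and the function name is made up. It does not perform OCR and does not use any ABBYY or Nuance API.

```python
import re


def record_number_from_text(ocr_text, label="Record Number", max_number=4000):
    """Return the record number found after `label`, or None.

    Assumptions (not confirmed by the form itself): the number follows
    the label, optionally after ':' or '#', and lies in 1..max_number.
    A None result means the page should be checked by hand.
    """
    pattern = re.compile(
        re.escape(label) + r"\s*[:#]?\s*(\d{1,4})\b", re.IGNORECASE
    )
    match = pattern.search(ocr_text)
    if match is None:
        return None
    number = int(match.group(1))
    return number if 1 <= number <= max_number else None
```

Returning None instead of guessing keeps misread pages out of the automatic rename, which matters because OCR commonly confuses characters such as 0 and O or 1 and l.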
 

Author Comment

by:K_Deutsch
ID: 37778655
I have not abandoned this. The software products you mentioned are too large in scope, I think, though I did get a trial of FineReader. I may poke around in that, but beyond that, we are currently using KnowledgeLake Capture. Onsite IT folks say it may be the answer, but I may not have time to wait on them.
 
LVL 55

Expert Comment

by:Joe Winograd, EE MVE 2015&2016
ID: 37778929
Two comments:

1. I don't know much about KnowledgeLake Capture, but I do know that it has some type of advanced OCR data extraction that may be able to do what you want with the current form.

2. Unless #1 turns out to be easy, or at least doable, my suggestion is to make a new form for future mailings. (If #1 is not doable, then you'll have to handle manually the forms that have already been mailed.) The new form should have a bar code on it that has the unique document number of each form. KnowledgeLake Capture supports barcode recognition, both as a document separator and in advanced capture. I can't be certain of this, having never used it or seen the manual, but it is very likely that KnowledgeLake Capture's barcode recognition capabilities can do what you need, i.e., recognize the unique number in each barcode and save that page as a separate PDF file, with the file name being the number in the barcode.
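Whichever tool reads the number (barcode or OCR), the renaming step still has to cope with pages that could not be read, numbers that show up twice, and forms that never came back. Below is a small Python sketch of that bookkeeping. It assumes you already have a mapping from scanned file name to the number that was read (None when nothing was read). The function name plan_renames and the 1 to 4000 default are illustrative assumptions, and it does not call KnowledgeLake Capture or any barcode library.

```python
from collections import defaultdict


def plan_renames(decoded, expected=range(1, 4001)):
    """Work out renames from a {scanned file name: record number or None} map.

    Returns four values:
      renames    - {old name: "N.pdf"} for numbers read exactly once
      unreadable - file names where no number was read
      duplicates - {number: [file names]} for numbers read more than once
      missing    - expected numbers that were not read on any page
    """
    by_number = defaultdict(list)
    unreadable = []
    for name, number in sorted(decoded.items()):
        if number is None:
            unreadable.append(name)
        else:
            by_number[number].append(name)

    renames = {
        files[0]: f"{number}.pdf"
        for number, files in by_number.items()
        if len(files) == 1
    }
    duplicates = {n: files for n, files in by_number.items() if len(files) > 1}
    missing = [n for n in expected if n not in by_number]
    return renames, unreadable, duplicates, missing
```

Only the renames dictionary is meant to be applied automatically. The other three lists are for manual follow-up, and the missing list doubles as a record of which forms were not returned.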

Regards, Joe
 

Author Closing Comment

by:K_Deutsch
ID: 37799314
The answer I picked as the accepted solution is based on my original question, which was vague. The solutions for my more clarified question are out of reach for what has to be a hit and run project for me or nothing. Thanks, Joe!
 
LVL 55

Expert Comment

by:Joe Winograd, EE MVE 2015&2016
ID: 37799380
You're welcome! I hope you can achieve what you want with KnowledgeLake Capture. It might be possible. Good luck! Regards, Joe