OCR

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten, or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper records, including passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence, and computer vision.

The PaperPort 14.5 Patch 1 update is often not detected or downloaded automatically. This article provides direct download links to solve the problem for retail (non-bundled) versions of the Standard and Professional editions, as well as for the Professional edition in Nuance's own OmniPage Ultimate bundle.

Expert Comment

by:Marshall Kass
Thank you.  It's a shame when software developers make a simple task difficult.  Your solution was spot-on and easy as pie.  Kudos

Author Comment

by:Joe Winograd, EE MVE 2015&2016
You're welcome, Marshalk — and thanks to you for the compliment and the article upvote. I appreciate both! Regards, Joe

PaperPort is among the most important applications that I run on my Windows computers. I use it every day, for nearly all of my document and photo scanning, as well as most of my document and photo imaging, including OCR via its built-in OmniPage capabilities.

Disclaimer before going further: I have no affiliation with Nuance and no financial interest in it whatsoever. I am simply a happy user/customer.

I've been using PaperPort for around 20 years on every version of Windows since Windows 95. With the Windows 10 release date coming up in two days, I thought it would be worthwhile to document my experience with PaperPort on the Windows 10 Technical Preview, including some tips for successful deployment on W10.

First, my experience with the various builds along the way: I did not install PaperPort on the initial Windows 10 Technical Preview of Build 9841, released on 30-Sep-2014. But I installed on every build after that, from 9860 through the current 10240. The platform is physical hardware, not a virtual machine. It is a relatively old laptop with mediocre specs by today's standards:

Intel Core2 Duo T9300 2.50GHz
4GB RAM DDR2 PC5300
Samsung SSD 840 EVO 250GB (with the read performance firmware upgrade

Author Comment

by:Joe Winograd, EE MVE 2015&2016
You're welcome, Dana. I'm hoping for a PP15 (or maybe just PP14.6) that Nuance certifies and officially supports for W10. It's even possible that Nuance would certify PP14.5 for W10. I'll post here as soon as I learn anything official from Nuance about PP in W10. In the meantime, there's been some discussion about it in the Google PaperPort Group (and its PaperPort wiki). You may want to check in on that. Regards, Joe

Author Comment

by:Joe Winograd, EE MVE 2015&2016
In a comment above, I mentioned that the Patch 1 update did not work with the PaperPort Professional 14.5 that is included with Nuance's own OmniPage Ultimate bundle. I am pleased to report that Nuance has finally fixed this, although there is still a glitch in the process such that, in some cases, the Common Software Update Manager does not perform the update, even though it detects its existence. The solution, once again, is to get the installer from a direct download link. I just published an article here at EE explaining the method:
How to install the Patch 1 update for the PaperPort Professional 14.5 bundled with OmniPage Ultimate

The article also shows a list of the support tickets that Patch 1 fixes. Regards, Joe
In a previously published article here at Experts Exchange, I explained how to achieve duplex (double-sided) scanning in Nuance's PaperPort software with a hardware-capable duplex scanner, that is, a scanner which has an Automatic Document Feeder (ADF) capable of scanning both sides of a document. A recent question here at EE prompted me to write this additional article, which explains how to achieve duplex scanning in PaperPort with a simplex scanner, that is, a scanner whose ADF is capable of scanning only the front side of a document.

As with the previous article, this one applies to the three most recent versions of PaperPort, i.e., 11, 12, and 14 — yes, Nuance got superstitious and did not release a version 13.

Here are the steps to achieve duplex scanning in PaperPort (either Standard or Professional) with a simplex scanner:
 
  • Click the Scan Settings button on the Ribbon in PP12 and PP14, or the Scan or Get Photo icon on the toolbar in PP11. You will now have the Scan or Get Photo pane:

Scan-or-Get-Photo.jpg 
  • Select a Scanner and a Scanning Profile.
 
  • Tick the Show Capture Assistant box.
 
  • Place the document in the (simplex) ADF and click the Scan button.
 
  • In PaperPort Standard, you will get this:

front-side-PP-Std.jpg 
  • In PaperPort Professional, you will get this:

front-side-PP-Pro.jpg 
  • Remove the document from the output tray, turn it over so that the last page is on the top, place it in the ADF, and click the Scan Other Side button.
 
  • In PaperPort Standard, you will get this:

after-Scan-Other-Side-PP-Std.jpg 
  • In
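The reason the stack is turned over before clicking Scan Other Side can be illustrated with a short Python sketch. This is only a model of the page ordering, not PaperPort's actual implementation; the function name `interleave_duplex` and the page labels are hypothetical. It assumes the back sides arrive in reverse sheet order because the last page is on top for the second pass.

```python
def interleave_duplex(fronts, backs_as_scanned):
    """Merge front-side and back-side scans into reading order.

    fronts: front sides in the order scanned (pages 1, 3, 5, ...).
    backs_as_scanned: back sides in the order scanned after the stack
        was turned over, so the back of the last sheet comes first.
    """
    if len(fronts) != len(backs_as_scanned):
        raise ValueError("each sheet needs one front scan and one back scan")
    # Turning the stack over reverses the sheet order, so undo that first.
    backs = list(reversed(backs_as_scanned))
    merged = []
    for front, back in zip(fronts, backs):
        merged.extend([front, back])
    return merged


# Three sheets: fronts are pages 1, 3, 5; backs arrive as pages 6, 4, 2.
print(interleave_duplex(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
# → ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```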
PaperPort is a popular document imaging/management product from Nuance Communications. It is in widespread use by both individuals and businesses.

The current version of PaperPort is 14. The previous version was 12 (yes, Nuance got superstitious and skipped 13). Both of these most recent versions come in two editions, Professional and Standard. All four products — PP12 Standard, PP12 Professional, PP14 Standard, PP14 Professional — have the ability to create a searchable PDF file without any other software needing to be installed. PP12 was the first release that could do this (and it was carried forward into PP14).

Prior PaperPort releases require Nuance's OmniPage (a separately priced OCR product) to be installed in order to create a searchable PDF file, which PaperPort calls a PDF Searchable Image file because it contains both the raster image and the text created by OCR. PP12 and PP14 can create a PDF Searchable Image file on their own because they contain the OmniPage OCR engine under the covers, via the OmniPage Capture Software Development Kit (CSDK).
 
Sidebar on PaperPort Version: If you are running PP12.0, I recommend that you upgrade (free!) to PP12.1. This EE article explains how to do it:
PaperPort 12 - Free Upgrade to Version 12.1
If you are running PP14.0, PP14.1, or PP14.2, I recommend that you upgrade (free!) to PP14.5 (there was not a public release for either 14.3 or 14.4). This EE article explains how to do it:

Expert Comment

by:Don Green
Wow.  I reinstalled Paperport 14 on a new computer build, so I knew that printing to Paperport can create a searchable PDF, but spent about an hour stumbling around trying to figure out / remember how to do that.  Of course, now I feel a little stupid because it seemed obvious once I followed your instructions.  Somewhere along the way I had a setting so that printing to Paperport created a horribly OCR'd document, then replaced the actual document with gibberish text.  It's stopped doing that, and I don't want to go back and figure out how I made it do that and how I made it stop.  But, people who find your article are lucky people.  Nuance was going to charge me $10 for this "simple" answer since I purchased more than $90.  I don't know how many people your article helps, but I know it's a beautifully done article that left me feeling enormous gratitude.

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Don,
You're very welcome. And thanks to you for joining EE today (welcome aboard!), as well as reading and endorsing my article — I really appreciate it! I'm glad you found it helpful. Regards, Joe
I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP14 one.

The earlier point releases of PP14 — 14.0, 14.1, 14.2 (there was not a public release for either 14.3 or 14.4) — are known to have bugs that were fixed in 14.5. This article provides links to 14.5, as well as other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP14, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
Comparison Matrix of PP14 Standard and PP14 Professional

III. Links to Downloads

The links are to a direct download

Expert Comment

by:Jerry Richardson
Hi Joe,
I installed 14.5 with no problems but Nuance will not verify my old registration number for version 12. Did I miss something?
Thanks, Jerry

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Jerry,
First, welcome to Experts Exchange! Thanks for joining today and reading my article. Now, to your issue.

A PP12 license is not valid for PP14. You must purchase a license for PP14. Once you have a PP14 license, you may use its serial number to upgrade to PP14.5 at no charge, and then you may install Patch 1 at no charge. But using your PP12 serial number for a PP14 installation will not work.

My suggestion is to purchase PP14 at Amazon — it is inexpensive these days. The Standard Edition is $22:
https://www.amazon.com/Nuance-Communications-Inc-6809A-G00-140-Paperport/dp/B005CELKLM

The Professional Edition is $37:
https://www.amazon.com/Nuance-Communications-Inc-F309A-G00-140-Professional/dp/B005CELL1G

Both of those are for 14.0, but then use the instructions in this article to upgrade for free to 14.5. Actually, since you already installed 14.5, you may be able simply to use the serial number from the 14.0 purchase to register the installed 14.5. In either case, install Patch 1 after that, as described in this other EE article:
How to install the Patch 1 update for PaperPort 14.5

I also recommend installing the PP14 Scanner Connection Tool, as described in yet another EE article:
PaperPort 14 Scanner Connection Tool - Fix Scanning Problems in PaperPort 14

Regards, Joe
I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP12 one.

The earlier point release of PP12 — 12.0 — is known to have bugs that were fixed in 12.1. The links in the previous article for 12.1 no longer work. This new article provides working links for 12.1, as well as other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP12, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
Comparison Matrix of PP12 Standard and PP12 Professional

III. New Links to Downloads

The new links are to a direct download
I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP11 one.

The earlier point releases of PP11 — 11.0 and 11.1 — are known to have bugs that were fixed in 11.2. Although the links in the previous article for 11.2 still work, Nuance informed me that they may soon stop working. This new article provides working links for 11.2 that Nuance says will continue to work after the other ones have been taken down. This article also provides other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP11, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
This article discusses the PaperPort 14 Scanner Connection Tool, which Nuance provides at no charge in order to fix scanning problems in Windows 8. Furthermore, users of PaperPort 14 in Windows 7 and Windows 10 have reported that the tool works in those versions of Windows, too.

Expert Comment

by:Gregory Parsons
This did not help.  Since the last auto-update in Windows 10 PaperPort Professional 14.5 won't recognize one of my two scanners.  Neither your site nor the manufacturer's latest drivers have been of any help - especially baffling as Nuance is the software bundled with the scanner.  The Xerox DocuMate 152i is no longer listed as a compatible scanner.

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Gregory,

First, I see that you joined EE today — welcome aboard! I'll do my best to try to help you.

Do you have Patch 1 installed? If not, I recommend doing that, as PP14.5/Patch1 is the only Windows 10-compliant version of PaperPort. This EE article explains how to get it (at no cost):
How to install the Patch 1 update for PaperPort 14.5

Reinstall the Scanner Connection Tool after installing Patch 1. If it still doesn't work, read this other EE article:
PaperPort 14 in Windows 10 - A First Look

Perhaps some of the Tips in there will help.

Of course, having a driver that works in your version of Windows (including bit level — 32-bit or 64-bit) is critical. I checked the Xerox website for your DocuMate 152i and it shows that it has drivers for W10, including ISIS, TWAIN, and WIA. You may download them from here:
http://www.xeroxscanners.com/en/us/products/drivers.asp?PN=97-0084-00U

Note this comment at that site: "You must uninstall your current driver and OneTouch software to install an updated driver."

I much prefer ISIS drivers over TWAIN and WIA, and PP14.5 fully supports ISIS, so try that first (nothing wrong with TWAIN and WIA, so try them, too). Also, reinstall the Scanner Connection Tool after reinstalling the scanning drivers. Regards, Joe
Power PDF is the newest product from the Document Imaging division of Nuance Communications. It is available in two editions — Power PDF Standard and Power PDF Advanced. Nuance offers a free 30-day trial of the Advanced edition, available for download here:
http://www.nuance.com/for-business/imaging-solutions/document-conversion/power-pdf-converter/free-trial/index.htm

For whatever reason, Nuance does not offer a free trial of the Standard edition. Everything in this article about Version 1.0 is based on my experience with the free trial of the Advanced edition running in W7 Pro 64-bit; everything about Version 1.1 is based on the licensed Power PDF Advanced, also on W7 Pro 64-bit.

Disclaimer before going further: I have no affiliation with Nuance and no financial interest in it whatsoever. I am simply a happy user/customer of several of its document imaging products, including OmniPage, PaperPort, and PDF Converter (the latter is being replaced by Power PDF).

Update on 22-Feb-2015: The initial submission of this article was about version 1.0 of Power PDF Advanced. Nuance recently released version 1.1, but since EE members may still have 1.0, I decided not to delete the information about 1.0, but instead added a section at the bottom of the article about 1.1.

Update on 18-Jun-2015

Expert Comment

by:Willem van der Plas
Thanks for your outstanding help, Joe!
All the best from Europe,
Willem

Author Comment

by:Joe Winograd, EE MVE 2015&2016
You're very welcome, Willem. Unfortunately, I don't see that great deal on PPA2.0 at the Amazon UK site and don't know if you'll find it at any Amazon EU site. Also, no deal on it at the Nuance UK site. Both sites have it at £139.99 (list price). :(  Good luck shopping! Cheers, Joe
Update 21-May-2015: I temporarily removed the source code to make major changes to the program. Regards, Joe

INTRODUCTION

This article presents a solution to a question asked here at Experts Exchange. The situation is that there's a large number of subfolders (400 in the original question), each of which has a number of PDF files (two in the original question). The goal is to combine/merge the PDF files in each subfolder (in ascending date order) into a single PDF file, storing the combined file in each subfolder. The source PDF files in each subfolder may have any file names and the user should be able to specify the file name of the combined file.

REQUIRED SOFTWARE

The method presented in this article requires AutoHotkey, an excellent (free!) programming/scripting language. The quick explanation for installing AutoHotkey is to visit its website. A more comprehensive explanation is to read my EE article, AutoHotkey - Getting Started. After installation, AutoHotkey will own the AHK file type, supporting the solution discussed in the remainder of this article.

The program utilizes another excellent (free!) piece of software — PDF Toolkit (PDFtk). It comes in both command line and GUI versions. The command line version is called PDFtk Server
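Since the AutoHotkey source is not posted, here is a rough Python sketch of the same idea, purely for illustration and not the author's program. It builds one PDFtk Server command per subfolder using PDFtk's `cat` operation. The assumptions are loud ones: "date order" is taken to mean file modification time, the `pdftk` executable is assumed to be on the PATH, and the names `build_merge_command`, `merge_all`, and `Combined.pdf` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_merge_command(subfolder, output_name, pdftk_exe="pdftk"):
    """Return the PDFtk argument list that merges one subfolder's PDF files.

    Source files are ordered by ascending modification time (an assumption
    about what "date order" means). An existing combined file with the
    chosen output name is skipped. Returns None when there is nothing to merge.
    """
    folder = Path(subfolder)
    sources = [
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".pdf"
        and p.name.lower() != output_name.lower()
    ]
    if not sources:
        return None
    # Oldest first; the file name breaks ties so the order is repeatable.
    sources.sort(key=lambda p: (p.stat().st_mtime, p.name))
    return [pdftk_exe, *map(str, sources), "cat", "output", str(folder / output_name)]


def merge_all(root, output_name="Combined.pdf", run=False):
    """Build (and optionally run) one merge command per subfolder of root."""
    commands = []
    for sub in sorted(Path(root).iterdir()):
        if not sub.is_dir():
            continue
        command = build_merge_command(sub, output_name)
        if command is None:
            continue
        commands.append(command)
        if run:
            # Requires PDFtk Server to be installed; raises if it fails.
            subprocess.run(command, check=True)
    return commands
```

Calling `merge_all(root)` with the default `run=False` only returns the commands, which makes it easy to review them before letting anything touch 400 subfolders.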

Expert Comment

by:Centex Aps
Hi

Will the "Combine-Merge-PDF-files-20140826.ahk"  file not be attached again?

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Centex,
I've decided not to post the full program. I'll be rewriting the article as a "design roadmap" with some crucial code snippets, such as how to call PDFtk Server, but will not be posting the complete source code. Regards, Joe

The standard (non-Professional) edition of PaperPort from Nuance Communications (previously known as ScanSoft) is limited to five Scanning Profiles, but in a previous article, I discussed how to overcome this limitation. The technique presented in that article may also be used to address an issue that I've been asked many times by PaperPort users, namely, how to reorder the Scanning Profiles in the Scan or Get Photo pane.

For users with many Scanning Profiles, it is desirable to order the list such that the more frequently used ones are at the top. Unfortunately, PaperPort 12, the previous release, offers no ability to rearrange the order of the Scanning Profiles. PaperPort 14, the current release (yes, Nuance got superstitious and skipped 13), added a little-known, undocumented feature that helps: drag-and-drop of the Scanning Profiles. However, it works poorly, in my opinion, as I find it difficult to drop the profile exactly where I want it.

Also, in both PP12 and PP14, there is no ability to search for a Scanning Profile, so the user must scroll through the list to find the desired profile. This article presents an approach for reordering the list, using the same method presented in my previous article, PaperPort - How To Achieve More Than Five Scanning Profiles in the Standard Edition.

All of the screenshots in this article are from PaperPort Professional 14
PaperPort is a popular document imaging/management product from Nuance Communications, previously known as ScanSoft. PaperPort is in widespread use by both individuals and businesses.

The current version of PaperPort is 14. The previous version was 12. Yes, Nuance got superstitious and skipped 13. Both of these most recent versions come in two editions, Professional and Standard, although the Nuance folks do not call it Standard – they simply leave Professional off the name, i.e., PaperPort 12 and PaperPort Professional 12; PaperPort 14 and PaperPort Professional 14. In this article, I refer to them as PP-Std and PP-Pro, and all such references are valid for versions 12 and 14.

There are numerous differences between PP-Std and PP-Pro. The comparison matrices may be seen in the Files section at this PaperPort wiki in these files:

Comparison Matrix of PP12 Standard and PP12 Professional.pdf
Comparison Matrix of PP14 Standard and PP14 Professional.pdf

As shown in the documents above, one of the differences between PP-Std and PP-Pro is that the former allows only five Scanning Profiles to be created, while the latter allows an unlimited number. However, it turns out that PP-Std will properly handle an unlimited number of Scanning Profiles. The problem is that it won't let you create them. This is easy to overcome by creating the file containing the Scanning Profiles outside of PP-Std. This article describes two ways to do it.


Expert Comment

by:mapline
Hi Joe
Great suggestion. 2 comments:
1. My PP 14.5 std stores file in C:\ProgramData\Nuance\PaperPort\14\Profiles.xml (Windows 10)
2. Notepad++ great free app for viewing/editing xml files.
Many thanks

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Michael,
Sorry I'm just replying to your 25-Mar-2016 comment now. I don't recollect seeing it when it first came in and only just now saw it when I received a notification that you endorsed the article today — btw, thanks for that!

> My PP 14.5 std stores file in C:\ProgramData\Nuance\PaperPort\14\Profiles.xml (Windows 10)

You will also find it at C:\Users\All Users\Nuance\PaperPort\14\Profiles.xml in W10. That's because C:\Users\All Users\ points to C:\ProgramData\. In other words, C:\ProgramData\ is the "real" folder and C:\Users\All Users\ is simply a pointer to it — technically known as a junction or symbolic link. So if you look at C:\ProgramData\ and C:\Users\All Users\ in your file manager, they'll show the identical contents, because they are one-and-the-same folder.

> Notepad++ great free app for viewing/editing xml files.

I have Notepad++ installed and agree that it is a great free app, although I use it only for test purposes, since I do all of my text editing with my fav text editor that I've been using forever. But thanks for the tip to our readers! Regards, Joe
PaperPort is a popular document management/imaging product from Nuance Communications. It is in widespread use by both individuals and businesses. The current version of PaperPort is 14 (previous version was 12 – Nuance got superstitious and skipped 13). This Article documents how PP14 finally solved a nasty duplex scanning problem that has plagued PaperPort since the introduction of the Blank page is job separator capability in PP10.

The problem is that a blank back side of a page will act as a job separator during a duplex scan. This is extremely bad, since most double-sided documents have some single-sided pages, and they will terminate the document – not what you want! It makes the Blank page is job separator capability practically worthless for users doing duplex scanning. In other words, if you are using a duplex scanner and a page in the stack is not blank on the front, but is blank on the back, this should not be considered as a separator page. In the case of duplex scanning, a page should be blank on both sides in order for it to be treated as a separator page. Otherwise, you'll get what should be a single document broken into separate PaperPort items if that document happens to have some single-sided and some double-sided pages.
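The difference between the two behaviors described above can be sketched in a few lines of Python. This is a simplified model for illustration only, not PaperPort's code. The function name `split_jobs`, the side labels, and the choice to silently drop a lone blank side in the stricter mode are all assumptions.

```python
def split_jobs(sheets, require_both_sides_blank=True):
    """Split a duplex-scanned stack into documents at separator pages.

    sheets: list of (front_is_blank, back_is_blank) tuples, one per sheet.
    require_both_sides_blank: when True, only a sheet that is blank on both
        sides separates jobs; when False, any blank side does (the
        problematic behavior described in the text).
    Returns a list of documents, each a list of side labels such as "2F".
    """
    documents, current = [], []

    def close_document():
        nonlocal current
        if current:
            documents.append(current)
        current = []

    for number, (front_blank, back_blank) in enumerate(sheets, start=1):
        sides = [(f"{number}F", front_blank), (f"{number}B", back_blank)]
        if require_both_sides_blank:
            if front_blank and back_blank:
                close_document()
            else:
                # Assumption: a lone blank side is dropped, not kept as a page.
                current.extend(label for label, blank in sides if not blank)
        else:
            for label, blank in sides:
                if blank:
                    close_document()
                else:
                    current.append(label)
    close_document()
    return documents


# Sheet 2 is single-sided; sheet 4 is a true separator (blank on both sides).
stack = [(False, False), (False, True), (False, False), (True, True), (False, False)]
print(split_jobs(stack, require_both_sides_blank=False))
# → [['1F', '1B', '2F'], ['3F', '3B'], ['5F', '5B']]
print(split_jobs(stack, require_both_sides_blank=True))
# → [['1F', '1B', '2F', '3F', '3B'], ['5F', '5B']]
```

With the looser rule, the single-sided sheet 2 wrongly ends the first document; with the stricter rule, only the fully blank sheet 4 does.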

This "bug" (Nuance called it a "feature" when I reported it) existed in PP10, PP11, and PP12 (as mentioned above, there was no PP13). Nuance finally fixed it in PP14 with the addition of a new sub-option in the Settings for a

Expert Comment

by:donjud
Joe,
I realize this is a bit off topic, but you have been so helpful that I wanted to ask you first. I am also new to Experts Exchange and wasn't sure if I should start a new Topic.
The folder we have added to PaperPort is on our server so that it can be accessed and modified by multiple users in our office. The problem we have run into is that when someone makes changes or adds a new document and adds it to the All-in-One search index, the new document is still not indexed on the other computers, so each user has to re-index the folder in order to find it using the All-in-One (AIO) search. We have set PaperPort to index every night, but we would like the document to be indexed on all of the computers as soon as the changes are made.
Is there a setting that would automatically index all incoming or modified documents in the PaperPort folder, or is there a way to run PaperPort on the server? We have spent several days trying to come up with a solution to this problem and haven't had any luck.
Thanks,
J.D.

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi J.D.,
I was working on a (lengthy!) reply to your same question in the message system, which I just sent (before seeing this). Regards, Joe
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

INTRODUCTION

This Article is a follow-up to the Article entitled How To Rename-Move a Batch of PDF Files Based on Contents of the Files, recently published here at Experts Exchange.

I considered adding the new feature (splitting a single document into multiple documents) to that Article and program, but concluded that it is a significant enough enhancement to warrant a new Article and program.

PREVIOUS ARTICLE

To understand this Article, it will be helpful to read the previous Article, but to get things going here right away, here's a summary of the previous problem and solution.

There is a large batch of PDF files, all with cryptic names, such as [D123456.PDF]. Inside each file on the first line of the first page (always starting at a fixed column and running to the end of the line) is a human-friendly identifier for the file, such as [John Smith]. The requirement is to loop through all of the files in a specified folder in an automated fashion, changing the file names from, for example,

D123456.PDF

to

D123456 John Smith.PDF

That is, add the identifier from the first line of the first page to the file name.

NEW REQUIREMENT

Following publication of the previous Article and the program that implements the solution, the Original Poster (OP) of the question that prompted the Article

Expert Comment

by:Member_2_7970298
how do I obtain a copy of the autohotkey script?

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi New Member,

When I removed the source code last year from six articles that I published here at EE, my intention was that the removal be temporary. I began a project to rewrite all of the programs in my portfolio in order to generalize them for a broader audience and to have a standard user interface, including both a GUI (graphical user interface) and, where it makes sense, a CLI (command line interface). It wound up being a much larger effort than I anticipated, and I'm still not ready to post or distribute the source code for this program (or any of the other five published at EE — and I don't know when or even if that will be, for a variety of reasons).

I have created customized versions of these various programs for EE members who became clients of mine. I provided licenses for the run-time programs (the executables, i.e., the compiled EXE files) for an agreed-upon fee, but I did not provide the source code. I did this previously when EE had the "Hire Me" button, but that no longer exists. The mechanism now at EE for such work is the new Gigs feature, if that interests you.

Regards, Joe
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

A recent question here at Experts Exchange piqued my interest, so I decided to provide a thorough solution and publish this Article about it. The Original Poster (OP) of the question has approximately one thousand PDF files containing 7-character sequential alphanumeric file names (and, of course, all of the file extensions are PDF). Although the OP did not state this, it is likely that the sequential alphanumerics represent unique identifiers for his customers, perhaps customer numbers. The alphanumeric file name is cryptic, in no way identifiable with the customer, so the OP would like the file name to contain the customer name in addition to the number. For example, a file might be named:

D123456.PDF

The OP would like this file to be renamed:

D123456 John Smith.PDF

The customer name always begins in column 16 on the first line of the first page in the PDF file (and runs to the end of the line). The OP wants an automated way to rename the thousand PDF files, based on the customer name in the contents of each file – in essence, a batch/mass rename. The program documented in this Article (and provided in source code) performs this function.
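The renaming rule itself is simple enough to sketch in Python. This is an illustration of the rule as stated, not the program the Article documents. It assumes the first line of the first page has already been extracted to text and that column 16 is counted from 1. The function name `renamed` and the stripping of characters that Windows does not allow in file names are assumptions added for the example.

```python
import re
from pathlib import Path


def renamed(file_name, first_line, start_column=16):
    """Return the new file name: original stem, a space, the customer name.

    first_line: text of the first line of the first page of the PDF.
    start_column: 1-based column where the customer name begins.
    """
    path = Path(file_name)
    customer = first_line[start_column - 1:].strip()
    # Assumption: drop characters that Windows forbids in file names.
    customer = re.sub(r'[\\/:*?"<>|]', "", customer).strip()
    if not customer:
        # Nothing usable on the line, so leave the name unchanged.
        return path.name
    return f"{path.stem} {customer}{path.suffix}"


line = " " * 15 + "John Smith"
print(renamed("D123456.PDF", line))
# → D123456 John Smith.PDF
```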

Two excellent freeware products are needed for this solution – the AutoHotkey scripting language (the program is written in this) and the Xpdf package to convert the PDF …

Expert Comment

by:Member_2_7970298
Joe,
Thanks for your response.
I appreciate your comments & issues.
I would greatly appreciate it if you could see your way clear to send me your original AutoHotKey script.
I'm trying to learn more about AutoHotKey scripts and especially how it interfaces with Xpdf's pdftotext.exe
Thanks

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Member_2_7970298 (???),

I received your email at my personal email address, which I'll respond to in a moment. I already responded to your post at the AHK forum, which led you to this article, and then to my Split-Rename-Move article. Instead of three different communication venues (EE, AHK, email), let's continue this discussion via just email.

That said, a quick response to your comments: the Tutorials forum and the Scripts and Functions forum at the AHK boards are the way to go "to learn more about AutoHotKey scripts" (as well as the Tutorial at the AHK docs site).

There's not much to learn about "how it interfaces with Xpdf's pdftotext.exe" — the RunWait command is it. Here's an actual call from one of my programs:

RunWait,%pdftotextEXE% -f 1 -l 1 -raw "%FullFileNameCurrent%" "%DestinationFolder%%FileNameCurrentTXT%"

I'm sure from the names of the variables you can figure out what that line does. Also, I gave you links at the AHK forum to my two 5-minute EE video Micro Tutorials that should help you with learning about how to use the pdftotext.exe tool:
Xpdf - Command Line Utility for PDF Files
Xpdf - Convert PDF Files to Plain Text Files

If you haven't viewed them yet, I think you'll find them to be a worthwhile expenditure of 10 minutes. Regards, Joe
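
For readers who do not use AutoHotkey, the RunWait call quoted above can be approximated in other languages. The following is a rough Python sketch under stated assumptions, not the code from any of the programs discussed here: the function names are hypothetical, `exe` is assumed to be the path to Xpdf's pdftotext executable, and the output text file is assumed to take the PDF's base name. The options mirror the quoted call: `-f 1 -l 1` limits conversion to the first page and `-raw` is passed through unchanged.

```python
import subprocess
from pathlib import Path


def pdftotext_args(exe, pdf, out_dir, first=1, last=1):
    """Build the argument list for converting a page range of a PDF to text."""
    pdf = Path(pdf)
    # Assumption: the text file gets the same base name as the PDF.
    txt = Path(out_dir) / (pdf.stem + ".txt")
    return [str(exe), "-f", str(first), "-l", str(last), "-raw", str(pdf), str(txt)]


def extract_first_page(exe, pdf, out_dir):
    """Run pdftotext and wait for it to finish, as RunWait does."""
    args = pdftotext_args(exe, pdf, out_dir)
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(args, check=True)
    return Path(args[-1])
```

Passing the arguments as a list avoids manual quoting of paths that contain spaces.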
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

INTRODUCTION

The inspiration for this Article was a fascinating question here at Experts Exchange on combining TIFF files. Since it is in an area of extreme interest to me (Document Imaging) and since the solution involves two of my all-time favorite freeware products – IrfanView for the TIFF image processing and AutoHotkey for the scripting – I decided to publish the solution as an Article, with a lot more detail put into it than a typical response to a question.

INSTALLATION INSTRUCTIONS

The original poster (OP) of the problem (KHMaddox) said he has no programming experience at all, so I made the solution suitable for such a user. All you have to do is download and install the two freeware products, IrfanView and AutoHotkey, and then run the script attached to this Article, as follows:

(1) Install AutoHotkey – http://ahkscript.org (also, see my EE article: AutoHotkey - Getting Started)

Click the Download button at the page above, save the install file, and run it.

(2) Install IrfanView – http://www.irfanview.com/

Expert Comment

by:Deacon Aspinwall
Hi Joe,

I can't thank you enough for helping me with your MergeTIFF program. For others reading this and in need of help with a similar problem, Joe isn't ready to release his program publicly here, but I encourage you to contact him directly. Using MergeTIFF was a resounding success, and I would certainly encourage anyone who needs such a program to contact him about it.

Joe is really thorough and easy to work with, quick to respond, and can break things down for even technological troglodytes like myself to understand.

Thanks again,
 Deacon

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Deacon,
You're very welcome, and my thanks to you for the compliments — I really appreciate hearing them! I'm glad to know that MergeTIFF worked well for you and was a resounding success — music to my ears! Regards, Joe
This article is about duplex scanning in Nuance's PaperPort software with a hardware-capable duplex scanner. It is not about the Scan Other Side feature in the Capture Assistant that allows a simplex scanner to achieve double-sided scanning. If you are interested in the latter, see my Experts Exchange article, How to Perform Duplex Scanning with a Simplex Scanner in PaperPort Versions 11, 12, 14. But this article is strictly about how to get duplex scanners to work in PaperPort, more specifically, how to achieve automatic/one-click duplex scanning, that is, duplex scanning with neither the Display scanner dialog box nor the Show Capture Assistant box checked. This article applies to the three most recent versions of PaperPort, i.e., 11, 12, and 14 – yes, Nuance got superstitious and did not release a version 13.

The first step in any scanning issue is to download the latest-and-greatest drivers from your scanner manufacturer's website. My experience over the years is that PaperPort is very sensitive to the TWAIN/WIA drivers, and I've seen many scanning problems fixed by installing the latest drivers.

The problem that I'm addressing in this article is the lack of a Duplex ADF choice in the Source field drop-down in the SET tab of a Scanning Profile. For example, here's what it may look like when the existence of a duplex scanner is not recognized:

[Screenshot: Duplex scanner not properly set up]
The first approach to fixing this is to run through Advanced setup

Expert Comment

by:Ed Burwell
Joe,

This is GREAT!!!  THANKS!

I am using PaperPort 11 SE+ and out of nowhere I am getting this DocuCom PDF Trial Watermark on my PDF files.  I've been scanning for years and all of a sudden, there it is.  Have you seen this?  Do you know what causes it?  I have a Brother MFP by the way.

Thanks!

Ed

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Hi Ed,
First, thanks for joining EE today and reading my article — I appreciate it! I'm sorry to hear that you've run into the dreaded "DocuCom Watermark" glitch. This problem has been in the PaperPort (and PDF Converter) world for a long time. Two complicating factors in your situation are (1) the "SE" editions are non-retail (usually trimmed down) versions that are bundled with scanners (in your case, a Brother MFC), and solutions for the retail products often don't apply to the SE products; and (2) PP11 is very old — more than 10 years! All of that said, here are a few ideas for you:

(1) Read this PaperPort knowledgebase article:
DocuCom PDF Trial watermark appears when opening PDF files

(2) Uninstall and reinstall PP11SE from the media that came with your Brother MFC. Btw, which model MFC is it? Also, what version and bit level of Windows are you using?

(3) If you're not on Version 11.2, try to upgrade to it. The earlier point releases of PP11 — 11.0 and 11.1 — are known to have bugs that were fixed in 11.2. Another one of my EE articles explains how to upgrade to 11.2 at no cost:
PaperPort 11 - Free Upgrade to Version 11.2

However, because you have an SE version, this may not work, but it's worth a try. If it doesn't work, reinstall PP11SE from the media that came with your Brother MFC.

(4) Because it is such a common problem, there has been a lot of discussion about it in the PaperPort community. I participate in a Google Group devoted to PaperPort (and Nuance's other document imaging products), and a search there for "docucom watermark" turns up many hits. Here's the URL for that search:
https://groups.google.com/forum/#!searchin/paperport/docucom$20watermark%7Csort:relevance

You'll need to join the group (free, of course) to get the results (as a non-member, it will say, "You must be signed in as a member of this group to view and participate in it.").

(5) If PaperPort is important to you, buy the retail edition of the latest version, PP14. It is inexpensive these days — just $30 at Amazon for the Standard edition:
https://www.amazon.com/dp/B005CELKLM

That will be version 14.0, but another one of my EE articles explains how to upgrade to 14.5 (the latest release) at no cost:
PaperPort 14 - Free Upgrade to Version 14.5

After doing that, another one of my EE articles explains how to install the Patch 1 update to PP14.5 at no cost:
How to install the Patch 1 update for PaperPort 14.5

PP14.5/Patch1 is the latest-and-greatest, and is the only PaperPort version that is Windows 10-compliant.

If you have any scanning problems after that, I recommend installing the PP14 scanner connection tool. Another one of my EE articles explains how to do that at no cost:
PaperPort 14 Scanner Connection Tool - Fix Scanning Problems in PaperPort 14

Regards, Joe
Update 13-December-2014: Article Deprecated. The links in this article for the PaperPort 12 upgrades no longer work and Nuance informed me that the links for the PaperPort 11 ones may soon stop working. However, Nuance provided new links for them, as well as links for the latest version, PaperPort 14 (there was no PaperPort 13). The new links are to direct downloads of the upgrades (PP11.2, PP12.1, PP14.5), rather than to a Download Request Form, as with the previous links. In addition, Nuance provided links to a "Remover Tool" for all three versions. Lastly, there are "Standard versus Professional" feature comparison matrices for all three versions (this article shows the one only for PP12). To deal with such a substantial number of changes, I decided to deprecate this article. I also decided that adding PP14 information to both PP11 and PP12 would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I published three separate articles for PP11, PP12, and PP14 users. You will find them here:

PaperPort 11 - Free Upgrade to Version 11.2
PaperPort 12 - Free Upgrade to Version 12.1
PaperPort 14 - Free Upgrade to Version 14.5

Author Comment

by:Joe Winograd, EE MVE 2015&2016
Just a quick comment to point out that shortly after this article was published, Nuance did, indeed, release version 14 of PaperPort, confirming their superstitious behavior of skipping 13 (the latest release of PaperPort is version 14.5). Regards, Joe