OCR

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper data records, including passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.


PaperPort installer detected previous installation
You did a proper uninstallation of PaperPort. You even ran the official PP14 Remover Tool. But when you try to reinstall PaperPort, you get the dialog box above, which you can't get past. There is simply no way to install PaperPort! This article presents a solution that has worked for many PP users.
0

PaperPort Splash Screen
Sometimes PaperPort will not even open. It displays the splash screen (above) and exits, or it may show an "Application Crash" dialog before exiting (sometimes with a dump, sometimes not). There are many reasons for this problem. This article discusses several of them and offers possible solutions.
0
PaperPort Splash Screen
Sometimes PaperPort will not even open. It displays the splash screen (above) and exits, or it may show an "Application Crash" dialog before exiting. There are many reasons for this, but a recent cause that has reached epidemic levels is due to an issue with Firefox. This article offers a solution.
29

Expert Comment

by:James Gramm
This worked for me also, PP 14.5 stopped working about a month ago, changed the registry to Chrome and it now opens again.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi James,
Thanks for joining Experts Exchange today and reading my article. I'm very glad to hear that, after a month of not being able to use PaperPort, it is now opening again — great news! Regards, Joe
0
PaperPort XP Compatibility Mode
Nuance's PaperPort may display this error message: PaperPort appears to be running Windows XP Compatibility Mode which may result in errors. We recommend disabling Compatibility Mode for the PaprPort.exe program, see Technote 6629. This article provides a possible solution to the problem.
4
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Christophe,
You're very welcome...and thanks to you for joining Experts Exchange today, reading my article, and letting me know that it worked for you...I'm glad to hear that! If you take a moment to endorse the article by clicking the thumbs-up icon at the bottom of the article (not the one under this comment), I'll be grateful. Welcome aboard to EE! Regards, Joe
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Thanks for endorsing the article, Christophe — much appreciated! Regards, Joe
0
PaperPort 14.5 Patch 1 update is often not detected or downloaded automatically. This article provides direct download links to solve the problem for retail (non-bundled) versions of the Standard and Professional editions, as well as the Professional edition in Nuance's own OmniPage Ultimate bundle.
21

Expert Comment

by:Rev. Janine Stock
Thank you so much.  This should be on the Nuance support page just as it is.  You are wonderful.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
You're welcome, Janine, and thanks to you for joining Experts Exchange today and reading my article. Thanks, too, for the kind words — very nice to hear! Regards, Joe
0
PaperPort is among the most important applications that I run on my Windows computers. I use it every day, for nearly all of my document and photo scanning, as well as most of my document and photo imaging, including OCR via its built-in OmniPage capabilities.

Disclaimer before going further: I have no affiliation with Nuance and no financial interest in it whatsoever. I am simply a happy user/customer.

I've been using PaperPort for around 20 years on every version of Windows since Windows 95. With the Windows 10 release date coming up in two days, I thought it would be worthwhile to document my experience with PaperPort on the Windows 10 Technical Preview, including some tips for successful deployment on W10.

First, my experience with the various builds along the way: I did not install PaperPort on the initial Windows 10 Technical Preview of Build 9841, released on 30-Sep-2014. But I installed on every build after that, from 9860 through the current 10240. The platform is physical hardware, not a virtual machine. It is a relatively old laptop with mediocre specs by today's standards:

Intel Core2 Duo T9300 2.50GHz
4GB RAM DDR2 PC5300
Samsung SSD 840 EVO 250GB (with the read performance firmware upgrade
15

Expert Comment

by:Marco Pols
@Andrea Many thanks for the PaperPort registry solution. I had already contacted support, but so far there has been no answer. After three days of searching, the problem is solved. Yeah, I'm happy!

I have version 14.5   (14.5.15168.1450)
BUILDID PP-1313-011-15264.1154

Windows 10 Pro version 1709

I used the downloads from this topic, including the patch. After installing, I changed the registry. The button "get latest updates" don't work. I use Firefox as well as Chrome (default browser).

And, as you mentioned, it is also the best purchase I ever made for managing my PDF files.
1
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Marco,
Thanks for joining Experts Exchange today and reading my article — Welcome Aboard! As my PP14.5/Patch1 article shows, your BUILDID of PP-1313-011-15264.1154 means that it is PaperPort 14.5 Standard (retail) with Patch1 installed — good news!

> The button "get latest updates" don't work.

Yes, that's the main reason that I wrote the Patch1 article, i.e., because the Common Software Update Manager often does not work. I'm glad that the direct download of Patch1 did work for you. Regards, Joe
0
In a previously published article here at Experts Exchange, I explained how to achieve duplex (double-sided) scanning in Nuance's PaperPort software with a hardware-capable duplex scanner, that is, a scanner which has an Automatic Document Feeder (ADF) capable of scanning both sides of a document. A recent question here at EE prompted me to write this additional article, which explains how to achieve duplex scanning in PaperPort with a simplex scanner, that is, a scanner whose ADF is capable of scanning only the front side of a document.

As with the previous article, this one applies to the three most recent versions of PaperPort, i.e., 11, 12, and 14 — yes, Nuance got superstitious and did not release a version 13.

Here are the steps to achieve duplex scanning in PaperPort (either Standard or Professional) with a simplex scanner:
 
  • Click the Scan Settings button on the Ribbon in PP12 and PP14, or the Scan or Get Photo icon on the toolbar in PP11. You will now have the Scan or Get Photo pane:

Scan-or-Get-Photo.jpg 
  • Select a Scanner and a Scanning Profile.
 
  • Tick the Show Capture Assistant box.
 
  • Place the document in the (simplex) ADF and click the Scan button.
 
  • In PaperPort Standard, you will get this:

front-side-PP-Std.jpg 
  • In PaperPort Professional, you will get this:

front-side-PP-Pro.jpg 
  • Remove the document from the output tray, turn it over so that the last page is on the top, place it in the ADF, and click the Scan Other Side button.
 
  • In PaperPort Standard, you will get this:

after-Scan-Other-Side-PP-Std.jpg 
  • In
2
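The manual-duplex steps above leave the page collation to PaperPort. As a rough illustration of the ordering involved (this is not PaperPort's code), assume the front sides arrive in reading order and, because the stack is turned over with the last page on top, the back sides arrive in reverse. The function name `collate_duplex` and the list-of-pages representation are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collate_duplex(fronts: Sequence[T], backs_as_scanned: Sequence[T]) -> List[T]:
    """Interleave front sides with back sides that were scanned in reverse order.

    Assumption: the flipped stack feeds the last sheet first, so the first
    back side scanned belongs to the last front side.
    """
    if len(fronts) != len(backs_as_scanned):
        raise ValueError("front and back scans must contain the same number of sheets")

    # Undo the reversal introduced by turning the stack over.
    backs = list(reversed(backs_as_scanned))

    pages: List[T] = []
    for front, back in zip(fronts, backs):
        pages.append(front)
        pages.append(back)
    return pages
```

For a three-sheet document, the fronts are pages 1, 3, 5 and the backs arrive as 6, 4, 2, so `collate_duplex([1, 3, 5], [6, 4, 2])` yields the pages in reading order.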
PaperPort is a popular document imaging/management product from Nuance Communications. It is in widespread use by both individuals and businesses.

The current version of PaperPort is 14. The previous version was 12 (yes, Nuance got superstitious and skipped 13). Both of these most recent versions come in two editions, Professional and Standard. All four products — PP12 Standard, PP12 Professional, PP14 Standard, PP14 Professional — have the ability to create a searchable PDF file without any other software needing to be installed. PP12 was the first release that could do this (and it was carried forward into PP14).

Prior PaperPort releases require Nuance's OmniPage (a separately priced OCR product) to be installed in order to create a searchable PDF file, which PaperPort calls a PDF Searchable Image file (because it contains both the raster image and the text created by OCR). PP12 and PP14 can create a PDF Searchable Image file on their own because they contain the OmniPage OCR engine under the covers, via the OmniPage Capture Software Development Kit (CSDK).
 
Sidebar on PaperPort Version: If you are running PP12.0, I recommend that you upgrade (free!) to PP12.1. This EE article explains how to do it:
PaperPort 12 - Free Upgrade to Version 12.1
If you are running PP14.0, PP14.1, or PP14.2, I recommend that you upgrade (free!) to PP14.5 (there was not a public release for either 14.3 or 14.4). This EE article explains how to do it:
2

Expert Comment

by:Serg __
Any ideas how to make the fonts vectorized in the searchable .pdf? I am asking this question because I would not like to install a pirated Adobe Acrobat to convert one pdf book into a pdf book with vectorized fonts. What I got from PaperPort did not meet my expectations. the fonts got blurry. I expected them to get clean and vectorized, to be able to zoom in without those annoying pixels.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Serg,
Thank you for joining Experts Exchange this week and reading my article.

> Any ideas how to make the fonts vectorized in the searchable .pdf?

I do not have great expertise in font technology and am not aware of any way to control the font settings when PaperPort creates PDF Searchable Image files via the methods discussed in this article.

> I am asking this question because I would not like to install a pirated Adobe Acrobat to convert one pdf book into a pdf book with vectorized fonts.

I find that a strange comment — why would you even consider installing pirated software? We do not condone that here at Experts Exchange and, in fact, the Experts Exchange Terms of Use strictly prohibit any posting related to such activities (under Section 6, Code of Conduct). If you know that Adobe Acrobat will solve your font issue, and it is for only one PDF book, then I recommend purchasing just one month of Adobe Acrobat DC. For around 25 bucks, you'll avoid pirating software ($22.99 for one month of Acrobat Standard DC or $24.99 for one month of Acrobat Pro DC).

> What I got from PaperPort did not meet my expectations. the fonts got blurry.

It's likely that the fonts are blurry only when viewing the image layer. If you view just the text layer, the fonts should be fine. For example, I printed the first page of this article with the PaperPort Image Printer in B&W at 300 DPI to a PDF Image (not PDF Searchable Image). The whole page is attached as a PDF, but here's what it looks like:

font in image
The fonts, indeed, are blurry, because that's a view of the image (in Adobe Acrobat). I then used Nuance's Power PDF to convert to a searchable PDF, but told it not to keep the images. The whole page for that is also attached as a PDF, but here's the same small sample as shown above:

font in non-image
The fonts look great, because that's a view of the text (in Adobe Acrobat), since there is no image layer in the PDF.

> I expected them to get clean and vectorized, to be able to zoom in without those annoying pixels.

The fonts are fine in the text, as shown above. They get pixelated only when viewing the image layer. Another way to observe this is to Copy the text from the PDF Searchable Image file (created by PaperPort via one of the methods explained in this article) and then Paste it into a text-capable product, such as Notepad or Word — the fonts will, of course, appear fine. Regards, Joe
image-only-PaperPort-PDF-Image.pdf
text-only-Power-PDF-searchable-do-no.pdf
0
PaperPort
I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP14 one.

The earlier point releases of PP14 — 14.0, 14.1, 14.2 (there was not a public release for either 14.3 or 14.4) — are known to have bugs that were fixed in 14.5. This article provides links to 14.5, as well as other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP14, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
Comparison Matrix of PP14 Standard and PP14 Professional

III. Links to Downloads

The links are to a direct download
15

Expert Comment

by:Candice Buchanan
Thank you! I'm finally able to open it! Much, much appreciation for the assist!
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
You're welcome, Candice, I'm very glad that you're now able to open PaperPort. If you take a moment to endorse this article by clicking the thumbs-up icon at the end of the article (not the one underneath this comment), I'll appreciate it. Will also be grateful if you endorse the other article, too, by clicking the thumbs-up icon there. Thanks, Joe

Update: Thanks for the endorsements, Candice — much appreciated!
0
I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP12 one.

The earlier point release of PP12 — 12.0 — is known to have bugs that were fixed in 12.1. The links in the previous article for 12.1 no longer work. This new article provides working links for 12.1, as well as other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP12, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
Comparison Matrix of PP12 Standard and PP12 Professional

III. New Links to Downloads

The new links are to a direct download
2

I. Introduction

In a previous article (now deprecated), I discussed how to upgrade — at no cost for licensed users — Nuance's PaperPort Version 11 (hereafter, PP11) and PaperPort Version 12 (PP12) to the latest "point" releases, namely, 11.2 and 12.1. At the time of that article's publication, PP11 and PP12 were the two latest versions. Now the latest version is PP14 (yes, Nuance was superstitious and skipped 13), and its latest "point" release is 14.5.

I decided that adding PP14 to the previous article would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I decided to create three separate articles for PP11, PP12, and PP14 users. This is the PP11 one.

The earlier point releases of PP11 — 11.0 and 11.1 — are known to have bugs that were fixed in 11.2. Although the links in the previous article for 11.2 still work, Nuance informed me that they may soon stop working. This new article provides working links for 11.2 that Nuance says will continue to work after the other ones have been taken down. This article also provides other useful information on upgrading.

II. Comparison of Standard and Professional Editions

For PP11, there are two consumer editions – Standard and Professional. The feature comparison matrix is available in the Files section of this PaperPort wiki:
http://sites.google.com/site/wikipaperport/files

Here is a direct link to the PDF:
2

Expert Comment

by:Becky Hanlon
The update wants a serial number, which I do not have as I lost my software CD that came with my MFC-7340 Brother printer.  Is there a way to install the update without this?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Becky,
Thank you for joining Experts Exchange today and reading my article — welcome aboard! The PaperPort software that comes with Brother MFCs is an "SE" version, which stands for Special Edition. It is a trimmed-down version that Brother OEMs from Nuance for bundling with their devices and it is not considered to be a commercial/retail edition. This means that updates of the software, such as the 11.2 upgrade discussed in this article, will likely not apply to those bundled SE versions. So even if you had the serial number, it is unlikely that the 11.2 upgrade would work on it.

The other issue is that PP11 is more than 10 years old. As noted in this article, Vista is the latest Windows on which PP11 is supported. My suggestion is to purchase a retail copy of the latest version of PaperPort, which is 14. It is currently $31.61 at Amazon:
https://www.amazon.com/dp/B005CELKLM

That's the standard edition, not Professional, but it's probably more functional than the SE version that was bundled with your Brother MFC. You may want to wait for a better price, as I've seen it at Amazon for less. The download (or disk) is going to be version 14.0, but you may upgrade it for free to version 14.5, because it is a retail version. This comment that I posted at an EE question a couple of months ago explains the upgrade process, referring to several other articles that I've published here at EE:
https://www.experts-exchange.com/questions/29057949/Window-10-version.html#a42302130

Interesting to note that it was just $19.20 at Amazon back then. Once again, welcome to Experts Exchange! Regards, Joe
0
PaperPort
This article discusses the PaperPort 14 Scanner Connection Tool, which Nuance provides at no charge in order to fix scanning problems in Windows 8. Furthermore, users of PaperPort 14 in Windows 7 and Windows 10 have reported that the tool works in those versions of Windows, too.
1

Expert Comment

by:Frank Schabel
Thanks again for all your help. I tried scanning in IrfanView, which I already use as my default photo viewer. All the scanner dialogs stay on top, unlike in PaperPort. I also use several database programs from Fnprogramvare, Catvids and Catraxx, which I frequently scan from, and they keep all the scanning windows on top too. So I think it is something in PaperPort 14.5 that was changed from PaperPort 14, but I am not knowledgeable enough to find it since I know nothing about coding. The support person at Epson told me that there are sometimes settings in programs to keep a program on top (which I knew) but that there was no such setting in the Epson driver. I can't find one in PaperPort, if there is one hidden somewhere. He suggested using one of the little programs available that let you select an open window to keep on top. I downloaded Turbotop, and it will keep the scanner dialog on top but doesn't remember that setting, so you have to reset it every time you run PaperPort and the scanner. This all leads me to believe that there is a registry setting somewhere that controls this behavior, and maybe someone will find it sometime. Meanwhile, I'll just have to reduce the size of my PaperPort window and keep it on the right side of my screen so that the scanner dialog will remain visible on the left.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Frank,
Since IrfanView and those other apps keep the TWAIN window on top, it does sound like a PaperPort issue, although I've never seen it before. In fact, I have a somewhat opposite problem, i.e., the scanning dialog steals focus on every page from an ADF, making it just about impossible to use the computer while scanning with PaperPort. I submitted the following question to Nuance support a few years ago when I was trying to help a PaperPort user with the problem:
This is one of those long-standing PP issues and I'm wondering if you know of a solution. It doesn't affect me much because I have a separate computer that does scanning with PP14.5. But it drives many users nuts and I'm trying to help one right now. The issue is that when scanning a multi-page document, the PP scanning manager (with both Display scanner dialog box and Show Capture Assistant unchecked) steals focus on every page. It makes it just about impossible to use your computer for any other purpose while PP is scanning a large document. Do you know of a fix or work-around?
The response, which Nuance gave me permission to share publicly, was this:
Nuance has determined this behavior cannot be changed in 14 and will be investigated for inclusion in 15.
Back to your problem, I'm not aware of any setting in PaperPort or even its registry entries that controls the "always on top" behavior of scanning drivers. Your idea of reducing the size of the PaperPort window so that the scanning dialog remains visible on the left side sounds like a decent work-around. Let's hope that PaperPort 15 solves both your problem and mine. Regards, Joe
0
Power PDF Advanced
This article explains how to perform batch conversion of PDF, TIFF, and other image file formats into PDF, PDF Searchable, and TIFF files via a command line interface, using Nuance's latest document imaging software — Power PDF Advanced.
7

Expert Comment

by:Chris S
Hello,

I tried to use the command line, but I always get the error message "File open error", although I have write access to all directories stated in the command. It always says "Converting H:\TEST.HTML TO H:\TEST.PDF", which I think means that the syntax itself is correct but either the input or the output file cannot be opened?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Chris,
First, thanks for joining Experts Exchange today and reading my article — I appreciate it!

The input file cannot be an HTML file type. The Help output, even in the latest version 2.1 of Power PDF Advanced, still says this:

-I input file full path. * can be used for filename (*.pdf, *.tif)

As I mentioned in the article, I discovered through experimentation that the input file type may also be GIF, JPG, JPEG, PNG, and TIFF. But I just tried HTML in both v2.0 and v2.1, and can confirm that the input file may not be HTML. As you saw yourself, it gives an error message that says "File open error." My advice is to open the HTML file in whatever web browser you prefer and then print it to whatever PDF print driver you prefer. Once you have the PDF file, run Power PDF again, this time using the PDF file as the input instead of the HTML file. Of course, you may not even need to do that if you're happy with the file from the PDF printer. Regards, Joe
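A pre-check along these lines could catch an unsupported input type before the converter is run. This is only a sketch: the extension list is taken from the reply above (PDF and TIF from the Help text, the others from experimentation), the function names are hypothetical, and the Power PDF command itself is not invoked because its full syntax is not shown here.

```python
from pathlib import PureWindowsPath
from typing import Iterable, List, Tuple

# Input types reported as accepted in the reply above; HTML is not among them.
SUPPORTED_INPUT_EXTENSIONS = {".pdf", ".tif", ".tiff", ".gif", ".jpg", ".jpeg", ".png"}


def is_supported_input(path: str) -> bool:
    """Return True when the file extension is one of the reported input types."""
    return PureWindowsPath(path).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS


def split_inputs(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate candidate files into (accepted, rejected), preserving order."""
    accepted: List[str] = []
    rejected: List[str] = []
    for path in paths:
        (accepted if is_supported_input(path) else rejected).append(path)
    return accepted, rejected
```

With this check, `H:\TEST.HTML` lands in the rejected list, which is the cue to print it to PDF first as suggested above.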
0
Update 21-May-2015: I temporarily removed the source code to make major changes to the program. Regards, Joe

INTRODUCTION

This article presents a solution to a question asked here at Experts Exchange. The situation is that there's a large number of subfolders (400 in the original question), each of which has a number of PDF files (two in the original question). The goal is to combine/merge the PDF files in each subfolder (in ascending date order) into a single PDF file, storing the combined file in each subfolder. The source PDF files in each subfolder may have any file names and the user should be able to specify the file name of the combined file.

REQUIRED SOFTWARE

The method presented in this article requires AutoHotkey, an excellent (free!) programming/scripting language. The quick explanation for installing AutoHotkey is to visit its website. A more comprehensive explanation is to read my EE article, AutoHotkey - Getting Started. After installation, AutoHotkey will own the AHK file type, supporting the solution discussed in the remainder of this article.

The program utilizes another excellent (free!) piece of software — PDF Toolkit (PDFtk). It comes in both command line and GUI versions. The command line version is called PDFtk Server
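As a rough sketch of the task described above, and not the author's AutoHotkey program, the following Python builds one PDFtk command per subfolder. It assumes that the PDFtk Server executable is reachable as `pdftk` on the PATH, that "ascending date order" means ascending modification time, and that the combined file is named `combined.pdf`; the function names are hypothetical. The `cat ... output` form is PDFtk's usual way to concatenate files.

```python
import subprocess
from pathlib import Path
from typing import List


def build_merge_command(folder: Path, output_name: str) -> List[str]:
    """Return a PDFtk command that merges the PDFs in `folder`, oldest first.

    Returns an empty list when the folder has no source PDFs.
    """
    sources = [
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".pdf"
        and p.name.lower() != output_name.lower()  # skip a previous combined file
    ]
    # Assumption: "date order" means modification time; the name breaks ties.
    sources.sort(key=lambda p: (p.stat().st_mtime, p.name))
    if not sources:
        return []
    output = folder / output_name
    return ["pdftk", *[str(p) for p in sources], "cat", "output", str(output)]


def merge_all_subfolders(root: Path, output_name: str = "combined.pdf") -> None:
    """Run one merge per immediate subfolder of `root`."""
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        command = build_merge_command(folder, output_name)
        if command:
            subprocess.run(command, check=True)
```

Building the command separately from running it makes it easy to print and inspect each command before anything is written to disk.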
7

Expert Comment

by:Centex Aps
Hi

Will the "Combine-Merge-PDF-files-20140826.ahk"  file not be attached again?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Centex,
I've decided not to post the full program. I'll be rewriting the article as a "design roadmap" with some crucial code snippets, such as how to call PDFtk Server, but will not be posting the complete source code. Regards, Joe
0
The standard (non-Professional) edition of PaperPort from Nuance Communications (previously known as ScanSoft) is limited to five Scanning Profiles, but in a previous article, I discussed how to overcome this limitation. The technique presented in that article may also be used to address an issue that I've been asked many times by PaperPort users, namely, how to reorder the Scanning Profiles in the Scan or Get Photo pane.

For users with many Scanning Profiles, it is desirable to order the list such that the more frequently used ones are at the top. Unfortunately, PaperPort 12, the previous release, offers no ability to rearrange the order of the Scanning Profiles. PaperPort 14, the current release (yes, Nuance got superstitious and skipped 13), added a little-known, undocumented feature that helps: drag-and-drop of the Scanning Profiles. However, it works poorly, in my opinion, as I find it difficult to drop the profile exactly where I want it.

Also, in both PP12 and PP14, there is no ability to search for a Scanning Profile, so the user must scroll through the list to find the desired profile. This article presents an approach for reordering the list, using the same method presented in my previous article, PaperPort - How To Achieve More Than Five Scanning Profiles in the Standard Edition.

All of the screenshots in this article are from PaperPort Professional 14
1
PaperPort is a popular document imaging/management product from Nuance Communications, previously known as ScanSoft. PaperPort is in widespread use by both individuals and businesses.

The current version of PaperPort is 14. The previous version was 12. Yes, Nuance got superstitious and skipped 13. Both of these most recent versions come in two editions, Professional and Standard, although the Nuance folks do not call it Standard – they simply leave Professional off the name, i.e., PaperPort 12 and PaperPort Professional 12; PaperPort 14 and PaperPort Professional 14. In this article, I refer to them as PP-Std and PP-Pro, and all such references are valid for versions 12 and 14.

There are numerous differences between PP-Std and PP-Pro. The comparison matrices may be seen in the Files section at this PaperPort wiki in these files:

Comparison Matrix of PP12 Standard and PP12 Professional.pdf
Comparison Matrix of PP14 Standard and PP14 Professional.pdf

As shown in the documents above, one of the differences between PP-Std and PP-Pro is that the former allows only five Scanning Profiles to be created, while the latter allows an unlimited number. However, it turns out that PP-Std will properly handle an unlimited number of Scanning Profiles. The problem is that it won't let you create them. This is easy to overcome by creating the file containing the Scanning Profiles outside of PP-Std. This article describes two ways to do it.
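Since the approach involves creating or changing the Scanning Profiles file outside of PaperPort, it may be prudent to keep a backup and to confirm that the edited file still parses. The sketch below assumes the file is XML (a reader comment on this article reports it as Profiles.xml) and makes no assumption about its internal structure. The helper names are hypothetical, and this is not one of the two methods the article describes.

```python
import shutil
import time
import xml.etree.ElementTree as ET
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy `path` to a timestamped sibling file and return the backup's path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.name}.{stamp}.bak")
    shutil.copy2(path, backup)  # copy2 also preserves the file's timestamps
    return backup


def is_well_formed_xml(path: Path) -> bool:
    """Return True when the file parses as XML (no schema checks are made)."""
    try:
        ET.parse(str(path))
    except ET.ParseError:
        return False
    return True
```

A typical sequence would be to call `backup_file` before editing and `is_well_formed_xml` afterwards, restoring the backup if the check fails.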

3

Expert Comment

by:mapline
Hi Joe
Great suggestion. Two comments:
1. My PP 14.5 std stores file in C:\ProgramData\Nuance\PaperPort\14\Profiles.xml (Windows 10)
2. Notepad++ great free app for viewing/editing xml files.
Many thanks
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Michael,
Sorry I'm just replying to your 25-Mar-2016 comment now. I don't recollect seeing it when it first came in and only just now saw it when I received a notification that you endorsed the article today — btw, thanks for that!

> My PP 14.5 std stores file in C:\ProgramData\Nuance\PaperPort\14\Profiles.xml (Windows 10)

You will also find it at C:\Users\All Users\Nuance\PaperPort\14\Profiles.xml in W10. That's because C:\Users\All Users\ points to C:\ProgramData\. In other words, C:\ProgramData\ is the "real" folder and C:\Users\All Users\ is simply a pointer to it — technically known as a junction or symbolic link. So if you look at C:\ProgramData\ and C:\Users\All Users\ in your file manager, they'll show the identical contents, because they are one-and-the-same folder.
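One way to confirm that two paths are one-and-the-same folder is to ask the operating system whether they resolve to the same directory. A minimal Python sketch follows; the function name `same_folder` is hypothetical, and the two Windows paths in the usage note are the ones discussed above.

```python
import os


def same_folder(path_a: str, path_b: str) -> bool:
    """Return True when both paths resolve to the same directory on disk.

    os.path.samefile follows links, so a link and its target compare as equal.
    """
    if not (os.path.isdir(path_a) and os.path.isdir(path_b)):
        return False
    return os.path.samefile(path_a, path_b)
```

On a Windows 10 system, calling `same_folder(r"C:\ProgramData", r"C:\Users\All Users")` should report whether the second path is indeed a pointer to the first.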

> Notepad++ great free app for viewing/editing xml files.

I have Notepad++ installed and agree that it is a great free app, although I use it only for test purposes, since I do all of my text editing with my fav text editor that I've been using forever. But thanks for the tip to our readers! Regards, Joe
0
This article is in response to a question here at Experts Exchange. The Original Poster has a scanned signature and wants to make the background transparent so that the signature may be placed on documents without obliterating the surrounding text. Here's an example of the problem, showing how the surrounding document is overlaid when the non-transparent signature is placed on a PDF (in this case, via the Custom Stamp feature in Adobe Acrobat):

Signing PDF in Acrobat with non-transparent stamp
The solution described in this article requires a product called IrfanView, excellent (and free!) imaging software:
http://www.irfanview.com/

At the URL above, click the Download link on the left to download IrfanView and click the PlugIns link on the left to download the PlugIns, which are needed to give you PDF capability. Installing the PlugIns is optional – required only if you want PDF support (and the other features that come with the PlugIns). Install IrfanView first, then install the PlugIns. Although I recommend adding the PlugIns to get PDF support, that's for general, future usage. For this situation, you don't need them, unless your scanned signature is in a PDF file, in which case you do need them.

Here are the steps for making your signature background transparent after installing IrfanView:

(1) Run IrfanView and open the file that has your scanned signature.

(2) I recommend cropping the signature by dragging the mouse from the upper left to the lower right and selecting the Edit menu, then
15

Expert Comment

by:WeThotUWasAToad
Joe, two questions:

1) What site do you use to download software like IrfanView? The link you provided led to a page with about a dozen download links. The first of those (Download.com) resulted in a pop-up notice saying "iview440_setup.exe is malicious, and Chrome has blocked it". I found one of the sites which did not trigger that notice but still wanted me to accept a bunch of extra stuff. I unchecked all those boxes but it makes me wonder what is downloaded and installed without my knowledge. Is there a site where you can count on getting only what you are after and nothing more?

2) After installing and using the IrfanView app, I closed it. But then when I went back to open it again, it was nowhere to be found in my Start > All Programs list of apps and folders. I was only able to find it by Start > Search but that led to the .exe file and I had to go through the install process all over again. Is there something I could be doing wrong or is that just what's required each time you want to use the software?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
> What site do you use to download software like IrfanView?

Yes, the download page provides many links, but I recommend the TUCOWS link to download IrfanView:
http://www.tucows.com/preview/194967

This will download a single install file called <iviewNNN.exe> or <iviewNNN_setup.exe> with no adware and no junk!

And the TUCOWS link for the PlugIns, which are required for PDF support:
http://www.tucows.com/preview/415586

This will download a single install file called <irfanview_plugins_NNN_setup.exe> with no adware and no junk!

In both cases, NNN is the version number (currently 440, meaning Version 4.40).

Install IrfanView first, then install the PlugIns.

There's also a new 64-bit version (started with the 4.40 release), available here:
http://www.irfanview.com/64bit.htm

The download links for the 64-bit core product and the 64-bit plugins are at the bottom of the page. I have both the 32-bit and 64-bit versions installed on the same W7/64-bit system (in different folders) — no problem.

> Is there something I could be doing wrong or is that just what's required each time you want to use the software?

Perhaps you're not telling it to create shortcuts. Here's what I select in the installer:

[Image: IrfanView install shortcuts]

That gives me an IrfanView program group with shortcuts, as well as a shortcut on the desktop. I keep an ultra-clean desktop, so I move the shortcut into a folder of shortcuts where I have my most frequently used programs. Regards, Joe
0
PaperPort is a popular document management/imaging product from Nuance Communications. It is in widespread use by both individuals and businesses. The current version of PaperPort is 14 (the previous version was 12 – Nuance got superstitious and skipped 13). This Article documents how PP14 finally solved a nasty duplex scanning problem that has plagued PaperPort since the introduction of the "Blank page is job separator" capability in PP10.

The problem is that a blank back side of a page acts as a job separator during a duplex scan. This is extremely bad, since most double-sided documents have some single-sided pages, and each blank back side terminates the document – not what you want! It makes the "Blank page is job separator" capability practically worthless for users doing duplex scanning. In other words, if you are using a duplex scanner and a page in the stack is not blank on the front but is blank on the back, it should not be treated as a separator page. In duplex scanning, a page should be treated as a separator only if it is blank on both sides. Otherwise, a document that has some single-sided and some double-sided pages is broken into separate PaperPort items when it should be a single item.
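The rule described above (in a duplex scan, a sheet is a separator only when both sides are blank) can be sketched in Python. This only illustrates the logic and is not Nuance's implementation; the function name `split_duplex_job` and the representation of each sheet as a `(front_is_blank, back_is_blank)` pair are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation of one scanned sheet: (front_is_blank, back_is_blank)
Sheet = Tuple[bool, bool]

def split_duplex_job(sheets: List[Sheet]) -> List[List[int]]:
    """Group sheet indexes into documents, treating a sheet as a separator
    only when BOTH of its sides are blank."""
    documents: List[List[int]] = []
    current: List[int] = []
    for index, (front_blank, back_blank) in enumerate(sheets):
        if front_blank and back_blank:
            # Separator sheet: close the current document and drop the separator.
            if current:
                documents.append(current)
                current = []
        else:
            # Content on either side keeps the sheet in the current document,
            # even when its back side is blank.
            current.append(index)
    if current:
        documents.append(current)
    return documents
```

With this rule, a single-sided sheet in the middle of a stack stays inside its document, and only a sheet that is blank on both sides starts a new item.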

This "bug" (Nuance called it a "feature" when I reported it) existed in PP10, PP11, and PP12 (as mentioned above, there was no PP13). Nuance finally fixed it in PP14 with the addition of a new sub-option in the Settings for a
2

Expert Comment

by:donjud
Joe,
I realize this is a bit off topic, but you have been so helpful that I wanted to ask you first. I am also new to Experts Exchange and wasn't sure if I should start a new topic.
The folder we have added to PaperPort is on our server so that it can be accessed and modified by multiple users in our office. The problem we have run into is that when someone changes a document or adds a new one and adds it to the All-in-One search index, the new document is still not indexed on the other computers, so each user has to re-index the folder in order to find it using the All-in-One (AIO) search. We have set PaperPort to index every night, but we would like the document to be indexed on all of the computers as soon as the changes are made.
Is there a setting that would automatically index all incoming or modified documents in the PaperPort folder, or is there a way to run PaperPort on the server? We have spent several days trying to come up with a solution to this problem and haven't had any luck.
Thanks,
J.D.
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi J.D.,
I was working on a (lengthy!) reply to your same question in the message system, which I just sent (before seeing this). Regards, Joe
0
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

INTRODUCTION

This Article is a follow-up to the Article entitled How To Rename-Move a Batch of PDF Files Based on Contents of the Files, recently published here at Experts Exchange.

I considered adding the new feature (splitting a single document into multiple documents) to that Article and program, but concluded that it is a significant enough enhancement to warrant a new Article and program.

PREVIOUS ARTICLE

To understand this Article, it will be helpful to read the previous Article, but to get things going here right away, here's a summary of the previous problem and solution.

There is a large batch of PDF files, all with cryptic names, such as [D123456.PDF]. Inside each file on the first line of the first page (always starting at a fixed column and running to the end of the line) is a human-friendly identifier for the file, such as [John Smith]. The requirement is to loop through all of the files in a specified folder in an automated fashion, changing the file names from, for example,

D123456.PDF

to

D123456 John Smith.PDF

That is, add the identifier from the first line of the first page to the file name.
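The renaming rule itself is simple string handling. Here is a minimal Python sketch of it (the original program is written in AutoHotkey and its source is not shown here, so this is only an illustration; the function name `add_identifier` is hypothetical).

```python
from pathlib import PureWindowsPath

def add_identifier(file_name: str, identifier: str) -> str:
    """Insert the identifier between the base name and the extension."""
    path = PureWindowsPath(file_name)
    # stem is the name without the extension; suffix keeps the original ".PDF".
    return f"{path.stem} {identifier.strip()}{path.suffix}"
```

The sketch does not check the identifier for characters that are invalid in Windows file names; a real batch rename would need to handle that case.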

NEW REQUIREMENT

Following publication of the previous Article and the program that implements the solution, the Original Poster (OP) of the question that prompted the Article
7

Expert Comment

by:Member_2_7970298
How do I obtain a copy of the AutoHotkey script?
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi New Member,

When I removed the source code last year from six articles that I published here at EE, my intention was that the removal be temporary. I began a project to rewrite all of the programs in my portfolio in order to generalize them for a broader audience and to have a standard user interface, including both a GUI (graphical user interface) and, where it makes sense, a CLI (command line interface). It wound up being a much larger effort than I anticipated, and I'm still not ready to post or distribute the source code for this program (or any of the other five published at EE — and I don't know when or even if that will be, for a variety of reasons).

I have created customized versions of these various programs for EE members who became clients of mine. I provided licenses for the run-time programs (the executables, i.e., the compiled EXE files) for an agreed-upon fee, but I did not provide the source code. I did this previously when EE had the "Hire Me" button, but that no longer exists. The mechanism now at EE for such work is the new Gigs feature, if that interests you.

Regards, Joe
0
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

A recent question here at Experts Exchange piqued my interest, so I decided to provide a thorough solution and publish this Article about it. The Original Poster (OP) of the question has approximately one thousand PDF files containing 7-character sequential alphanumeric file names (and, of course, all of the file extensions are PDF). Although the OP did not state this, it is likely that the sequential alphanumerics represent unique identifiers for his customers, perhaps customer numbers. The alphanumeric file name is cryptic, in no way identifiable with the customer, so the OP would like the file name to contain the customer name in addition to the number. For example, a file might be named:

D123456.PDF

The OP would like this file to be renamed:

D123456 John Smith.PDF

The customer name always begins in column 16 on the first line of the first page in the PDF file (and runs to the end of the line). The OP wants an automated way to rename the thousand PDF files, based on the customer name in the contents of each file – in essence, a batch/mass rename. The program documented in this Article (and provided in source code) performs this function.
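Once the first page has been converted to plain text, pulling out the customer name is a matter of slicing the first line at a fixed column. The Python sketch below illustrates that step only. It assumes the column number is 1-based and that the text conversion preserves column positions, which depends on how the PDF was converted; the function name `identifier_from_first_line` is hypothetical.

```python
def identifier_from_first_line(page_text: str, start_column: int = 16) -> str:
    """Return the text from start_column (1-based) to the end of the first line."""
    lines = page_text.splitlines()
    first_line = lines[0] if lines else ""
    # Convert the 1-based column to a 0-based index before slicing.
    return first_line[start_column - 1:].strip()
```

If the first line is shorter than the starting column, the function returns an empty string, which a batch job could use as a signal to skip the file.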

Two excellent freeware products are needed for this solution – the AutoHotkey scripting language (the program is written in this) and the Xpdf package to convert the PDF …
7

Expert Comment

by:Member_2_7970298
Joe,
Thanks for your response.
I appreciate your comments & issues.
I would greatly appreciate it if you could see your way clear to send me your original AutoHotKey script.
I'm trying to learn more about AutoHotKey scripts and especially how it interfaces with Xpdf's pdftotext.exe
Thanks
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Member_2_7970298 (???),

I received your email at my personal email address, which I'll respond to in a moment. I already responded to your post at the AHK forum, which led you to this article, and then to my Split-Rename-Move article. Instead of three different communication venues (EE, AHK, email), let's continue this discussion via just email.

That said, a quick message about your comments is that the Tutorials forum and the Scripts and Functions forum at the AHK boards are the way to go "to learn more about AutoHotKey scripts" (as well as the Tutorial at the AHK docs site).

There's not much to learn about "how it interfaces with Xpdf's pdftotext.exe" — the RunWait command is it. Here's an actual call from one of my programs:

RunWait,%pdftotextEXE% -f 1 -l 1 -raw "%FullFileNameCurrent%" "%DestinationFolder%%FileNameCurrentTXT%"
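For readers who don't use AutoHotkey, roughly the same call can be sketched in Python with the standard `subprocess` module. This is an illustration, not a translation of the actual program; the function names are hypothetical, and the paths are whatever the caller supplies. The options are the ones in the line above: `-f 1 -l 1` limits the conversion to the first page, and `-raw` keeps the text in content-stream order.

```python
import subprocess
from typing import List

def pdftotext_first_page_args(pdftotext_exe: str, pdf_path: str, txt_path: str) -> List[str]:
    """Build the argument list: first page only (-f 1 -l 1), raw text order (-raw)."""
    return [pdftotext_exe, "-f", "1", "-l", "1", "-raw", pdf_path, txt_path]

def convert_first_page(pdftotext_exe: str, pdf_path: str, txt_path: str) -> int:
    """Run pdftotext and wait for it to finish, as RunWait does; return its exit code."""
    args = pdftotext_first_page_args(pdftotext_exe, pdf_path, txt_path)
    completed = subprocess.run(args)
    return completed.returncode
```

Passing the arguments as a list avoids having to quote paths that contain spaces, which the AutoHotkey line handles with explicit double quotes.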


I'm sure from the names of the variables you can figure out what that line does. Also, I gave you links at the AHK forum to my two 5-minute EE video Micro Tutorials that should help you with learning about how to use the pdftotext.exe tool:
Xpdf - Command Line Utility for PDF Files
Xpdf - Convert PDF Files to Plain Text Files

If you haven't viewed them yet, I think you'll find them to be a worthwhile expenditure of 10 minutes. Regards, Joe
0
Update 21-May-2015: I temporarily removed the source code and the code snippets to make major changes to the program. Regards, Joe

INTRODUCTION

The inspiration for this Article was a fascinating question here at Experts Exchange on combining TIFF files. Since it is in an area of extreme interest to me (Document Imaging) and since the solution involves two of my all-time favorite freeware products – IrfanView for the TIFF image processing and AutoHotkey for the scripting – I decided to publish the solution as an Article, with a lot more detail put into it than a typical response to a question.

INSTALLATION INSTRUCTIONS

The original poster (OP) of the problem (KHMaddox) said he has no programming experience at all, so I made the solution suitable for such a user. All you have to be capable of doing is download and install the two freeware products, IrfanView and AutoHotkey, and then run the script attached to this Article, as follows:

(1) Install AutoHotkey – http://ahkscript.org (also, see my EE article: AutoHotkey - Getting Started)

Click the Download button at the page above, save the install file, and run it.

(2) Install IrfanView – http://www.irfanview.com/
12

Expert Comment

by:Brandon G
Hello Joe,
This sounds like a very magical solution and exactly the type of resource needed to process the nearly 1 million individual TIF image files that I have, which are named much as your article describes, i.e., "tdr-2772-134446_page_1.tif", "tdr-2772-134446_page_2.tif", "tdr-2772-134446_page_3.tif", and "tdr-2772-134446_page_4.tif". All the files are in a single location, and I am looking to merge the multiple pages into a single multi-page TIF, just as you describe.

Is there any way you can be of assistance, or that I can be a part of the beta testing?

Any assistance is much appreciated. Thank you in advance.
Brandon
0
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Brandon,
Thanks for joining Experts Exchange today, reading my article, and endorsing it. You're right — MergeTIFF does exactly what you want. For example, by specifying 15 as the number of first characters that need to match, it will merge these files...

tdr-2772-134446_page_1.tif
tdr-2772-134446_page_2.tif
tdr-2772-134446_page_3.tif
tdr-2772-134446_page_4.tif

...into this file:

tdr-2772-134446.tif
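The grouping step behind that example (files whose first N characters match belong to the same output file) can be sketched in Python. This is not MergeTIFF's source code, only an illustration of the matching rule; the function name `group_by_prefix` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

def group_by_prefix(file_names: Iterable[str], match_length: int) -> Dict[str, List[str]]:
    """Group file names by their first match_length characters."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(file_names):
        # Files sharing the same leading characters land in the same group.
        groups[name[:match_length]].append(name)
    return dict(groups)
```

Each key would become the name of a merged file (the key plus ".tif"), and each list holds the pages to merge. Note that a plain alphabetical sort puts "page_10" before "page_2", so a real merge would need a numeric-aware page order.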

I'll write you a PM in the EE Message System to discuss this further, as I've done with many EE members for this program and several other programs. I should point out that this is now an acceptable method, compliant with EE's Terms of Use, since EE removed the Gigs product and the Hire Me feature. Regards, Joe
1
This article is about duplex scanning in Nuance's PaperPort software with a hardware-capable duplex scanner. It is not about the Scan Other Side feature in the Capture Assistant that allows a simplex scanner to achieve double-sided scanning. If you are interested in the latter, see my Experts Exchange article, How to Perform Duplex Scanning with a Simplex Scanner in PaperPort Versions 11, 12, 14. But this article is strictly about how to get duplex scanners to work in PaperPort, more specifically, how to achieve automatic/one-click duplex scanning, that is, duplex scanning with neither the Display scanner dialog box nor the Show Capture Assistant box checked. This article applies to the three most recent versions of PaperPort, i.e., 11, 12, and 14 – yes, Nuance got superstitious and did not release a version 13.

The first step in any scanning issue is to download the latest-and-greatest drivers from your scanner manufacturer's website. My experience over the years is that PaperPort is very sensitive to the TWAIN/WIA drivers, and I've seen many scanning problems fixed by installing the latest drivers.

The problem that I'm addressing in this article is the lack of a Duplex ADF choice in the Source field drop-down in the SET tab of a Scanning Profile. For example, here's what it may look like when the existence of a duplex scanner is not recognized:

[Image: Duplex scanner not properly set up]

The first approach to fixing this is to run through Advanced setup
8
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
You're very welcome, Cori, and thanks back at you for joining Experts Exchange today and reading my article. I'm glad to hear that you now have duplex scanning working in PaperPort with your Brother PDS-5000 duplex scanner. If you take a moment to endorse the article by clicking the thumbs-up icon at the end of it (which currently says 6), I'll appreciate it. Welcome to Experts Exchange! Regards, Joe
1
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Cori,
Thanks for endorsing the article — much appreciated! Cheers, Joe
0
Update 13-December-2014: Article Deprecated. The links in this article for the PaperPort 12 upgrades no longer work and Nuance informed me that the links for the PaperPort 11 ones may soon stop working. However, Nuance provided new links for them, as well as links for the latest version, PaperPort 14 (there was no PaperPort 13). The new links are to direct downloads of the upgrades (PP11.2, PP12.1, PP14.5), rather than to a Download Request Form, as with the previous links. In addition, Nuance provided links to a "Remover Tool" for all three versions. Lastly, there are "Standard versus Professional" feature comparison matrices for all three versions (this article shows the one only for PP12). To deal with such a substantial number of changes, I decided to deprecate this article. I also decided that adding PP14 information to both PP11 and PP12 would result in a long, unwieldy article. In addition, a user of one version is not going to be concerned about the other two versions, so I published three separate articles for PP11, PP12, and PP14 users. You will find them here:

PaperPort 11 - Free Upgrade to Version 11.2
PaperPort 12 - Free Upgrade to Version 12.1
PaperPort 14 - Free Upgrade to Version 14.5
1
LVL 61

Author Comment

by:Joe Winograd, Fellow&MVE
Just a quick comment to point out that shortly after this article was published, Nuance did, indeed, release version 14 of PaperPort, confirming their superstitious behavior of skipping 13 (the latest release of PaperPort is version 14.5). Regards, Joe
0
