Document Imaging

Document imaging is an information technology category for systems capable of replicating documents commonly used in business. Document imaging systems can take many forms, including microfilm, on-demand printers, facsimile machines, copiers, multifunction printers, document scanners, computer output microfilm (COM), and archive writers. Document imaging is a form of enterprise content management, built around the need to manage and secure the escalating volume of electronic documents (spreadsheets, word-processing documents, PDFs, e-mails) created in organizations.

Xpdf - PDFtoPS - Command Line Utility to Convert a PDF File to PS (PostScript)
In this tenth video of my Xpdf series, I discuss and demonstrate the PDFtoPS utility, which converts a PDF file to PostScript (PS). It also provides an option for creating an Encapsulated PostScript (EPS) file. It runs from a command line interface, making it suitable for use in programs, scripts, batch files, or any place where a command line call can be made.

1. Download the software


You may have already downloaded the Xpdf tools while watching one of my earlier videos in the series, but there has since been an upgrade from Version 3 to Version 4 and there is a new download site:

https://www.xpdfreader.com/download.html

Visit that site and download the pre-compiled Windows binary ZIP archive, then unzip it.

2. Locate the documentation folder for the Xpdf utilities


Go to the folder where you unzipped the downloaded ZIP file and find the doc folder.

3. Read the documentation for the PDFtoPS tool


Go into the doc folder and find the plain text file called pdftops.txt.

Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFtoPS tool.

4. Set up a test folder


Create a test folder.

Copy pdftops.exe from the unzipped bin32 folder into your test folder.

Copy a sample PDF file into your test folder.

5. Set up a command prompt for testing


Open a command prompt window.

Navigate to your test folder.

Issue a DIR command in the command prompt to be sure that only two files are in it - the PDFtoPS executable and the sample PDF file.

6. Run the PDFtoPS utility to create the PostScript file


Issue the following command in the command prompt:

pdftops TestFileName.pdf
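
Because the tool is driven entirely from the command line, the same call can be made from a script. The following is a minimal Python sketch, not part of the video: the function names are hypothetical, and it assumes pdftops is on the PATH (or in the current folder) and that the -eps, -f, and -l options behave as described in pdftops.txt. Check that file before relying on any flag.

```python
import subprocess


def build_pdftops_command(pdf_path, ps_path=None, eps=False,
                          first_page=None, last_page=None, exe="pdftops"):
    """Build the argument list for a pdftops call (nothing is executed here).

    The flag names (-eps, -f, -l) are taken from the Xpdf documentation;
    the function and parameter names are illustrative assumptions.
    """
    cmd = [exe]
    if eps:
        cmd.append("-eps")                    # request Encapsulated PostScript
    if first_page is not None:
        cmd += ["-f", str(first_page)]        # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]         # last page to convert
    cmd.append(str(pdf_path))
    if ps_path is not None:
        cmd.append(str(ps_path))              # optional explicit output name
    return cmd


def convert_pdf_to_ps(pdf_path, **options):
    """Run pdftops and return the CompletedProcess so the caller can
    inspect returncode and stderr."""
    cmd = build_pdftops_command(pdf_path, **options)
    return subprocess.run(cmd, capture_output=True, text=True)


# Example (requires pdftops to be installed):
# result = convert_pdf_to_ps("TestFileName.pdf")
# print(result.returncode, result.stderr)
```

Keeping the command construction separate from the call makes it easy to print or log the exact command line before running it.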

Expert Comment

by:Andrew Leniart
Great video and introduction to a very useful tool indeed.

Author Comment

by:Joe Winograd, Fellow&MVE
Thank you, Andrew, I appreciate the compliment and the endorsement. Happy New Year! Regards, Joe

Xpdf - PDFtoPPM - Command Line Utility to Convert a PDF File to PPM, PGM, PBM
In this ninth video of my Xpdf series, I discuss and demonstrate the PDFtoPPM tool, which converts a PDF file to color portable pixmap (PPM) format, grayscale portable graymap (PGM) format, or monochrome (black & white) portable bitmap (PBM) format. It creates a separate image file for each page of the PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any place where a command line call can be made.

1. Download the software


You may have already downloaded the Xpdf tools while watching one of my earlier videos in the series, but there has since been an upgrade from Version 3 to Version 4 and there is a new download site:

https://www.xpdfreader.com/download.html

Visit that site and download the pre-compiled Windows binary ZIP archive, then unzip it.

2. Locate the documentation folder for the Xpdf utilities


Go to the folder where you unzipped the downloaded ZIP file and find the doc folder.

3. Read the documentation for the PDFtoPPM tool


Go into the doc folder and find the plain text file called pdftoppm.txt.

Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFtoPPM tool.

4. Set up a test folder


Create a test folder.

Copy pdftoppm.exe from the unzipped bin32 folder into your test folder.

Copy a sample PDF file into your test folder, preferably one with numerous pages.

5. Set up a command prompt for testing


Open a command prompt window.

Navigate to your test folder.

Issue a DIR command in the command prompt to be sure that only two files are in it - the PDFtoPPM executable and the sample PDF file.
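
The excerpt above ends before the conversion step itself. As a rough sketch only, and assuming the `pdftoppm [options] PDF-file PPM-root` form and the -gray, -mono, and -r options described in pdftoppm.txt, a scripted call could be assembled like this in Python. The function name, the parameter names, and the "page" root are hypothetical.

```python
def build_pdftoppm_command(pdf_path, image_root, mode="color", dpi=None,
                           exe="pdftoppm"):
    """Build the argument list for a pdftoppm call (nothing is executed here).

    mode selects the output format: "color" (PPM, no flag), "gray" (PGM)
    or "mono" (PBM). image_root is the prefix pdftoppm uses when naming
    the per-page image files.
    """
    mode_flags = {"color": [], "gray": ["-gray"], "mono": ["-mono"]}
    if mode not in mode_flags:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = [exe] + mode_flags[mode]
    if dpi is not None:
        cmd += ["-r", str(dpi)]               # resolution in DPI
    cmd += [str(pdf_path), str(image_root)]
    return cmd


# Example (pass the list to subprocess.run once pdftoppm is installed):
# build_pdftoppm_command("TestFileName.pdf", "page", mode="gray", dpi=300)
```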

Xpdf - PDFtoHTML - Command Line Utility to Convert a PDF File to HTML
In this eighth video of my Xpdf series, I discuss and demonstrate the PDFtoHTML utility, which, exactly as its name says, converts a PDF file to HTML. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any place where a command line call can be made.

1. Download the software


You may have already downloaded the Xpdf tools while watching one of my earlier videos in the series, but there has since been an upgrade from Version 3 to Version 4 and there is a new download site:

https://www.xpdfreader.com/download.html

Visit that site and download the pre-compiled Windows binary ZIP archive, then unzip it.

2. Locate the documentation folder for the Xpdf utilities


Go to the folder where you unzipped the downloaded ZIP file and find the doc folder.

3. Read the documentation for the PDFtoHTML tool


Go into the doc folder and find the pdftohtml.txt file.

It is a plain text file. Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFtoHTML tool.

4. Set up a test folder


Create a test folder.

Copy pdftohtml.exe from the unzipped bin32 folder into your test folder.

Copy a sample PDF file into your test folder, preferably one with numerous pages.

5. Set up a command prompt for testing


Open a command prompt window.

Navigate to your test folder.

Issue a DIR command in the command prompt to be sure that only two files are in it - the PDFtoHTML executable and the sample PDF file.

6. Run the PDFtoHTML utility


Issue the following command in the command prompt:

pdftohtml TestFileName.pdf HTMLfolder
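
The same command can be generated for every PDF in a folder. This is a minimal Python sketch under stated assumptions: the function name and the "_html" folder-naming rule are hypothetical, and the sketch only builds the argument lists so they can be reviewed before anything is run.

```python
from pathlib import Path


def plan_html_conversions(folder, exe="pdftohtml"):
    """Return one pdftohtml argument list per PDF found in `folder`.

    Each PDF gets its own output folder next to it, named after the PDF
    with "_html" appended (an illustrative convention, not an Xpdf rule).
    """
    commands = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        html_dir = pdf.with_name(pdf.stem + "_html")
        commands.append([exe, str(pdf), str(html_dir)])
    return commands


# Example (each list can be passed to subprocess.run):
# for cmd in plan_html_conversions("."):
#     print(" ".join(cmd))
```

Giving each PDF a separate output folder keeps the generated pages for different documents from mixing.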
2
xpdfrc - Configuration File for All Xpdf Utilities
This is the eleventh and final video of my Experts Exchange Micro Tutorials on the Xpdf utilities. The first video is an overview of the command line tools. The next nine videos are tutorials on each of them:

PDFimages - Extract Images from PDF Files
PDFtoText - Convert PDF Files to Plain Text Files
PDFinfo - Retrieve Page Count and Other Information from PDF Files
PDFdetach - Detach Attachments from PDF Files
PDFtoPNG - Convert a Multi-page PDF File into Separate PNG Files
PDFfonts - List Fonts Used in a PDF File
PDFtoHTML - Convert a PDF File to HTML
PDFtoPPM - Convert a PDF File to PPM, PGM, PBM
PDFtoPS - Convert a PDF File to PS (PostScript)

This last video in the series discusses xpdfrc, which is the single configuration file that Xpdf uses for all nine utilities. It provides an enormous number of options, allowing extensive control of the tools, such as character mapping, font configuration, PostScript control, rasterizer settings, text control, and much more.

1. Download the software and fonts


You may have already downloaded the Xpdf tools while watching one of my earlier videos in the series, but there has since been an upgrade from Version 3 to Version 4 and there is a new download site:

https://www.xpdfreader.com/download.html

Visit that site and download the pre-compiled Windows binary ZIP archive, then unzip it.

Download the Symbol and Zapf Dingbats fonts from the same page.

2. Locate the documentation folder for the Xpdf utilities


Go to the folder where you unzipped the downloaded ZIP file and find the doc folder.
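
To make the idea of one shared configuration file concrete, here is a small Python sketch that writes a test xpdfrc and adds it to a tool's command line. It assumes the -cfg option and the textEncoding, textEOL, and psPaperSize settings as described in the Xpdf documentation; the helper names and the chosen values are hypothetical, so compare them with the documentation in the doc folder before use.

```python
def render_xpdfrc(settings):
    """Render a dict of setting -> value as xpdfrc text, one setting per line."""
    lines = ["# Hypothetical xpdfrc generated for testing"]
    for key, value in settings.items():
        lines.append(f"{key} {value}")
    return "\n".join(lines) + "\n"


def with_config(cmd, cfg_path):
    """Return a copy of `cmd` with "-cfg <file>" inserted after the executable."""
    return [cmd[0], "-cfg", str(cfg_path)] + list(cmd[1:])


# Example values only; see the xpdfrc documentation for the full list.
example_settings = {
    "textEncoding": "UTF-8",   # encoding for text output
    "textEOL": "unix",         # end-of-line convention for text output
    "psPaperSize": "A4",       # paper size for PostScript output
}

# Example:
# with open("xpdfrc", "w") as f:
#     f.write(render_xpdfrc(example_settings))
# with_config(["pdftotext", "TestFileName.pdf"], "xpdfrc")
```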

I have an expense form created in DocuWare. The form has 10 line items (expenses), each with a total amount. The bottom of the form has a report total; however, there is no way in the form designer to create a total field. Is there a way (even on the back end) to total the line items and present the result in an identified field?
0
Image Magick C# Library throwing exception

I suspect this is an easy one for you to help me solve.

I am trying to run a Visual Studio project that works on my friend's Windows PC but throws a path/library exception in my Visual Studio Community 2015, where Windows is running on my Mac via Parallels.

I verified that the file exists, but then I get the following exception:

Message = "PDFDelegateFailed `The system cannot find the file specified.\r\n' @ error/pdf.c/ReadPDFImage/793"

Exception
and here is the code that throws it:

Code that throws exception
0
How to reduce the file size of a PDF
0
Microsoft Office Picture Manager
Microsoft Office Picture Manager was included in Office 2003, 2007, and 2010, but not in 2013 or 2016. Now that Office 2019 is here, the bad news is that it is still missing, but the good news is that the same no-cost method that works to install it with Office 2013 and 2016 also works with 2019.
1
How to put a date-time stamp on a PDF file with free software - Foxit Reader
I previously published an Experts Exchange video Micro Tutorial that describes how to scan documents to a PDF file using an excellent, free product called Foxit Reader:

How to scan to a PDF file with free software

N.B.: As with any "free" software, there may be restrictions, which are always specified in the software's licensing agreement, typically known as the End-User License Agreement (EULA). I encourage you to read the entire EULA of this product to be certain that you are in license compliance.

This new video Micro Tutorial shows where to download the free Foxit Reader and explains how to use it to place a date-time stamp on a PDF file.

1. Download and Install the Free Version of Foxit Reader


Visit the website for Foxit Reader at Foxit Software:

https://www.foxitsoftware.com/pdf-reader/

Select the language and O.S. from the drop-downs, then click the big, red Download button.

After downloading, run the installer.

2. Run Foxit Reader


The installer creates a Foxit Reader program group with a shortcut to the Foxit Reader program.

Click the shortcut to run Foxit Reader.

3. Put a pre-defined date-time stamp on the PDF


After opening a PDF file, click the Comment menu.

Click the drop-down on the Stamp button.

Click one of the five pre-defined Dynamic Stamps, all of which have a date-time stamp.

Position the mouse wherever you want the stamp and click to place it.

4. Create a custom date-time stamp


Click the drop-down on the Create button.

Click Create Custom Dynamic Stamp.

Select a Stamp Template and fill in the options in the dialog box.

Click Add.

Click OK.

Expert Comment

by:Andrew Leniart
Another great "Winograd Micro Tutorial" :)

Good stuff Joe, should be highly useful to point askers to.

Endorsed!

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Andrew,
Thank you for the kind words and the endorsement — I really appreciate both! Regards, Joe
0
In PaperPort 14.5, I drag a file to the Outlook icon in the Send To bar. It starts working but ends with the message: 'The command line argument is not valid. Verify the switch you are using.' So no email is created. I am on Outlook 2010 (on Windows 10). Any ideas?

I have used PaperPort 14.5 for many years, including on Windows 10. I cannot tell exactly since when, but I cannot get it running anymore. A splash screen appears for a few seconds and then disappears. That's it. I have applied the patch and tried adjusting the compatibility settings, but all to no avail. I have read the solutions in an earlier question (*) on this, but that user had just upgraded, whereas I had it working on Windows 10.
Then I downloaded the 633 MB professional update. Installing it gave me a few messages concerning the serial number, but even after it finished, the program wouldn't start. I am now going to restart the computer and see if that helps, then apply the patch again.

(*) https://www.experts-exchange.com/questions/28955421/I-have-PaperPort-14-Pro-just-upgraded-to-W10-Paperport-won't-run-followed-your-link-to-install-14-5-merely-reinstalled-14-how-do-I-get-Paperport-to-run-on-Windows-10.html
0
TWAIN-compatible home office MFPs. We are testing the PaperPort software with a Brother MFC-9340CDW. The reason is that the Canon MX series scanner software had been acting a bit wonky over the years and the printer is five years old. We actually really like the MP Navigator software but, as discussed in our previous thread, MP Navigator is no longer being made for the later MX series. Thus, we were turned on to PaperPort document management software, which came with the Brother MFC-9340CDW. I don't think it works with network shares unless I upgrade, but I digress. Anyway, my question is: could I have avoided purchasing new hardware and just bought better document management software? The reason I ask is that PaperPort prompted me to connect with the Canon hardware using TWAIN. Did I wrongly assume that most $100 or $200 MFPs weren't TWAIN compatible? A Google query for "Canon MX TWAIN" shows nothing of the sort. The better question is: how common is TWAIN compatibility (is there a list?), so that I can push document management software on to some of my other clients? They too have horrible document management software.
0
Need reasonably priced, TWAIN-compatible document scanning software for a home office. MP Navigator suited me well for years, but it is no longer distributed with the Canon MX series MFPs. What's a really intuitive scanning program that initiates the scan from the ADF or flatbed, makes simple adjustments for single- or double-sided series of documents, and gives you options for a couple of standard formats like PDF or TIFF? Is there really intuitive and elegantly done software that does all that and also organizes your scanned documents well?
0
How to scan to a PDF file with free software - Foxit Reader
I've published three five-minute Experts Exchange video Micro Tutorials that describe terrific features in an excellent, free PDF product called PDF-XChange Editor:

How to rotate pages in a PDF with free software
How to OCR pages in a PDF with free software
How to password-protect a PDF with free software

PDF-XChange Editor has many other features in its free version, but, unfortunately, it cannot do scanning — you must purchase one of its non-free versions to get scanning functionality. Fortunately, there's another excellent, free PDF product that can perform scanning — Foxit Reader. However, the free Foxit Reader cannot do OCR, so you'll want to keep the free PDF-XChange Editor for its OCR capability, and add Foxit Reader for its scanning capability. The combination of the two products will allow you to create searchable PDFs (aka PDF Searchable Image files) with your scanner, utilizing free software.

N.B.: As with any "free" software, there may be restrictions, which are always specified in the software's licensing agreement, typically known as the End-User License Agreement (EULA). I encourage you to read the entire EULA of these products to be certain that you are in license compliance.

In order to scan, Foxit Reader requires an …

Expert Comment

by:Basem Khawaja
Joe, if I said you are a genius, it would be an understatement. God bless you, my friend :)

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Basem,
Thank you for the kind words and the video endorsement...both very much appreciated! Regards, Joe
0
Hello All,

Does anyone know of software that would achieve the following?

I need software that lets users upload pictures via my website and saves those pictures locally inside my network. The idea is for certain users to upload a picture, get prompted to name that picture, and save it to either my system or the software's own system. Then I will build into my internal applications a way to grab that picture by querying its name. It is sort of like a document imaging system, I guess, but with actual pictures and a web interface for the user to upload the picture rather than an internal scanner.

Responses are much appreciated.
0
Baby steps with PDFtoText for OCR

What steps are the first for me to take as I create a proof of concept that will be:

- a C# Winforms program
- uses the PDFtoText library for OCR

Are there any demo programs I can review? Should I just dive in?

Thanks
0
PaperPort installer detected previous installation
You did a proper uninstallation of PaperPort. You even ran the official PP14 Remover Tool. But when you try to reinstall PaperPort, you get the dialog box above, which you can't get past. There is simply no way to install PaperPort! This article presents a solution that has worked for many PP users.
0
Hello Randal,

Thanks for the email, however I don’t think I have joined your service.  An article by your Joe Winograd came up when I was searching for how to fix an issue with my scanner & Paperport 14.  Unfortunately, having followed his various suggestions I am no further along, other than now I have uninstalled the old Paperport so may be royally in a pickle.

I am not an IT professional so not eligible for membership of your community.  I am just someone who uses a PC and wishes we could go back to the olden days when you installed software from discs and the PC just happily went along until you decided to do something to it.  Nowadays I feel like I would be justified sending a monthly invoice to Microsoft for all the work they make me do trying to figure out how to sort the issues caused by their latest ‘helpful’ update.

Please give my best to Mr Winograd. In case he’s still interested in Windows 10 and Paperport, here’s what happens when I try to install 14.5 from the link provided in his article:

This runs:

 

Then I’m asked to select my language and this pops up (normally in English – must have accidentally clicked Deutsche this time!):

 

I’ve done a Windows uninstall and the special Nuance uninstall, rebooted between each stage, and it is still the same as above.

I did also download the patch (Patch 1) which he suggested but guess there’s not a lot I can do with the patch if I can’t get the .exe to run.

I am now going away to have a …
4
PaperPort Splash Screen
Sometimes PaperPort will not even open. It displays the splash screen (above) and exits, or it may show an "Application Crash" dialog before exiting (sometimes with a dump, sometimes not). There are many reasons for this problem. This article discusses several of them and offers possible solutions.

Hi,
I am looking for a solution using ImageMagick to combine multiple TIFF files into a single TIFF file in a directory that has subfolders.

Let me explain.

There are multiple subfolders that contain TIFF images. In a particular folder, there could be two or more images that need to be combined into one image.
I tried the basic concept and it works:

convert 0003_404.tif 0030.tif 1_merged.tif

(Here I changed to the directory and ran the command, which combines two TIFF images into a single TIFF.)

I was also trying to use the command below, but I'm not sure how good it is:

for %f in (%cd% D:\subfolders\) do (convert 0007.tif 0011_623.tif +append 1_merged.tif)
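
One way to approach the per-subfolder merge is to let a script walk the directory tree and build one convert command for each folder that holds two or more TIFF files. The Python sketch below is an illustration under assumptions, not a tested solution for this setup: the function name and the 1_merged.tif output name are hypothetical, and on newer ImageMagick installs the executable may be `magick` rather than `convert`. It deliberately omits +append, since that option joins the images side by side into one picture, whereas listing the inputs without it produces a multi-page TIFF.

```python
import os


def plan_tiff_merges(root, output_name="1_merged.tif", exe="convert"):
    """Return one ImageMagick argument list per folder under `root`
    that contains at least two TIFF files (nothing is executed here)."""
    commands = []
    for folder, _subdirs, files in os.walk(root):
        tiffs = sorted(
            f for f in files
            if f.lower().endswith((".tif", ".tiff")) and f != output_name
        )
        if len(tiffs) < 2:
            continue                          # nothing to combine here
        inputs = [os.path.join(folder, f) for f in tiffs]
        commands.append([exe] + inputs + [os.path.join(folder, output_name)])
    return commands


# Example (run each command with subprocess.run once reviewed):
# for cmd in plan_tiff_merges(r"D:\subfolders"):
#     print(cmd)
```

Excluding the output name from the inputs means a second run does not fold an earlier merged file back into itself.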
0
PaperPort Splash Screen
Sometimes PaperPort will not even open. It displays the splash screen (above) and exits, or it may show an "Application Crash" dialog before exiting. There are many reasons for this, but a recent cause that has reached epidemic levels is due to an issue with Firefox. This article offers a solution.

Expert Comment

by:lenritz
I had the problem with the splash screen appearing and then exiting. I edited the registry to change it from firefox to explorer. Voila! Opened fine. Closed PP, opened it again, all good. That was yesterday. Today, though, PP again would exit after the splash screen. When I checked the registry, I saw that it had changed back to firefox. So I edited the registry, again changing it to explorer. And PP now again opens without trouble. So, am I going to have to edit the registry every time? Or is there a way to make the edits "stick"? I am using Firefox 63.0.1.

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Len,
I haven't had to fix it again since I first fixed it. I recently upgraded to Firefox 63.0.1 (64-bit) and PaperPort still opened fine. I did not even look at the registry entry since PaperPort was still working. But I just looked at the registry to be certain and its value is still "xxx" (which is what I set it to). So, I don't know why yours would keep changing...I don't have a clue what is changing it. If you can't figure it out, I would put the .REG file on your desktop or in your Start menu or pin it to the taskbar so that you have very fast access to it and can fix the problem quickly with a single or double mouse click. I'd like to provide you with better help, but I don't know why it's happening. Regards, Joe

P.S. Thanks for endorsing the article...much appreciated!
0
PaperPort loses connection to scanning device.
0
0
Standalone open source or commercial software that uses Google OCR.

Assume I have bought valid Google Vision API credentials. I would like to know whether any standalone open source or commercial client is available that is already integrated with the Google Vision API and has other features as well.

Basically, I want to convert images to text (bulk conversion, etc.) via an application.

Thanks.
0
I have a scanner script that I got from here:

http://beaukey.blogspot.com/2015/09/document-scan-with-vbsscript-and-wia.html

OPTION EXPLICIT

'--- dim objects...
dim wsh, fso, objWIAdialog, objImage, imgFilename
set wsh=CreateObject("wscript.shell")
set fso=CreateObject("scripting.filesystemobject")
set objWIAdialog = CreateObject("WIA.CommonDialog") 

'--- Start the Scanner dialog box, where a scanner can be selected...
set objImage = objWIADialog.ShowAcquireImage

'--- Save and show the scan if the scan was successful...
If Not objImage Is Nothing Then 
    Randomize
    imgFilename=fso.GetSpecialFolder(2) & "\Scan2BMP" &  Int((999999 - 100000 + 1) * Rnd + 100000) & ".jpg" 
    wscript.echo "Scan stored as BMP in file: " & imgFilename
    objImage.SaveFile imgFilename 
    wsh.run(imgFilename)
End if 


It works fine. I would like to tweak it so that after it scans, it opens the scan in Adobe (I have the full version of Adobe CS) instead of the Windows image viewer. I also don't want to save the file, just have it scan and open in Adobe. Is that possible?