
OCR

554

Solutions

1K

Contributors

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper data records, including passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static-data, or any suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.


How to put a date-time stamp on a PDF file with free software - Foxit Reader
I previously published an Experts Exchange video Micro Tutorial that describes how to scan documents to a PDF file using an excellent, free product called Foxit Reader:

How to scan to a PDF file with free software

N.B.: As with any "free" software, there may be restrictions, which are always specified in the software's licensing agreement, typically known as the End-User License Agreement (EULA). I encourage you to read the entire EULA of this product to be certain that you are in license compliance.

This new video Micro Tutorial shows where to download the free Foxit Reader and explains how to use it to place a date-time stamp on a PDF file.

1. Download and Install the Free Version of Foxit Reader


Visit the website for Foxit Reader at Foxit Software:

https://www.foxitsoftware.com/pdf-reader/

Select the language and O.S. from the drop-downs, then click the big, red Download button:

After downloading, run the installer.

step1

2. Run Foxit Reader


The installer creates a Foxit Reader program group with a shortcut to the Foxit Reader program.

Click the shortcut to run Foxit Reader.

step2

3. Put a pre-defined date-time stamp on the PDF


After opening a PDF file, click the Comment menu.

Click the drop-down on the Stamp button.

Click one of the five pre-defined Dynamic Stamps, all of which have a date-time stamp.

Position the mouse wherever you want the stamp and click to place it.

step3

4. Create a custom date-time stamp


Click the drop-down on the Create button.

Click Create Custom Dynamic Stamp.

Select a Stamp Template and fill in the options in the dialog box.

Click Add.

Click OK.

step4
2
LVL 24

Expert Comment

by:Andrew Leniart
Another great "Winograd Micro Tutorial" :)

Good stuff Joe, should be highly useful to point askers to.

Endorsed!
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Andrew,
Thank you for the kind words and the endorsement — I really appreciate both! Regards, Joe
0

How to scan to a PDF file with free software - Foxit Reader
I've published three five-minute Experts Exchange video Micro Tutorials that describe terrific features in an excellent, free PDF product called PDF-XChange Editor:

How to rotate pages in a PDF with free software
How to OCR pages in a PDF with free software
How to password-protect a PDF with free software

PDF-XChange Editor has many other features in its free version, but, unfortunately, it cannot do scanning — you must purchase one of its non-free versions to get scanning functionality. Fortunately, there's another excellent, free PDF product that can perform scanning — Foxit Reader. However, the free Foxit Reader cannot do OCR, so you'll want to keep the free PDF-XChange Editor for its OCR capability, and add Foxit Reader for its scanning capability. The combination of the two products will allow you to create searchable PDFs (aka PDF Searchable Image files) with your scanner, utilizing free software.

N.B.: As with any "free" software, there may be restrictions, which are always specified in the software's licensing agreement, typically known as the End-User License Agreement (EULA). I encourage you to read the entire EULA of these products to be certain that you are in license compliance.

In order to scan, Foxit Reader requires an …
3

Expert Comment

by:Basem Khawaja
Joe, if I said you are a genius, it would be an understatement. God bless you, my friend :)
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Basem,
Thank you for the kind words and the video endorsement...both very much appreciated! Regards, Joe
0
What to do when PaperPort crashes, hangs, or fails to start - delete metadata with CheckPPFolders
If you are (or ever were) a Mozilla Firefox user, I suggest that you immediately head over to this Experts Exchange article:

What to do when PaperPort crashes, hangs, or fails to start - popular fix for Mozilla Firefox users

The problem discussed in that article reached epidemic proportions in July 2018. The solution proposed there is very likely to solve your problem, but if it doesn't, come back here to try the idea in this video.

Please read the paragraph below before following the instructions in the video — there are important caveats in the paragraph that I did not mention in the video.

If your PaperPort 12 or PaperPort 14 is failing to start, or crashing, or hanging, it may be because of corrupt metadata (likely) or corrupt data files, such as bad PDFs (much less likely, but possible). This video Micro Tutorial shows how to use a utility called CheckPPFolders that ships with all releases of PaperPort 12 and PaperPort 14. CheckPPFolders is able to remove all PaperPort metadata, as well as identify problem files that may be causing PaperPort to crash, hang, or fail to start. PaperPort will rebuild the metadata, but there are two caveats. First, Folder Color and Folder Notes are in the MaxDesk.ini files, so you will lose those — and there's no easy way to retain the colors and notes. Thus, if you make heavy use of Folder Color and …
4
How to password-protect a PDF with free software - PDF-XChange Editor
This video Micro Tutorial shows how to password-protect PDF files with free software. Many software products can do this, such as Adobe Acrobat (but not Adobe Reader), Nuance PaperPort, and Nuance Power PDF, but they are not free products. This video explains how to do it with excellent, free software called PDF-XChange Editor from Tracker Software Products.

1. Download PDF-XChange Editor


Visit the PDF-XChange Editor section of the Tracker Software Products website:

http://www.tracker-software.com/product/pdf-xchange-editor

Click the white-on-green Download button for either product. It doesn't matter if you download PDF-XChange Editor or PDF-XChange Editor Plus, since you'll be selecting the Free Version when you install.

Step1

2. Run downloaded installer


Run the downloaded installer and select Free Version (unless, of course, you want more features and decide to purchase the Pro or Plus Version).

Step2

3. Open a non-secured PDF file in PDF-XChange Editor


Run PDF-XChange Editor and open a PDF file that does not currently have password protection on it.

Step3

4. Open Security section of Document Properties


Click File menu.

Click Document Properties.

Click Security category.

Step4

5. Open Password Security Settings dialog


Click Security Method drop-down.

Click Password Security.

Step5

6. Fill in Password Security Settings dialog


In Options section, select Compatibility from the drop-down and what you want encrypted via the radio buttons.

In Document Passwords section, enter password to open PDF and password to change permission settings.

In Permissions section, set Printing Allowed and Changing Allowed choices via the drop-downs; enable/disable content copying and
3

Expert Comment

by:Basem Khawaja
Genius man.
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Basem,
Very nice of you to say that! Thanks for endorsing the video. Regards, Joe
0
How to add page numbers to a PDF with Adobe Acrobat XI Pro
In a recent question here at Experts Exchange, a member asked how to add page numbers to a PDF file using Adobe Acrobat XI Pro. This short video Micro Tutorial shows how to do it.

1. Click the Tools button


That will expose the Tools pane.

Step1

2. Click the Pages arrow


That will expand the Pages section.

Step2

3. Click the Header & Footer drop-down


That will show three menu choices.

Step3

4. Click the Add Header & Footer... menu item


You will now have the Add Header and Footer dialog.

Step4

5. Select the Page Number format


Click the Page Number and Date Format... link.

Step5

6. Select the font for the page number



Step6

7. Set other options


There are several other features in the dialog, including Appearance Options, Margin sizes, and Page Range Options.

8. Select the location for the page number


Click in one of these six boxes: Left Header Text, Center Header Text, Right Header Text, Left Footer Text, Center Footer Text, Right Footer Text.

Step8

9. Add the page numbers


Click the Insert Page Number button and then click OK. Note that it's also possible to insert a Date (and format it, too).

Step9
That's it! You now have page numbers in your PDF file. Remember to Save the file or do a Save As if you don't want to overwrite the original PDF.

If you find this video to be helpful, please click the thumbs-up icon below. Thank you for watching!
2
LVL 19

Administrative Comment

by:Kyle Santos
Congratulations.  Your video has been Accepted and is now published on Experts Exchange.  Feel free to share this video by selecting the social sharing icons to your left.
0
Xpdf - PDFfonts - Command Line Utility to List Fonts Used in a PDF File
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. In addition to the name of the font, it shows the font type and whether or not the font is embedded in the PDF file (and, if embedded, whether or not it is a subset), along with other font information that is discussed in the documentation file. It does this via a command line interface, making it suitable for use in batch files, programs, and scripts — any place where a command line call can be made.
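Because PDFfonts is a command line tool, it can also be driven from a script. The following is a minimal Python sketch, not the method shown in the video. It assumes that `pdffonts` (or `pdffonts.exe`) can be found on the PATH or in the current folder, and that the output starts with a header row followed by a row of dashes, with one line per font after that; check pdffonts.txt for the exact layout in your version. The function names are illustrative only.

```python
import subprocess


def build_pdffonts_command(pdf_path, exe="pdffonts"):
    """Return the argument list for a basic pdffonts call (no options)."""
    return [exe, pdf_path]


def font_rows(output):
    """Return the per-font lines, skipping everything up to the dashed separator row.

    Assumption: the header is followed by a line that starts with dashes.
    """
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("---"):
            return [row for row in lines[i + 1:] if row.strip()]
    return []  # no separator row found, so no font lines can be identified


def list_fonts(pdf_path, exe="pdffonts"):
    """Run pdffonts on one PDF and return its per-font output lines."""
    result = subprocess.run(
        build_pdffonts_command(pdf_path, exe),
        capture_output=True,
        text=True,
        check=True,  # raise if pdffonts reports an error
    )
    return font_rows(result.stdout)
```

Calling `list_fonts("sample.pdf")` in the test folder would then return one string per font, which a script can inspect further (for example, to look for fonts that are not embedded).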

1. Download the software


You may have already downloaded and unzipped the Xpdf tools while watching the first video in the Xpdf series, but if you haven't, then visit the Xpdf website. Click the Download link and then click the pre-compiled Windows binary ZIP archive to download the utilities for Windows.

Step1

2. Locate the documentation folder for the Xpdf utilities


Go to the folder where you unzipped the downloaded ZIP file and find the doc folder.

Step2

3. Read the documentation for the PDFfonts tool


Go into the doc folder and find the plain text file called pdffonts.txt.

Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFfonts tool.

Step3

4. Set up a test folder


Create a test folder.

Copy pdffonts.exe from the unzipped bin32 folder into your test folder.

Copy a couple of sample PDF files into your test folder, preferably ones with many different fonts.

Step4

5. Set up a command prompt for testing

2
LVL 19

Administrative Comment

by:Kyle Santos
Congratulations!  Your video has been Accepted and is now published on Experts Exchange.  Thank you for your contributions.
1
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Kyle,
Thanks for publishing and upvoting — both appreciated! Regards, Joe
1
How to create custom scanning profiles in PaperPort - Part 2
This video Micro Tutorial is the second in a two-part series that shows how to create and use custom scanning profiles in Nuance's PaperPort 14.5. But the ability to create custom scanning profiles also exists in PaperPort going back many years, so if you have an older version, such as PaperPort 11 or PaperPort 12, these videos will still be applicable for you. The first video tutorial shows how to create custom scanning profiles and reviews all the Scanner Enhancement Technology (SET) features, such as auto-straighten, delete blank pages, remove punch holes, etc. It also discusses scanning options, including Mode (B&W, Grayscale, Color), Resolution (100 DPI, 200 DPI, 300 DPI, etc.), and Size (Letter, Legal, A4, etc.). This second tutorial shows how to set the output file type for your scans, such as scanning directly to a PDF Searchable Image file, an Excel spreadsheet, or a Word document — all with text created by an automatic OCR process.

1. Run PaperPort and open the 'Output' tab of the scanning profile created in Part 1


Run PaperPort.

Click the Scan Settings button on the ribbon.

This will bring up the Scan or Get Photo pane.

Select the custom scanning profile that you created during Part 1 of this video tutorial series.

Click the Settings button.

Click the Output tab.

Step1

2. Test scanning to a PDF Image file


Click the drop-down arrow on the File type field.

Select PDF Image and click OK.

Put a document in your scanner and click the Scan button. You will now have a PDF Image
1
How to create custom scanning profiles in PaperPort - Part 1
This video Micro Tutorial is the first in a two-part series that shows how to create and use custom scanning profiles in Nuance's PaperPort 14.5. But the ability to create custom scanning profiles also exists in PaperPort going back many years, so if you have an older version, such as PaperPort 11 or PaperPort 12, these videos will still be applicable for you. This first video tutorial shows how to create (and name) custom scanning profiles (or edit existing ones) and reviews all of the Scanner Enhancement Technology (SET) features, such as auto-straighten, delete blank pages, remove punch holes, etc. It also discusses scanning options, including Mode (B&W, Grayscale, Color), Resolution (100 DPI, 200 DPI, 300 DPI, etc.), and Size (Letter, Legal, A4, etc.). The video takes a quick look at the output file type options, but that is discussed fully in Part 2 of the series.

1. Run PaperPort and bring up the 'Scan or Get Photo' pane


Run PaperPort.

Click the Scan Settings button on the ribbon.

This will bring up the Scan or Get Photo pane.

Step1

2. Create a new scanning profile or edit an existing one


To create a new scanning profile, click the New button.

To edit an existing scanning profile, click the profile you want to edit, then click the Settings button.

Step2

3. Name the new profile


Enter a name for the new profile.

If you want to copy settings from an existing profile, click the drop-down and select it.

Click the Continue button.

Step3

4. Select the Scanner Enhancement Technology (SET) features


Click the SET
0
How to OCR pages in a PDF with free software - PDF-XChange Editor
We often encounter PDF files that are pure images, that is, they do not have text characters, but instead contain only raster graphics. The most common causes of this are document scanning software and faxing software/services that create image-only PDF files rather than PDF searchable image files, the latter having the scanned or faxed images and text created by Optical Character Recognition (OCR). The solution is to perform OCR on the image-only PDFs to create text. Many software products can do this, such as ABBYY FineReader, Adobe Acrobat (but not Adobe Reader) and Nuance's OmniPage, PaperPort, and Power PDF. Some can even do it in batch mode via a command line interface. But they are all non-free products, many quite expensive. This video Micro Tutorial shows how to OCR the pages of an image-only PDF, thereby creating searchable/copyable text, with excellent, free software called PDF-XChange Editor from Tracker Software Products.

1. Download the Free Version of PDF-XChange Editor


Visit the website for PDF-XChange Editor at Tracker Software Products:

http://www.tracker-software.com/product/pdf-xchange-editor

Tick the radio button for the installer you prefer and then click the DOWNLOAD NOW button.

Step1

2. Run the downloaded installer


Run the installer that you downloaded and select the Free Version (unless, of course, you want more features and would like to purchase the Pro Version).

Step2

3. Open the document in PDF-XChange Editor


The installer creates a program group called PDF-XChange with a shortcut in it for PDF-XChange Editor
17
LVL 24

Expert Comment

by:Andrew Leniart
The one negative however is that it does NOT provide OCR capabilities without a purchase of the Pro version
Absolutely a typo Joe, thank you for correcting it, and do give PDFelements a try. It truly is a great product.
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Thanks for letting me know about the product, Andrew...I hadn't heard of it...looks very interesting! Regards, Joe
2
How to rotate pages in a PDF with free software - PDF-XChange Editor
Sometimes we receive PDF files that are in the wrong orientation. They may be sideways or even upside down. This most commonly happens with scanned or faxed documents. It is possible to rotate the view of these PDFs with the free Adobe Reader product, but it is not possible to save the PDF with the rotated pages using Adobe Reader — not even with the latest Document Cloud (DC) version (or any earlier version of Reader). To do this with an Adobe product requires the relatively expensive Adobe Acrobat (Standard or Professional). This video Micro Tutorial shows how to rotate the pages of a PDF, and save the rotated document, with excellent, free software called PDF-XChange Editor from Tracker Software Products.

1. Download the Free Version of PDF-XChange Editor


Visit the website for Tracker Software Products:

http://www.tracker-software.com/product/pdf-xchange-editor

Tick the radio button for the installer you prefer and then click the DOWNLOAD NOW button.

Step1

2. Run the downloaded installer


Run the installer that you downloaded and select the Free Version (unless, of course, you want more features and would like to purchase the Pro Version).

Step2

3. Open the document in PDF-XChange Editor


Run PDF-XChange Editor and open the sideways or upside-down document in it.

Step3

4. Run the Rotate Pages feature


Click Document menu

Click Rotate Pages

Step4

5. Select desired rotation and which pages to rotate


In the Direction drop-down, choose Clockwise 90 degrees or 180 degrees or Counterclockwise 90 degrees
4

Xpdf - PDFinfo - Command Line Utility to Retrieve Page Count and Other Information from PDF Files
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF file's Info Dictionary, as well as some other information (metadata), including the page count. We show how to isolate the page count in a plain text file, and the same method may be used to isolate other metadata fields, such as the Author and PDF Producer. PDFinfo provides a command line interface, making it suitable for use in batch files, programs, and scripts — any place where a command line call can be made.
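As an illustration of isolating a single field, here is a small Python sketch. It is an alternative to the approach in the video, not a reproduction of it. It assumes that `pdfinfo` is on the PATH or in the current folder and that its output consists of `Key: value` lines, including a `Pages:` line; the function names are made up for this example.

```python
import subprocess


def parse_pdfinfo(output):
    """Turn pdfinfo's 'Key: value' lines into a dictionary."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def page_count(output):
    """Extract the page count from pdfinfo output (assumes a 'Pages' field)."""
    return int(parse_pdfinfo(output)["Pages"])


def pdfinfo_page_count(pdf_path, exe="pdfinfo"):
    """Run pdfinfo on one PDF and return its page count."""
    result = subprocess.run(
        [exe, pdf_path], capture_output=True, text=True, check=True
    )
    return page_count(result.stdout)
```

The same dictionary can be used to read other fields, such as `parse_pdfinfo(text).get("Author")`, provided the field is present in the output.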

1. Download the software.


You may have already downloaded and unzipped the Xpdf tools while watching the first video in the Xpdf series, but if you haven't, then visit the Xpdf website. Click the Download link and then click the pre-compiled Windows binary ZIP archive to download the utilities for Windows.

Step1

2. Locate the documentation folder for the Xpdf utilities.


Go to the folder where you unzipped the downloaded ZIP file and find the <doc> folder.

Step2

3. Read the documentation for the PDFinfo tool.


Go into the <doc> folder and find the plain text file called <pdfinfo.txt>.

Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFinfo tool.

Step3

4. Set up a test folder.


Create a test folder.

Copy <pdfinfo.exe> from the unzipped <bin32> folder into your test folder.

Copy a sample PDF file into your test folder (in the video and the screenshots below, the file is called test.pdf, which is a PDF file created from my EE article, Windows 10 uses YOUR computer to help distribute itself).

Step4

5. Set up a command prompt for testing.

3
Convert Scanned Image-Only PDF Files to PDF Searchable Image Files via OCR with Power PDF Advanced
In this video, we show how to convert an image-only PDF file into a PDF Searchable Image file, that is, a file with both the image (typically from scanning) and text, which is created in an automated fashion with Optical Character Recognition (OCR) software. To do this, we will set up a Watched Folder, such that whenever an image-only PDF file arrives in the Watched Folder, it will automatically be converted to a PDF Searchable Image file. We will achieve this using Power PDF, the newest product from the Document Imaging division of Nuance Communications. There are two editions of Power PDF — Standard and Advanced. The Watched Folder feature is in the Advanced edition only.
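The general idea behind a watched folder can be sketched in a few lines of Python. This is only a conceptual illustration and says nothing about how Power PDF implements the feature: the script polls a folder and hands each newly arrived PDF to a `convert` callback, which is a hypothetical placeholder for whatever OCR step you use.

```python
import time
from pathlib import Path


def new_pdfs(folder, seen):
    """Return PDF files in `folder` that are not yet in `seen`, and record them."""
    found = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and p.name not in seen
    )
    seen.update(p.name for p in found)
    return found


def watch(folder, convert, interval_seconds=5.0, max_polls=None):
    """Poll `folder` and call `convert(path)` once for each newly arrived PDF.

    `convert` is a hypothetical callback; `max_polls=None` means run forever.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for pdf in new_pdfs(folder, seen):
            convert(pdf)
        polls += 1
        time.sleep(interval_seconds)
```

Polling is the simplest design choice for a sketch like this; a production tool would more likely rely on file-system change notifications and would wait until a file has finished copying before processing it.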

1. Download and install the trial software



Visit the Nuance website at:

http://www.nuance.com/for-business/document-imaging-and-scanning/power-pdf-converter/index.htm

Click the free trial link, which takes you here:

http://www.nuance.com/for-business/imaging-solutions/document-conversion/power-pdf-converter/free-trial/index.htm

Fill out the short form and submit it.

Download the trial software and install it.

Step1.jpg

2. Run the program and invoke the Watched Folder feature



Run the program by clicking Start>All Programs>Nuance Power PDF Advanced>Power PDF Advanced.

Invoke the Watched Folder feature by clicking the Advanced Processing menu, then the drop-down on the Batch Controls ribbon button, then Watched Folder.

Step2.jpg

3. Configure the Watched Folder settings



Tick the Enable Watched Folder box.

Click the Source button and Browse to the folder that you want as the Watched Folder.

Tick the
1
Bates Stamping/Numbering of PDF Files with Power PDF Advanced
In this video, we show how to perform Bates Numbering/Stamping of PDF documents using Power PDF Advanced, the newest product from the Document Imaging division of Nuance Communications. There are two editions of Power PDF — Standard and Advanced. The Bates Numbering/Stamping feature is in the Advanced edition only.
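For readers new to the concept, Bates numbering assigns consecutive identifiers to every page across a set of documents, which is why the order of the documents matters. The Python sketch below only illustrates that numbering scheme; the prefix, starting number, and digit count are illustrative parameters, not settings taken from Power PDF.

```python
def bates_labels(page_counts, prefix="", start=1, digits=6):
    """Return one list of labels per document, numbered consecutively across the set.

    `page_counts` holds the number of pages in each document, in stamping order.
    """
    labels = []
    next_number = start
    for count in page_counts:
        labels.append(
            [f"{prefix}{n:0{digits}d}" for n in range(next_number, next_number + count)]
        )
        next_number += count  # the next document continues where this one ended
    return labels
```

For example, `bates_labels([2, 3], prefix="ABC", digits=4)` labels the first document's pages ABC0001 and ABC0002, and the second document's pages ABC0003 through ABC0005.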

1. Download the trial software

Visit the Nuance website at:

http://www.nuance.com/for-business/document-imaging-and-scanning/power-pdf-converter/index.htm

Click the "Free trial" button, fill out the short form, and submit it.

Download the trial software and install it.
free trial

2. Run the program and invoke the Bates Numbering/Stamping feature.

Run the program by clicking Start>All Programs>Nuance Power PDF Advanced>Power PDF Advanced.

Invoke the Bates Numbering/Stamping feature by clicking the Edit menu, then the Bates Numbering button on the ribbon. This shows the Add and Remove choices — click Add.
Add Bates feature

3. Add an entire folder of documents to be Bates Numbered/Stamped.

Click the Add Folders button and browse to the folder containing the PDF files to be Bates Numbered/Stamped.
Add folder

4. Set the output options.

Click the Output button and set the output options, including the destination folder and the file naming rules.
Set options

5. Set the order of the documents to be Bates Numbered.

Select a document and then click the Move Up and/or Move Down buttons to place it in the order that you want.

You may also click the Remove button to delete it from the list and the Preview button to look at it.
Order documents

6. Configure the Bates options in the Header and Footer.

2

Expert Comment

by:WSPatton
Joe,

Is there a way to Have the FileName displayed in the Header, but in such a way that it EXCLUDES the extension (i.e. the ".pdf")?

I have hundreds of scanned PDFs that I will first batch rename using/assigning unique Exhibit numbers and then want to use a feature like Power PDF's Header & Footer Tool to have the FileName displayed in the upper right corner excluding the ".pdf", and the page number displayed in the lower right corner.  Below is a picture of what I want and attached is a PDF of what I have been able to do so far.  Any help is most welcome.
Example of FileName displayed in Header.  This is what I want to do.
18-March-2016 Update:

Joe,

I also reached out to Nuance support and as yet they have not given me any useful feedback.

If it is useful, below is a link to my support ticket thread with Nuance:

http://nuance.custhelp.com/app/account/questions/detail/i_id/2307681/track/AvNquQo9Dv8S~cwxGmwe~yJ9yD0qSS75Mv_d~zj~PP9U

I am able to add headers with a FileName and footers with a Page number.

But my problem is that I want the %FileName% header to display the name of the File in such a way that it EXCLUDES the “.PDF” extension. I want the Headers to ONLY display: "Exhibit 002", "Exhibit 003", "Exhibit 004", etc.

I realize that I could manually paste the file name into the header field, but since I have hundreds of PDFs which I have to assign "Exhibit #" file names, I want to then automate the Header process by using a macro very much like Nuance's %FileName% macro, but with the appropriate code that STRIPS AWAY the ".PDF"

I look forward to hearing from you.
C-Ex-D015.pdf
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi WSPatton,

Currently, this can't be done. Two hours after reading your comment, I sent this email to my contacts at Nuance:

----- Begin message to Nuance -----
With Bates Numbering in Power PDF Advanced, inserting the macro for the file name creates the variable %FileName%. That variable contains the file name without the path but with the file extension (i.e., .PDF). Are there variables with other forms of the file name, such as the path, the file name without extension, etc. (the latter is the most important and the one I'm specifically looking for at this time)? If not, please consider the macros below for a future release. Thanks, Joe

%FileName%
File name without its path but with its extension. This is its current definition, so users already using this macro will see no change.

%FileNameNoExt%
The file name without its path, dot, and extension. As mentioned above, this is actually the main reason for this request. I have users who want the Bates stamp to contain the file name, but not the ".pdf". I included the two macros below for the sake of completeness, but right now I'd be happy with just this one new macro. Also, if there's a work-around, I'd love to hear it - can you think of any way to get the file name without the dot and extension onto each page?

%FilePath%
The file path, including drive letter with colon, but without the final backslash, even for root folders. Thus, %FilePath% followed by "\" followed by %FileName% will create the fully qualified file name.

%FileExtension%
The file extension without the dot. Presumably, this will always be PDF, unless PPA in the future can do Bates Numbering on other file types.
----- End message to Nuance -----
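To make the four definitions above concrete, here is a short Python sketch that computes each value from a fully qualified Windows file name. It is purely illustrative: according to the message, only %FileName% exists in the product, the other three names are proposals, and the function name and the dictionary layout are invented for this example.

```python
from pathlib import PureWindowsPath


def file_name_macros(full_path):
    """Compute the four values described above for a Windows file path."""
    path = PureWindowsPath(full_path)
    return {
        # File name without its path but with its extension.
        "%FileName%": path.name,
        # File name without its path, dot, and extension.
        "%FileNameNoExt%": path.stem,
        # Path with drive letter and colon, but no final backslash, even for root folders.
        "%FilePath%": str(path.parent).rstrip("\\"),
        # Extension without the dot.
        "%FileExtension%": path.suffix.lstrip("."),
    }
```

With these definitions, %FilePath% followed by a backslash and %FileName% reproduces the fully qualified file name, including for files in a root folder such as C:\.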

I'll post back here if I receive a reply from them. Btw, I was unable to access your support ticket. After logging into support and clicking on the link, I received a "Permission Denied" message. Seems that ticket threads may be viewed only by Nuance and the submitter. Regards, Joe
0
Xpdf - PDFtoText - Convert PDF Files to Plain Text Files
In this third video of the Xpdf series, we discuss and demonstrate the PDFtoText utility, which converts PDF files into plain text files. It does this via a command line interface, making it suitable for use in batch files, programs, and scripts — any place where a command line call can be made.

1. Download and install the software.

You may have already downloaded and installed the Xpdf tools while watching the first or second video in the Xpdf series, but if you haven't, then visit the Xpdf website at:

http://www.foolabs.com/xpdf/

Click the Download link and then click the pre-compiled Windows binary ZIP archive to download the Xpdf utilities for Windows.
precompiled binaries

2. Locate the documentation folder for the Xpdf utilities.

Go to the folder where you unzipped the downloaded ZIP file and find the <doc> folder.
documentation folder

3. Read the documentation for the PDFtoText tool.

Go into the <doc> folder and find the plain text file called <pdftotext.txt>.

Open it with any text editor, such as Notepad, and read it. This is the documentation for the PDFtoText tool.
read me

4. Set up a test folder.

Create a test folder.

Copy <pdftotext.exe> from the unzipped <bin32> folder into your test folder.

Copy a sample PDF file into your test folder (in the video and the screenshots below, the file is called <RMP.pdf>).
test folder

5. Set up a command prompt for testing.

Open a command prompt window.

Navigate to your test folder.

Issue a DIR command in the command prompt to be sure that only two files are in it - the PDFtoText executable and the sample PDF file.
cmd prompt dir

6. Run the PDFtoText utility on the sample PDF file.

In the command prompt window, enter the following command:

pdftotext -layout samplefilename.pdf
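The same call can be made from a program or script. The Python sketch below is one possible way to do it, under these assumptions: `pdftotext` is on the PATH or in the current folder, and, because no output file is named, the tool writes a text file with the same base name as the PDF and a .txt extension. The helper names are illustrative.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, exe="pdftotext"):
    """Return the argument list equivalent to: pdftotext -layout <file>."""
    return [exe, "-layout", str(pdf_path)]


def expected_text_file(pdf_path):
    """Return the path of the text file expected next to the PDF."""
    return Path(pdf_path).with_suffix(".txt")


def convert(pdf_path, exe="pdftotext"):
    """Run pdftotext on one PDF and return the path of the resulting text file."""
    subprocess.run(pdftotext_command(pdf_path, exe), check=True)
    return expected_text_file(pdf_path)
```

Passing the arguments as a list avoids quoting problems when the file name contains spaces.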
command line

7. Verify that the text file was created.

10
LVL 24

Expert Comment

by:Andrew Leniart
Great tutorial series. This will be very handy for me!
0
LVL 62

Author Comment

by:Joe Winograd, Fellow&MVE
Hi Andrew,
I'm glad to hear that my Xpdf series will be useful for you. This particular one, PDFtoText, is the one that I use the most in my custom programs. Cheers, Joe
P.S. Thanks for the endorsement!
0
PaperPort Send To Bar - Part 1
This video is the first in a two-part series that discusses PaperPort's "Send To Bar" feature. This first video tutorial explains the purpose of the Send To Bar, how to use it, and how to hide unwanted items that are automatically created on it when PaperPort is installed. The second video tutorial in the series discusses how to add a custom icon/program to the Send To Bar.

1. Locate the Send To Bar at the bottom of the PaperPort app


Run PaperPort.

Look at the bottom of the PaperPort app and you will see something like this (depending on the version of PaperPort, the other apps that are installed, and the viewing options for the Send To Bar):

step1.jpg

2. Send an item to the Send To Bar


There are two ways to do this:

(i) Click on a desktop item (for example, a JPG file) and then click an icon on the Send To Bar (for example, Microsoft Paint).

(ii) Drag-and-drop an item (for example, a JPG file) onto an icon on the Send To Bar (for example, Microsoft Paint).

step2.jpg

3. Perform an operation on the item in the program that was launched


Continuing the example from above, let's say you have a JPG in Microsoft Paint:

step3a.jpg

Rotate it and save it:

step3b.jpg

4. Confirm that the item was changed in PaperPort


After saving the file in Step 3, you will be returned to PaperPort, where you should see that the item was changed (such as a rotated image).

step4.jpg

5. Hide unwanted icons on the Send To Bar


Right-click any icon on the Send To Bar and then click Send To Options.

Starting at the top of the vertical pane of icons, click on each one that you want to hide and un-check the box that says "Include icon on Send To bar".

Click OK.

step5.jpg

6. Confirm that unwanted icons have been removed from Send To Bar


The icons that you chose to hide should no longer appear on the Send To Bar
2
