Carl Laskowski
asked on
OCR using Microsoft Office 2016
I want to scan a document and use OCR so I can edit it in Word 2016.
ASKER
I was hoping the OCR solution would be native to Office 2016 or part of the HP Deskjet 2542 support software, rather than having to download additional software. Is that not the case?
> hoping the OCR solution would be native to Office 2016
There used to be a product in Office called Microsoft Office Document Imaging (MODI), but it was removed in Office 2010 (it was bundled with Office 2003 and 2007). Here's a link about it:
https://support.microsoft.com/en-us/help/982760/install-modi-for-use-with-microsoft-office-2010
There's been some luck installing it in Office 2010, less luck in 2013, and I'm not aware of any success in 2016 (or 2019), although I suppose it's possible (but I doubt it). Btw, MODI was based on software that Microsoft OEM'd from Nuance (ScanSoft at the time) and is the same software that is in OmniPage, PaperPort, and Power PDF.
> part of HP Deskjet 2542 support software
It's possible, but I'm not familiar with that particular model. Some scanners and all-in-one devices include software with OCR capability; others do not. You'll have to check the software that was bundled with the 2542 (I was hoping to provide you with a download link for the 2542 at the HP support site, but it is down now...I suggest you check it later). If OCR was not bundled with it, I recommend the (free!) combination of Foxit Reader to scan and PDF-XChange Editor to OCR. Regards, Joe
ASKER
Thank you both for looking into this. It worked just as you said, Karen. I am good to go.
You're welcome, Carl. I never use OneNote (or Evernote). Can it scan? Or do you have to print to it (as Karen said) after scanning with some other software? The reason I ask is that your question says, "want to scan a document and use ocr". Regards, Joe
I use CamScanner to scan docs to PDF right from my phone for free.
OneNote is a great tool
Hi Karen,
I guess you're saying that OneNote cannot scan — right? Regards, Joe
ASKER
Hi Joe. I used Windows Fax and Scan to get the image from my HP scanner. I saved it, opened a tab in OneNote, and dragged and dropped the scanned file there. Then, on a right-click, there is a "Copy Text from Picture" choice. Next I pasted it into a new blank Word document and it was done. A little tricky, but it works. The resulting docx file lost some formatting and there might be some misread characters, but I was impressed with the results.
Hi Carl,
Thanks for the update. Haven't heard back from Karen, but I guess you're confirming that OneNote cannot scan. Of course, Windows Fax and Scan is not a good tool for lots of scanning, and the process you mention is certainly convoluted, but for a one-off, it's fine.
As you've discovered, maintaining the formatting in PDF-to-Word conversion is an issue. I've had good (not perfect) results with this free online tool:
http://www.pdftoword.com/
If you prefer a local install, I've also had good (also not perfect) results with this free tool:
http://www.boxoft.com/pdf-to-word/
You may get better results with non-free products. I've gotten better (but still not perfect) results with Nuance's PaperPort and Power PDF:
https://www.nuance.com/print-capture-and-pdf-solutions/optical-character-recognition/paperport-for-pc.html
http://www.nuance.com/for-business/document-imaging-and-scanning/power-pdf-converter/index.htm
There's a free trial for PaperPort and the Power PDF Advanced edition (but not Standard) so you can see how well it works for you before buying it:
https://www.nuance.com/print-capture-and-pdf-solutions/optical-character-recognition/paperport-for-pc/trial-version.html
http://www.nuance.com/for-business/imaging-solutions/document-conversion/power-pdf-converter/free-trial/index.htm
Btw, PaperPort and Power PDF can scan directly to Word...one step!
Another good (non-free) product is Able2Extract PDF Converter:
http://www.investintech.com/prod_downloadsa2e.htm
It also offers a free trial.
The first link in this post is to the (free) Nitro cloud. Nitro is a well-known name in PDF tools and their Nitro Pro has a PDF-to-Word feature:
http://www.nitropdf.com/pro/features/convert-export
There's also a free trial for this, but I've never used it, so can't vouch for its performance. However, it uses the same engine as the online tool, which I have used and is very good, so I would expect the same of Nitro Pro.
One more non-free product (but reasonably priced at $39) is CAD-KAS's PDF-to-Word:
http://www.cadkas.com/downengpdf9.php
I haven't used this product, but I have used their PDF Editor Objects, which is excellent. Based on the quality of PDF Editor Objects, I think that their PDF-to-Word is worth a try, and there's a free trial:
http://www.cadkas.com/pdf2word!.exe
It probably goes without saying, but Adobe Acrobat can do it — both Standard and Professional (but not Reader). As with everything, results aren't perfect.
I've been on previous threads here at EE where other experts have recommended these three (free) online tools:
http://www.convertpdftoword.org
http://www.pdfonline.com/pdf-to-word-converter
http://www.wondershare.net/pdf-converter/pdf-to-word-converter.html
I can't personally vouch for these, but based on the positive comments from other members, I'm passing them along for your consideration.
No matter which way you go, keep in mind that PDF-to-Word conversion is tricky business – maintaining the formatting/layout is tough stuff! I haven't found anything that is perfect, and results vary from one document to the next. So my suggestion is to put some, or all, of these products on your short list for evaluation. Define a few test docs – your docs! Compare the resulting Word files to see which, if any, of the tools produces Word files that are satisfactory. Of course, if all of this was a one-off, forget everything in here. :) Regards, Joe
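If you want to put a rough number on the comparison described above, here is a minimal Python sketch. It assumes you have typed up (or otherwise trust) a reference copy of what is on the page and have saved each tool's output as plain text; the tool names and sample strings are made up for illustration. It scores text accuracy only, so formatting and layout still need to be judged by eye in Word.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line-break differences don't count as errors."""
    return " ".join(text.split())


def similarity(reference: str, candidate: str) -> float:
    """Return a ratio from 0 to 1 of how closely candidate matches reference."""
    matcher = SequenceMatcher(
        None, normalize(reference), normalize(candidate), autojunk=False
    )
    return matcher.ratio()


def rank_tools(reference: str, outputs: dict[str, str]) -> list[tuple[str, float]]:
    """Sort tools from best to worst text match against the reference."""
    scores = [(name, similarity(reference, text)) for name, text in outputs.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data: the known page text and the output of
    # two made-up tools, one of which misread several characters.
    reference = "Invoice 1024: total due is $310.00"
    outputs = {
        "tool_a": "Invoice 1024: total due is $310.00",
        "tool_b": "lnvoice 1O24: tota1 due is $31O.OO",
    }
    for name, score in rank_tools(reference, outputs):
        print(f"{name}: {score:.3f}")
```

Run it on a few of your own test docs; a tool that scores well on one document may do worse on another, so look at the spread rather than a single result.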
ASKER
Thanks, Joe. For now it is a one-off, but one never knows whether I will need a more heavy-duty OCR solution.
You're welcome, Carl. Happy to help. Have a great weekend! Regards, Joe
Lots of software can do this. If you're looking for a free solution, the free Foxit Reader supports TWAIN and WIA scanners:
https://www.foxitsoftware.com/pdf-reader/
It can scan to PDF, but just to image-only PDF, i.e., not searchable PDF, as shown in this five-minute EE video Micro Tutorial:
How to scan to a PDF file with free software - Foxit Reader
It also does other good stuff, as shown in another five-minute EE video Micro Tutorial:
How to put a date-time stamp on a PDF file with free software - Foxit Reader
Since you want to create searchable text in the PDFs to move into Word, you'll need a product that does OCR, such as the free PDF-XChange Editor:
https://www.tracker-software.com/product/pdf-xchange-editor
Its OCR feature is described in this 5-minute EE video Micro Tutorial:
How to OCR pages in a PDF with free software
It can also do many other tasks. Here are two more 5-minute EE video Micro Tutorials discussing two more great features in the free version:
How to rotate pages in a PDF with free software
How to password-protect a PDF with free software
However, the free version cannot scan. So, unfortunately, the free Foxit Reader can scan, but not OCR, while the free PDF-XChange Editor can OCR, but not scan. Thus, you'll need both products if you want to create searchable PDFs to move into Word, i.e., Foxit Reader to scan and PDF-XChange Editor to OCR.
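The reasoning above (each free tool covers only one of the two steps, so you need both) can be sketched as a small capability check in Python. The capability sets reflect only what this post says about the free versions; the function name and the "scan"/"ocr" labels are just illustrative.

```python
from itertools import combinations

# Capabilities of the free versions as described in this post.
FREE_TOOLS = {
    "Foxit Reader": {"scan"},
    "PDF-XChange Editor": {"ocr"},
}


def smallest_combo(tools, needed):
    """Return the smallest tuple of tool names that together cover `needed`, or None."""
    names = sorted(tools)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            covered = set().union(*(tools[name] for name in combo))
            if needed <= covered:
                return combo
    return None


print(smallest_combo(FREE_TOOLS, {"scan", "ocr"}))
```

If you later try a product that does both steps, add it to the dictionary with both capabilities and the function will return that single tool instead.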
In the not-free arena, this other five-minute EE video Micro Tutorial shows another product, Power PDF:
Convert Scanned Image-Only PDF Files to PDF Searchable Image Files via OCR with Power PDF Advanced
And these EE articles show Power PDF and another (not-free) Nuance product, PaperPort:
Batch Conversion of PDF, TIFF, and Other Image Formats via Command Line Interface to PDF, PDF Searchable, and TIFF with Power PDF Advanced
PaperPort - How To Create Searchable PDF Files
There are many more out there, such as ABBYY FineReader, Nuance OmniPage, and others, but these should get you going. Regards, Joe