cpatte7372
asked on
Searching Within Documents
Hi Experts,
Not really sure what category this question goes under.
Basically, can someone please tell me what application will allow me to search for words in the attached document?
For example, let's say I want to search for the word ICMP in the attachment using Firefox: I wouldn't be able to do it. So is there any application that can search for words in a document created like this?
I should mention that the document was created using a snapshot capture program called 'snagit'. It is similar to 'print screen'. The main difference is that the capture can be saved as jpg, gif, pdf, png and lots of other formats.
Therefore, if you guys/girls determine that it's not possible in the .png format, can you please advise what format it is possible to search with? I've already tried .PDF but no luck.
Your help will be greatly appreciated.
Cheers
Carlton
642-9021.png
Carlton,
The issue is that a screen capture (such as via Snagit) is simply a bitmap/image. It needs to be converted to (searchable) text via a process called Optical Character Recognition (OCR). There are many fine OCR packages out there. Two highly-regarded ones are ABBYY FineReader and Nuance's OmniPage:
http://www.abbyy.com/
http://nuance.com/for-individuals/by-product/omnipage/index.htm
Another approach is to use an imaging/scanning package, such as Nuance's PaperPort:
http://nuance.com/for-individuals/by-product/paperport/index.htm
PaperPort can take an image, including all of the ones you mentioned (JPG, GIF, PDF, PNG), and via a <Save As> command automatically invoke OCR on it and create a PDF Searchable Image file, which contains both the image and a layer of text created by the OCR (btw, under the covers, PaperPort utilizes OmniPage OCR). The latest version is PP14, which just came out in August. The main enhancement is cloud support, which you probably don't need. The new version is fairly expensive, but you can get the previous version, which is 12 (yes, they were superstitious and skipped 13), as a download at Newegg for $39.99:
http://www.newegg.com/Product/Product.aspx?Item=N82E168168677800SF
The Newegg download is likely to be 12.0. Do not install that. Instead, read my EE article on how to upgrade to 12.1 (free!):
https://www.experts-exchange.com/Web_Development/Document_Imaging/A_6537-PaperPort-Upgrade-How-to-download-and-install-updated-versions-of-PaperPort-11-and-12.html
If you're looking for FREE, here are two possibilities, but I've never tried either, as I've been a long-time user of PaperPort. So I have no idea if either of these is any good, but they may be worth a spin if you don't want to spend money on an OCR package or PaperPort (I think the latter at 40 bucks is the way to go):
http://www.freeocr.net/
http://www.simpleocr.com/
As a disclaimer, I want to emphasize that I have no affiliation with any companies mentioned in this post, or any financial interest in them whatsoever. Regards, Joe
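For anyone who prefers to script the same idea, here is a minimal Python sketch of the two steps described above: run OCR on the capture, then search the resulting text. It is only an illustration under stated assumptions, not how PaperPort or OmniPage work internally. The OCR step assumes the open-source Tesseract engine with the `pytesseract` and Pillow packages installed, none of which are mentioned in this thread. The helper names `ocr_image_to_text` and `find_word` are made up for this example.

```python
import re


def ocr_image_to_text(image_path: str) -> str:
    """Run OCR on an image file and return the recognized text.

    Assumption: Pillow, pytesseract, and the Tesseract engine are installed.
    They are imported here so the search helper below works without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def find_word(text: str, word: str) -> list:
    """Return the 1-based line numbers where `word` occurs as a whole word.

    The match is case-insensitive, so "icmp" and "ICMP" both count.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

With those packages in place, something like `find_word(ocr_image_to_text("642-9021.png"), "ICMP")` would list the lines of recognized text that contain the word. How accurate the result is depends entirely on the OCR engine and the quality of the capture.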
ASKER
Hey joewinograd,
That is brilliant, mate. That is exactly what I need.
Can't thank you enough.
Cheers
Carlton,
You're welcome. I do a lot of screen captures (with PrintScreen, not Snagit, but it's the same result) and the OCR process via PaperPort to create a Searchable PDF Image file works very well (even at the relatively low resolution of screen captures – in an ideal world, I like 300 DPI for OCR). You can then search for the text, copy/paste it, etc. Cheers, Joe
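To put a rough number on the resolution gap, here is a small arithmetic sketch in Python. It assumes the capture is treated as 96 DPI, a common nominal screen value that is not stated anywhere in this thread, and works out how much the image would have to be enlarged to reach the 300 DPI figure. Whether enlarging actually improves recognition depends on the OCR engine, so treat it as a back-of-the-envelope calculation only. The function names are hypothetical.

```python
def upscale_factor(source_dpi: float, target_dpi: float = 300.0) -> float:
    """Return the linear scale factor that takes source_dpi to target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return target_dpi / source_dpi


def scaled_size(width_px: int, height_px: int,
                source_dpi: float = 96.0,
                target_dpi: float = 300.0) -> tuple:
    """Return the (width, height) in pixels after scaling to target_dpi.

    Assumption: the 96 DPI default is a nominal screen resolution,
    not a value taken from the thread.
    """
    factor = upscale_factor(source_dpi, target_dpi)
    return round(width_px * factor), round(height_px * factor)
```

Under that assumption, `upscale_factor(96)` gives 3.125, so a 1024 x 768 capture would need to be about 3200 x 2400 pixels to count as 300 DPI at the same physical size.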
ASKER
joewinograd
I'm thinking you're familiar with the application. Can you guide me through scanning a document with OCR so that I can search the PDF?
Cheers
Carlton
ASKER
BTW, joewinograd, I'm referring to PaperPort
ASKER
joewinograd, figured it out.. This application is the bizniz.
Yep, considering its relatively modest cost, it's a very robust imaging/scanning package. Cheers, Joe
ASKER
Brilliant
ASKER
Cheers