• Status: Solved

Need PowerShell script to scan multiple PDFs for keywords

Greetings Experts,

I need a simple PowerShell script to scan a folder full of PDFs (2000+) for keywords in the text of each one. Does somebody have a script, or can you point me in the direction where I can find one to complete this task?
Asked by: Mike
1 Solution
 
Dan Craciun (IT Consultant) commented:
Why do you need a powershell script?
You can achieve the same goal using Windows search or any other piece of software that can do text search.

FWIW, on Windows, I use Notepad++ to search for text in folders.

HTH,
Dan
 
Mike (Security, Author) commented:
The documents I am trying to scan are PDFs, and Windows Search only looks at the names of the documents, not the text inside them. Searching the text inside is what I am trying to do.
 
Dan Craciun (IT Consultant) commented:
OK. Here's how you search in files in Notepad++:
(Screenshot: Search in files in Notepad)
You actually can use Windows Search to find in files, but with Notepad++ you have access to regular expressions, if the need arises.
You can get Notepad++ for free from here: http://notepad-plus-plus.org/

HTH,
Dan
 
footech commented:
BTW, you can search inside the PDFs with Windows Search as long as you have the right iFilter installed. For 64-bit systems, Adobe provides version 11 of its iFilter as a separate download:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=5542
If you have a 32-bit system, the iFilter comes with Adobe Reader.
 
Joe Winograd (Fellow & MVE, Developer) commented:
Dan,
I just tried to search the contents of PDFs with the latest Notepad++ (6.5.1) and it doesn't work. The PDF files do have text...searches with Adobe Reader (and other search tools) find the text, but not NPP. Please try it on your end and let me know your results. Thanks, Joe
 
Mike (Security, Author) commented:
I did try Notepad++ and was unsuccessful when I tried to scan the list of PDFs. After a little bit of digging, I found an article that shows how to search multiple PDFs using Adobe Reader:

http://www.ghacks.net/2011/04/02/how-to-search-multiple-pdf-documents-at-once/
 
Dan Craciun (IT Consultant) commented:
My bad. I was under the impression that PDFs conform to some XML standard, so that they are text files with pictures encoded as binary (something like emails).

Turns out I was wrong: PDFs are binary files, and the text is not directly readable from a text editor.

I apologize, I was spreading misinformation.
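A note on why the Notepad++ search came up empty: a PDF mixes readable structure with binary streams, and the page text usually sits inside streams compressed with Flate (zlib/deflate), so the words never appear as plain bytes in the file. The Python sketch below illustrates this with a hand-built stream object. It is a toy fragment rather than a valid PDF, the sample text and the helper names (naive_search, inflate_streams) are made up for illustration, and real PDFs add font encodings and other filters that call for a proper PDF library.

```python
import re
import zlib

# Toy page content: PDF text-drawing operators around a literal string.
content = b"BT /F1 12 Tf 72 720 Td (Invoice overdue) Tj ET\n" * 3

# Content streams are commonly compressed with Flate (zlib/deflate).
compressed = zlib.compress(content)

# Hand-built stream object; a fragment, NOT a complete, valid PDF file.
pdf_fragment = (
    b"4 0 obj\n<< /Length " + str(len(compressed)).encode("ascii")
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)


def naive_search(data: bytes, keyword: bytes) -> bool:
    """What a text editor effectively does: look for the raw bytes."""
    return keyword in data


def inflate_streams(data: bytes) -> bytes:
    """Decompress every Flate stream found in the data and join the results.

    Simplification: a real parser would use the /Length entry instead of
    searching for the 'endstream' keyword.
    """
    chunks = []
    for match in re.finditer(rb"stream\n(.*?)\nendstream", data, re.DOTALL):
        try:
            chunks.append(zlib.decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data (or another filter is in use); skip it
    return b"".join(chunks)


found_raw = naive_search(pdf_fragment, b"overdue")
found_inflated = naive_search(inflate_streams(pdf_fragment), b"overdue")
print("raw bytes:", found_raw, "| after inflating:", found_inflated)
```

The keyword only turns up once the stream has been inflated, which is the step that PDF-aware tools (Adobe Reader, an iFilter, a PDF library) perform and a plain text editor does not.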
 
Mike (Security, Author) commented:
Hey, you helped point me in the right direction.. thanks...
 
Joe Winograd (Fellow & MVE, Developer) commented:
amstoots,
Yes, Adobe Reader can do it, as can other PDF readers/viewers (such as Foxit Reader and PDF-XChange Viewer), many search products (such as dtSearch and X1), and the built-in Windows Search 4 (included with Vista/W7/W8 and available as a free download for XP).

Dan,
Thanks for confirming. Would be a nice enhancement for NPP7. :)

Regards, Joe
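For the scripted route the original question asked about, here is a minimal sketch, written in Python rather than PowerShell. It assumes the third-party pypdf package is installed (pip install pypdf) and that the PDFs have a real text layer; scanned, image-only PDFs would need OCR first. The function names and the plain case-insensitive substring match are illustrative choices, not something taken from this thread.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def pypdf_text(path: Path) -> str:
    """Extract text with pypdf (third-party; assumed to be installed)."""
    from pypdf import PdfReader  # imported lazily so the rest runs without it

    reader = PdfReader(str(path))
    # extract_text() may yield nothing for pages without a text layer
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def scan_folder(
    folder: Path,
    keywords: Iterable[str],
    extract_text: Callable[[Path], str] = pypdf_text,
) -> Dict[str, List[str]]:
    """Map each PDF file name to the keywords found in it (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    hits: Dict[str, List[str]] = {}
    for path in sorted(folder.iterdir()):
        # Accept .pdf and .PDF alike; ignore sub-folders and other files.
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        try:
            text = extract_text(path).lower()
        except Exception as exc:
            # A damaged or encrypted file should not stop a 2000-file run.
            print(f"skipped {path.name}: {exc}")
            continue
        found = [k for k in wanted if k in text]
        if found:
            hits[path.name] = found
    return hits
```

A call such as scan_folder(Path("C:/PDFs"), ["invoice", "overdue"]) (the folder and keywords here are hypothetical) returns a dictionary of file names and the keywords matched in each. The extractor is passed in as a parameter so that a different PDF library, or an OCR step, can be swapped in without touching the loop.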
Question has a verified solution.
