Search Directory for Values in a List

I want to create a list of keywords (call it search.txt) and search every file under a specific path, including its subdirectories, for occurrences of those keywords.

For instance, search.txt would contain I_FI_REJECT, M_GRP_CLIENTS and C_UNSUCCESS_CONTACTS (one per line), and if I were searching the folder "C:\MyCode\", every file in that folder and its subfolders would be searched for ANY of the keywords. I would like the results to show the keyword, the file that contains it (with full path), the contents of the matching line and, if possible, its line number. I suspect that showing the line number could be considerably more work and add a lot of overhead, so it can be omitted.

If one of the keywords is not found in any of the files, I would like a message printed, something like "keyword" was not found.
Asked by: dbbishop

footech commented:
Luckily, Select-String already does pretty much all of this (even the line numbers). The last requirement is actually the more difficult one. Here are two slightly different variations; one might perform better than the other for you.
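# Variation 1: search once per keyword, so each result carries the exact keyword that matched.
$startFolder = "c:\temp"
$patterns = "I_FI_REJECT", "M_GRP_CLIENTS", "C_UNSUCCESS_CONTACTS"
$patterns | ForEach-Object `
{
    $pattern = $_
    # -SimpleMatch treats the keyword as literal text rather than a regular expression.
    Get-ChildItem $startFolder -Recurse -Force -ErrorAction SilentlyContinue | Where-Object { !($_.PsIsContainer) } |
     Select-String -Pattern $pattern -SimpleMatch |
     Select-Object Path,LineNumber,Pattern,Line
} | Tee-Object -Variable result

# Report every keyword that never appeared in the results.
# @(...) and the Where-Object filter keep $matched a (possibly empty) array, because Compare-Object rejects $null.
$matched = @($result | ForEach-Object { $_.Pattern } | Where-Object { $_ } | Select-Object -Unique)
Compare-Object $patterns $matched -PassThru | ForEach-Object `
{ Write-Output """$_"" was not found" }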



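# Variation 2: walk the files once, matching all keywords as a single regex alternation.
# -AllMatches records every keyword found on a line, not just the first one.
$startFolder = "c:\temp"
$patterns = "I_FI_REJECT", "M_GRP_CLIENTS", "C_UNSUCCESS_CONTACTS"
Get-ChildItem $startFolder -Recurse -Force -ErrorAction SilentlyContinue | Where-Object { !($_.PsIsContainer) } |
 Select-String -Pattern ($patterns -join "|") -AllMatches |
 Select-Object Path,LineNumber,@{N="Pattern";E={$_.Matches | ForEach-Object { $_.Value }}},Line |
 Tee-Object -Variable result

# Here the Pattern column holds the text as it appears in the file, and Select-Object -Unique is
# case-sensitive, so variants such as "Clients" and "clients" can all end up in $matched.
# Keep only the '<=' side (keywords from the list with no match) so those variants are not reported.
$matched = @($result | ForEach-Object { $_.Pattern } | Where-Object { $_ } | Select-Object -Unique)
Compare-Object $patterns $matched -PassThru | Where-Object { $_.SideIndicator -eq '<=' } | ForEach-Object `
{ Write-Output """$_"" was not found" }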

dbbishop (Author) commented:
I'll play with these, but is there an easy way to keep the keywords in a file (one per line) and read that into $patterns? I have about 200 to search for (yeah, I know, this will probably take a while, but it will be faster than doing it manually).
footech commented:
Sure, just change the line to define $patterns as
$patterns = Get-Content search.txt
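A minimal sketch of loading the keyword file a little more defensively, assuming one keyword per line. The temporary file stands in for search.txt, and the trimming, blank-line skipping and de-duplication are additions that the thread does not ask for.

```powershell
# Throwaway keyword file standing in for search.txt (hypothetical content).
$searchFile = Join-Path ([System.IO.Path]::GetTempPath()) 'search-demo.txt'
@'
I_FI_REJECT
  M_GRP_CLIENTS

i_fi_reject
'@ | Set-Content -Path $searchFile

# Read one keyword per line, trim stray whitespace, drop blank lines,
# and remove duplicates (Sort-Object -Unique ignores case by default).
$patterns = @(Get-Content -Path $searchFile |
    ForEach-Object { $_.Trim() } |
    Where-Object { $_ -ne '' } |
    Sort-Object -Unique)

$patterns
Remove-Item $searchFile
```

Blank lines matter here because an empty string passed to Select-String -Pattern is rejected, so filtering them out up front avoids an error partway through a long run.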
dbbishop (Author) commented:
Another issue is that the output looks like this:
X:\MyFolder\XX\Current\CONV...                                   82 STATES                                  LEFT OUTER JOIN $(SOURCE).db...

which means I don't see the filename (because it is truncated within the path) and I only see a few characters of the line containing the string (which is less important, but would be nice to have). Any way to expand the output width so I can see all the data?
footech commented:
You could pipe the results from Tee-Object into Format-Table with the -AutoSize parameter, pipe them into Format-List, or pipe them to Export-Csv to generate a file so that you aren't constrained by the console width.
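The three options could look roughly like this. It is a sketch only: the rows in $result are hypothetical stand-ins shaped like the script's output, and the CSV path is an arbitrary temporary file.

```powershell
# Hypothetical rows with the same columns the search scripts produce.
$result = @(
    [pscustomobject]@{ Path = 'C:\MyCode\sub\load_clients.sql'; LineNumber = 82
                       Pattern = 'M_GRP_CLIENTS'; Line = 'LEFT OUTER JOIN M_GRP_CLIENTS c ON c.id = s.id' }
    [pscustomobject]@{ Path = 'C:\MyCode\rejects.sql'; LineNumber = 7
                       Pattern = 'I_FI_REJECT'; Line = 'SELECT * FROM I_FI_REJECT' }
)

# Option 1: size columns to their content; -Wrap wraps long cells instead of truncating them.
$result | Format-Table -AutoSize -Wrap

# Option 2: one property per line, so nothing is cut off by the console width.
$result | Format-List

# Option 3: write a CSV file that can be opened in a spreadsheet.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'search-results.csv'
$result | Export-Csv -Path $csvPath -NoTypeInformation
```

With around 200 keywords the CSV route is probably the most practical, since the file can be sorted and filtered by keyword afterwards.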
dbbishop (Author) commented:
Thank you very much. Saved me a lot of time trying to do this manually.
dbbishop (Author) commented:
One more quick question, if you don't mind. This is considerably more than I've done with PS and I am trying to understand it, but having a little difficulty. The MAIN thing I want to do is actually identify keywords that are not found. The reason for printing out the contents of lines that contained the keywords was to examine them and make sure that the keyword was part of another word (e.g. I_FIS and not I_FIS_ERRORS).
That said, how difficult would it be to modify this to only show keywords that were not matched in the folder being searched? I've been trying to accomplish it but keep getting strange results.
dbbishop (Author) commented:
I'll post as a separate question if you'd rather.
footech commented:
"The reason for printing out the contents of lines that contained the keywords was to examine them and make sure that the keyword was part of another word (e.g. I_FIS and not I_FIS_ERRORS)."
 - this is confusing to me, your statement and the example seem to contradict each other.  Can you clarify when "I_FI_REJECT" (for example) should produce a match and when it shouldn't?  If we don't care at all about what is around it, then things are easy, otherwise we need info on the surrounding characters to form the proper regex pattern to match against.
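If the intent turns out to be that a keyword should count only when it stands alone as a whole identifier (an assumption, since the thread does not settle this), one way to sketch the pattern is with regex word boundaries. Because `\b` treats the underscore as a word character, I_FIS inside I_FIS_ERRORS is not matched. The sample lines are hypothetical.

```powershell
$keyword = 'I_FIS'

# Escape the keyword so it is taken literally, then require a word boundary on both sides.
$wholeWord = '\b' + [regex]::Escape($keyword) + '\b'

# Hypothetical lines standing in for file contents.
$lines = 'select * from dbo.I_FIS',
         'select * from I_FIS_ERRORS',
         'select * from i_fis where x = 1'

# Select-String is case-insensitive by default and treats -Pattern as a regular expression.
$hits = @($lines | Select-String -Pattern $wholeWord)
$hits | ForEach-Object { $_.Line }
```

Note that -SimpleMatch has to be dropped for this to work, since it would search for the backslashes literally.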

If you want to show keywords for which a match was not found, I think a lot of the commands will be the same, but different logic will have to be used.
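One possible shape for that different logic, as a sketch under stated assumptions: the folder and files below are throwaway samples created only so the example runs, and the keywords are treated as literal text. It lists the files once and, for each keyword, stops at the first hit.

```powershell
# Build a small throwaway folder tree (hypothetical sample data).
$startFolder = Join-Path ([System.IO.Path]::GetTempPath()) ('kwdemo_' + [guid]::NewGuid().ToString('N'))
$subFolder   = Join-Path $startFolder 'sub'
New-Item -ItemType Directory -Path $subFolder | Out-Null
Set-Content -Path (Join-Path $startFolder 'a.sql') -Value 'select * from I_FI_REJECT'
Set-Content -Path (Join-Path $subFolder 'b.sql')   -Value 'join dbo.m_grp_clients c'

$patterns = 'I_FI_REJECT', 'M_GRP_CLIENTS', 'C_UNSUCCESS_CONTACTS'

# Collect the files once instead of re-walking the tree for every keyword.
$files = @(Get-ChildItem $startFolder -Recurse -Force -ErrorAction SilentlyContinue |
    Where-Object { -not $_.PSIsContainer })

# Keep a keyword only if no file contains it. -List returns at most one match per file,
# and Select-Object -First 1 stops the search as soon as any file matches.
$notFound = @($patterns | Where-Object {
    $keyword = $_
    -not ($files | Select-String -Pattern $keyword -SimpleMatch -List | Select-Object -First 1)
})

$notFound | ForEach-Object { Write-Output """$_"" was not found" }

Remove-Item $startFolder -Recurse -Force
```

Stopping at the first hit means common keywords are cheap to rule out; only the keywords that are genuinely absent cost a full pass over the files.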

Given the two points above, I think it'd be best to open a new question and I'll participate if I can.  It wouldn't be a bad idea to point the new question back at this thread, and update this thread with a link to the new question.
dbbishop (Author) commented:
footech,
I seem to have jumped the gun, but I am sure there will be a simple fix.

There does seem to be an issue, with both scripts. I am getting XXX not found, when the keyword definitely is in the set of data I am examining. Furthermore, there seems to be a case issue, in that the code is not case insensitive.

One of my keywords is CLIENTS, and it is found. However, the source data I am searching also contains "Clients" and "clients", and the report (after finding CLIENTS, Clients and clients) says that Clients and clients were not found.

I need the whole thing to be case insensitive, so if I am searching for the keyword CLIENT, it won't report that Client wasn't found, if I have dbo.Client in my code.
dbbishop (Author) commented:
I'll open a new question regarding the other issue, but the case issue should probably be handled here.
footech commented:
I'll have to double-check later tonight, but all the comparisons are case-insensitive.  This may be giving unexpected results with the final compare which generates the no-match list.
footech commented:
I'm not seeing any case-sensitivity anywhere in my testing.  Select-String has a parameter to make it case-sensitive, but it is case-insensitive by default.  The same is true for Compare-Object.

Only words/patterns that are in the search.txt file can be reported as not found, and it will only be reported if that word was not found in any file in the folder structure processed.  The only way that I can see a match not being found when present in a file is if access to the file is denied, and no error would be shown because of the -ErrorAction parameter set to SilentlyContinue in the Get-ChildItem command.
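To check whether suppressed errors are hiding files, the errors can be collected instead of discarded. This is a sketch, not part of the original scripts: the folder is a throwaway stand-in for the real search root, and listErrors, readErrors and skipped are hypothetical variable names.

```powershell
# Throwaway folder standing in for the real search root (hypothetical).
$startFolder = Join-Path ([System.IO.Path]::GetTempPath()) ('errdemo_' + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $startFolder | Out-Null
Set-Content -Path (Join-Path $startFolder 'a.sql') -Value 'select * from I_FI_REJECT'

# -ErrorVariable keeps a copy of the errors that SilentlyContinue hides from the console.
$files = @(Get-ChildItem $startFolder -Recurse -Force -ErrorAction SilentlyContinue -ErrorVariable listErrors |
    Where-Object { -not $_.PSIsContainer })
$hits = @($files | Select-String -Pattern 'I_FI_REJECT' -SimpleMatch -ErrorAction SilentlyContinue -ErrorVariable readErrors)

# List anything that could not be enumerated or read, so a missed match can be traced.
$skipped = @($listErrors) + @($readErrors)
$skipped | ForEach-Object { Write-Warning "Skipped: $($_.Exception.Message)" }
"{0} file(s) searched, {1} error(s) recorded" -f $files.Count, $skipped.Count

Remove-Item $startFolder -Recurse -Force
```

If a keyword is reported as not found while warnings are listed, the skipped paths are the first place to look.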

If you have a set of sample files and search terms that you can reproduce the issue with and provide to me I can take a look, but nothing I'm seeing in the code or my testing is showing any problem.
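One possible source of the "Clients was not found" symptom, offered as a sketch rather than a confirmed diagnosis: Select-String and Compare-Object ignore case by default, but Select-Object -Unique compares strings case-sensitively, and Compare-Object pairs items one-to-one. If the matched text is collected as it appears in the files (as in the second variation), the case variants survive -Unique and come back on the '=>' side. The values below are hypothetical stand-ins.

```powershell
$patterns = @('CLIENTS')
$found    = 'CLIENTS', 'Clients', 'clients'   # matched text as it might appear in the files

# Select-Object -Unique is case-sensitive, so all three variants are kept.
$matched = @($found | Select-Object -Unique)

# Compare-Object pairs each keyword with at most one match; the unpaired case variants
# are returned with SideIndicator '=>' (present only on the right-hand side).
$diff = @(Compare-Object $patterns $matched)
$diff | Format-Table InputObject, SideIndicator -AutoSize

# Keeping only '<=' limits the report to keywords from the list that had no match at all.
$notFound = @($diff | Where-Object { $_.SideIndicator -eq '<=' } | ForEach-Object { $_.InputObject })
"Keywords not found: $($notFound.Count)"
```

Filtering on SideIndicator keeps Compare-Object in place; an alternative with the same effect would be `$patterns | Where-Object { $matched -notcontains $_ }`, since -notcontains also ignores case.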