Exchange email search for specific word and no variant

I need to find a tool that will let me do very specific searches of users' mailboxes in Exchange 2003. I have the mailboxes in PST form and can open them in Outlook to use Advanced Search, but that doesn't do what I need. The problem is, if I use the word "soft" as a search term, the results return not only soft as a single word, but also any word containing that string, like microsoft, software, softball, etc. I tried enclosing it in quotation marks, but that didn't help.

Is there a way to do that in Outlook Advanced Search, or does anyone know a tool I can use to do that type of search? I looked at Lucid8's Digiscope, but their tech support said they can't do that refined a search either without using regular expressions. I'm not familiar with writing regex and my search involves 31 variables, so I don't have time to learn regex well enough to write the proper string.

Or, can someone tell me how to write a regex that will find Bob, or Bob's, or Jones AND test1, or test2, or test3...test28; and preferably not include any word containing test1, or test2, etc. - just the specific word? In other words, it would find an email that contained Bob and test1, or Bob and test10, or Bob's and test15 - but wouldn't return results for Bobby's and test1 or Bob's and test1234.

Terry Woods (IT Guru) commented:
Sorry about the slow reply, but here's how it can be done:

^(?=.*\b(bob|jones)\b)(?=.*\b(tell(er)?|expensive|comp|document(ation)?|good job)\b)
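
The pattern above can be tried against a few sample strings before pointing it at real mail. This is a minimal sketch in Python rather than VB.NET, on the assumption that Python's `re` module handles these constructs the same way; `IGNORECASE` and `DOTALL` are assumed equivalents of the IgnoreCase and Singleline options mentioned in this thread, and `is_hit` and the sample strings are made up for illustration.

```python
import re

# The pattern from the post above, unchanged. IGNORECASE and DOTALL are
# assumptions that mirror the IgnoreCase and Singleline options in this thread.
PATTERN = re.compile(
    r"^(?=.*\b(bob|jones)\b)"
    r"(?=.*\b(tell(er)?|expensive|comp|document(ation)?|good job)\b)",
    re.IGNORECASE | re.DOTALL,
)

def is_hit(text):
    """Return True when text has a name AND at least one keyword as whole words."""
    return PATTERN.search(text) is not None

# Hypothetical message bodies, only for illustration.
samples = [
    "Bob said the teller was late.",
    "Jones: good job on the report",
    "Bob's documentation is expensive",
    "Bobby will complete the documents",
    "Bob thinks the job was good",
]
for s in samples:
    print(is_hit(s), "-", s)
```

"Bob's" counts as a hit for Bob because the apostrophe is not a word character, so there is a word boundary right after "Bob". The last two samples should not be hits: "Bobby", "complete" and "documents" only contain the terms, and "job ... good" is not the exact phrase "good job".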
Terry Woods (IT Guru) commented:
The regex pattern used in the code below (with singleline mode turned on) should match or not match (*mostly) as you specify. In that code, I've added a bit more to the end of the pattern to indicate where the first match is found.

^ means match the start of the string (multiline mode must be off for this; singleline mode is what lets . match newlines)
(?=xyz) is a positive lookahead for xyz
\b means match the "boundary" between a word character (a character in the set [a-zA-Z0-9_] ) and a non-word character (not in that set) or no character at all.
. is a wildcard for any character (including a newline when singleline mode is on)
* means match any number (incl zero) of the previous element, so
(?=.*\bBob\b) means look ahead any number of characters and ensure that there exists an occurrence of Bob without another "word" character on either side. You can use this technique for multiple keywords to ensure they all exist, as shown in my pattern.
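
The pieces listed above can be seen working one at a time in a short sketch. It is written in Python on the assumption that `re.DOTALL` plays the role of singleline mode; the sample text and variable names are invented for the example.

```python
import re

# Hypothetical two-line message body.
text = "Hi Bob,\nthe results for test1 are attached."

# \b keeps the keyword from matching inside a longer word.
whole_word = bool(re.search(r"\bBob\b", text))
inside_word = bool(re.search(r"\bBob\b", "Bobby wrote"))

# Two lookaheads anchored at the start act as an AND: both must succeed.
both = re.compile(r"^(?=.*\bBob\b)(?=.*\btest1\b)", re.IGNORECASE | re.DOTALL)
with_dotall = bool(both.search(text))

# Without DOTALL, "." stops at the newline, so test1 on line 2 is never seen.
line_only = re.compile(r"^(?=.*\bBob\b)(?=.*\btest1\b)", re.IGNORECASE)
without_dotall = bool(line_only.search(text))

print(whole_word, inside_word, with_dotall, without_dotall)
```

The last result is the reason singleline mode matters for email bodies: the two keywords will rarely sit on the same line.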

* You have an error in your specification though I think:
When searching for "test1", there should be no difference between returning a result containing test10 (which you specify as desired) and test1234 (which you specify as undesired). If there really is a difference, you'll need to explain what it is, such as "only one extra digit is ok".
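
To make the point concrete, here is a small Python sketch (variable names are illustrative) showing that `\btest1\b` treats test10 and test1234 the same way, plus one possible reading of "only one extra digit is ok" written as `\btest1\d?\b`. That second pattern is an assumption about what such a rule might look like, not something specified in the question.

```python
import re

candidates = ["test1", "test10", "test1234", "test1.", "pretest1"]

# Exact word only: a digit after "test1" means there is no boundary there.
exact = re.compile(r"\btest1\b", re.IGNORECASE)
exact_hits = [c for c in candidates if exact.search(c)]

# Assumed variant: allow at most one extra digit after "test1".
one_extra = re.compile(r"\btest1\d?\b", re.IGNORECASE)
one_extra_hits = [c for c in candidates if one_extra.search(c)]

print(exact_hits)
print(one_extra_hits)
```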

Now for some code (machine-generated, since I'm a PHP programmer):

VB.NET Code Example:
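Imports System
Imports System.Text.RegularExpressions

Module Module1
  Sub Main()
    ' Text to search, e.g. the body of one message
    Dim sourcestring As String = "replace with your source string"
    ' Both lookaheads must succeed: Bob and test1 as whole words.
    ' The trailing part consumes text up to the first keyword to show where it is.
    Dim re As Regex = New Regex("^(?=.*\bBob\b)(?=.*\btest1\b).*?(?:\b(Bob|test1)\b)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
    Dim mc As MatchCollection = re.Matches(sourcestring)
    Dim mIdx As Integer = 0
    For Each m As Match In mc
      ' Print every group of the match, labelled with its group name
      For groupIdx As Integer = 0 To m.Groups.Count - 1
        Console.WriteLine("[{0}][{1}] = {2}", mIdx, re.GetGroupNames()(groupIdx), m.Groups(groupIdx).Value)
      Next
      mIdx += 1
    Next
  End Sub
End Module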

I can't be of much more help putting it into code, I'm afraid, but others in the Regular Expressions or .NET zones might be able to?
Terry Woods (IT Guru) commented:
I'm assuming you can figure out how to run VB.NET code in your system somehow, which I do know is possible with Outlook at least, and presumably also Exchange.

A weakness of using \b to indicate a word boundary is that it treats _ as a word character. You can almost certainly work around this if it's a problem, but unless you want to take this further I won't go into that.

A search for Bob wouldn't find an occurrence of _Bob while you're using (?=.*\bBob\b)
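
For anyone who does hit the underscore problem, one possible workaround is to replace `\b` with lookarounds that count only letters and digits as word characters. The sketch below is in Python; the `loose` pattern is a suggestion of mine under that assumption, not something from the posts above.

```python
import re

# \b treats "_" as part of a word, so _Bob and Bob_smith are missed.
strict = re.compile(r"\bBob\b")

# Assumed alternative: only letters and digits block a match on either side.
loose = re.compile(r"(?<![A-Za-z0-9])Bob(?![A-Za-z0-9])")

samples = ["_Bob", "Bob_smith", "Bobby", "Bob's"]
strict_hits = [s for s in samples if strict.search(s)]
loose_hits = [s for s in samples if loose.search(s)]

print(strict_hits)
print(loose_hits)
```

The same lookarounds can be dropped into the lookahead form, for example `(?=.*(?<![A-Za-z0-9])Bob(?![A-Za-z0-9]))`, at the cost of a longer pattern.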
si-support (Author) commented:
Thanks Terry.  I think that may be all I need to get started.  I'll play with it and post the results.
si-support (Author) commented:
Regarding the test1 turning up test10 but not test1234, I was just using 'testx' as an example of different words.  What I really need, for example, is to look for the word 'ball' and have it return only if it finds 'ball' specifically - not as part of another word like 'football', or 'ballroom'.

Can you show me how that would look in the regex string?

Terry Woods (IT Guru) commented:
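In the part of the pattern that matches the keyword, put \b on each side of "ball", just as with Bob in (?=.*\bBob\b) earlier.

The \b character after "ball" requires a word boundary for it to match, so provided ball isn't followed by an alphanumeric character or underscore, it will match. The \b before "ball" does the same job on the other side, which rules out "football".

Note that this part won't work by itself; you'll still need to include it as part of a larger pattern, such as the lookahead form shown earlier.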
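
As a quick check of the "ball" case, here is a Python sketch that places the keyword in the same lookahead form as `(?=.*\bBob\b)`. The subject lines are invented, and `re.DOTALL` is assumed to stand in for singleline mode.

```python
import re

# Same lookahead form as (?=.*\bBob\b), with "ball" as the keyword.
has_ball = re.compile(r"^(?=.*\bball\b)", re.IGNORECASE | re.DOTALL)

# Hypothetical subject lines.
subjects = [
    "Bring the ball on Friday",
    "Football practice moved",
    "Ballroom booking",
    "ball_pit order",
    "Nice catch. Ball!",
]
hits = [s for s in subjects if has_ball.search(s)]
print(hits)
```

"ball_pit" is skipped because the underscore counts as a word character, so there is no boundary after "ball" there.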
si-support (Author) commented:
Thanks Terry. I hope I'm not pressing my luck, but could you show me the expression to use in order to search multiple documents for at least Bob, or Bob's, or Jones and at least one of tell, teller, expensive, comp, document, documentation, "good job" (where it turns up only if that phrase is found exactly, not just "good" or "job") and does not pick up any other variations of the words in the list (just 'comp' but not 'complete').

I'm just not catching on to the syntax quickly enough to do it in the time frame I have.
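
With a long term list, one option is to assemble the pattern from plain word lists instead of typing it by hand. The Python sketch below assumes the two-lookahead shape discussed in this thread; `build_pattern` is a hypothetical helper, and it relies on every term starting and ending with a letter or digit so that `\b` behaves as expected.

```python
import re

def build_pattern(names, keywords):
    """Compile a pattern requiring one whole-word name AND one whole-word keyword."""
    def group(words):
        # re.escape keeps any punctuation or spaces in a term literal.
        return "|".join(re.escape(w) for w in words)
    pattern = r"^(?=.*\b(?:%s)\b)(?=.*\b(?:%s)\b)" % (group(names), group(keywords))
    return re.compile(pattern, re.IGNORECASE | re.DOTALL)

names = ["Bob", "Jones"]
keywords = ["tell", "teller", "expensive", "comp", "document",
            "documentation", "good job"]
finder = build_pattern(names, keywords)

print(bool(finder.search("Jones did a good job")))
print(bool(finder.search("Bob will complete it")))
```

Adding or removing a search term then means editing a list rather than the expression itself.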

si-support (Author) commented:
Thanks, that's great!  With your expression and a regex cheatsheet, maybe I can better understand how this works.  In the meantime, I have a working solution to my problem.
Question has a verified solution.