Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Implement Boolean Search with Regular Expression

Posted on 2014-09-04
10
293 Views
Last Modified: 2014-09-05
Hi RegEx Experts,

I want to implement a general Boolean search (AND, OR, NOT) by using a Regular Expression. Ideally, I'd like it to support parentheses to determine precedence, but if that's too tough, I can live with evaluating NOT first, AND second, OR third. I'm fine with saying that the Boolean search cannot search for AND, OR, NOT — that is, they are always interpreted as Boolean operators, not as search strings.

I understand that "|" is OR, but I don't know how to use RegEx to implement either NOT or AND. An example will be helpful. Let's say the Boolean search is:

microsoft AND windows OR vbscript NOT unix

Using default precedence of NOT first, AND second, OR third, this would get hits that do not have "unix" but do have either (1) both "microsoft" and "windows" or (2) "vbscript". What is the RegEx equivalent search?

Likewise, if the Boolean expression allows parentheses to determine precedence, let's say the Boolean search is:

microsoft AND (windows OR vbscript) NOT unix

Using this precedence, it would get hits that do not have "unix" but do have both "microsoft" and either "windows" or "vbscript". What is the RegEx equivalent search?

Thanks, Joe
0
Comment
Question by:Joe Winograd, EE MVE
  • 2
  • 2
  • 2
  • +3
10 Comments
 
LVL 84

Accepted Solution

by:
ozo earned 250 total points
ID: 40305101
^(?=.*microsoft)(?=.*(windows|vbscript))(?!.*unix)
0
 
LVL 84

Assisted Solution

by:ozo
ozo earned 250 total points
ID: 40305107
If you want
((microsoft AND windows) OR vbscript) AND NOT unix
that would be
^(((?=.*microsoft)(?=.*windows))|.*vbscript)(?!.*unix)
0
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 100 total points
ID: 40305167
Joe, this can get very confusing very quickly. If you need to maintain that code, I suggest you read first about lookaheads and the fact that using alternation is not really an OR, because when you switch the terms you can get different results.
A simple example:
test string: aba
regex1: (a|b)  -> result: a
regex2: (b|a) -> result: b

HTH,
Dan
0
Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
LVL 45

Assisted Solution

by:aikimark
aikimark earned 75 total points
ID: 40305168
for a well formed boolean expression you will need to connect the NOT unix condition with an AND operator.
0
 
LVL 75

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 75 total points
ID: 40305237
@Dan Craciun

What regex engine are you using such that that is the case (concerning alternation)?
0
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 100 total points
ID: 40305248
Yup, I  messed up the example :)
test string: aba
regex1: (a|ab)  -> result: a
regex2: (ab|a) -> result: ab

PS: @kaufmed can you please use copy/paste for my name? Obviously Crucian means something to you (I think it's the 3rd time you spell it like that), but it's not my name. Thanks.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 40305262
Just simply reading too fast. I usually try to copy/paste for that reason  = )
0
 
LVL 53

Author Closing Comment

by:Joe Winograd, EE MVE
ID: 40306726
First, my thanks to everyone who replied! And now the details:

o  Both RegEx searches from ozo worked perfectly! Well done!

o  I appreciate the links from Dan for lookaheads and alternation.

o  Thanks to aikimark for correcting my "NOT" Boolean syntax (which was also noted by ozo in his second post).

o  Thanks to kaufmed for the comment about Dan's alternation example, resulting in Dan's update to it.

Regards, Joe
0
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 40306936
This is why I love E-E.  Great question, great dialog and great learning example!

Best to all, ~Ray
0
 
LVL 53

Author Comment

by:Joe Winograd, EE MVE
ID: 40306953
Agree completely, Ray! Regards, Joe
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

There are many situations when we need to display the data in sorted order. For example: Student details by name or by rank or by total marks etc. If you are working on data driven based projects then you will use sorting techniques very frequently.…
In the distant past (last year) I hacked together a little toy that would allow a couple of Manager types to query, preview, and extract data from a number of MongoDB instances, to their tool of choice: Excel (http://dilbert.com/strips/comic/2007-08…
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

789 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question