Solved

Help with "BRRE" Regex... search for more than 1 pattern at a time?

Posted on 2014-11-27
Last Modified: 2014-11-29
Hi Experts, I'm just getting my feet wet in RegEx and while googling around, found the BRRE RegEx Library for Delphi:

https://code.google.com/p/brre/

I had looked at a few others before that but once I took a look at the benchmark timing comparisons between BRRE and the others, I think that clinched things:

https://code.google.com/p/brre/wiki/Benchmark

My question is: Can BRRE search for multiple patterns in one pass of the data? It mentions that it has a parallel threaded sub-engine, so that leads me to believe that this may be possible.  If so, could anyone who is familiar with the BRRE library provide me with an example or two regarding usage?

Thanks!
    Shawn
Question by:shawn857

Accepted Solution

by:
ozo earned 500 total points
ID: 40470878
Do you mean like searching for /first pattern|second pattern|third pattern/ ?

Author Comment

by:shawn857
ID: 40470888
Thanks for the reply Ozo. Well, with what you suggest, wouldn't that find only ONE of the patterns, then stop? Using "|" is like an OR statement, isn't it? It just finds ONE of the acceptable alternatives then quits... or am I mistaken?
What I'd like to do (if possible) is find ALL occurrences of multiple regexes in the target text string... hopefully all in one pass. For example, I am really just searching for phone numbers. I would like to find all the 10-digit numbers, all the 11-digit numbers and all the 12-digit numbers in the target text string. Three very simple regexes like this:

\d\d\d \s \d\d\d \s \d\d\d\d              (i.e. 10 digits)
\d\d\d\d \s \d\d\d \s \d\d\d\d          (i.e. 11 digits)
\d\d \s \d\d\d\d\d\d \s \d\d\d\d        (i.e. 12 digits)

Is it possible with the BRRE engine (or any engine) to "run" all these regexes against a target text string in one go? In other words, so that only one pass of the data is needed? Or do I have to run regex #1 against the text string, get the result... then run regex #2 against the same text string, get the result... and then finally run regex #3 against the same text string?

Thanks!
    Shawn

Expert Comment

by:käµfm³d 👽
ID: 40470892
You could try using lookaheads. Using multiple lookaheads should achieve matching all patterns in one go. You may even be able to use capture groups within the lookaheads if you need to extract items.

If you're not familiar with lookahead, I have an article on the topic:

http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/A_4318-Regular-Expression-Lookaround-Demystified.html

Author Comment

by:shawn857
ID: 40470919
Thanks Kaufmed... phew, that's heavy stuff. I am *just* getting my feet wet in regex and that's a little over my head. May I ask if you've ever used the BRRE engine? It appears that it *can* handle lookahead/lookbehind (I found variables/routines with those words in the source code)... I just don't know how I'd code those into my regular expression, or how the results would get returned using BRRE. I guess what I need is some good usage examples of BRRE... but its website doesn't provide many.

Thanks
    Shawn

Expert Comment

by:ozo
ID: 40470921
If you can find ALL occurrences of a single regex, then you can find ALL occurrences of a regex containing OR clauses.

Author Comment

by:shawn857
ID: 40470928
You can Ozo?? How? I thought it only found ONE of the alternatives?

Thanks
    Shawn

Expert Comment

by:käµfm³d 👽
ID: 40470952
No, I haven't worked in Delphi. As far as I can see, BRRE supports Perl-compatible regular expressions (PCRE), so lookahead should be in there. You basically do it like this:

(?=.*first thing)(?=.*second thing)(?=.*etc)

Each (?= ... ) is a lookahead. You'd probably want to anchor the pattern with a start of string ( ^ ):

^(?=.*first thing)(?=.*second thing)(?=.*etc)

Since you're just getting started with regex, I suggest the site:  www.regular-expressions.info. It lays everything out quite simply--even lookaheads.
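
To make the difference between the two suggestions in this thread concrete, here is a minimal sketch. It uses Python's `re` module rather than BRRE (whose API is not shown anywhere in this thread), and the words `cat`, `dog` and `bird` are placeholder patterns chosen for illustration. Under those assumptions, the anchored lookahead pattern answers "are all of these present somewhere?" with a single zero-width match, while an alternation returns each occurrence:

```python
import re

# Anchored lookaheads: succeeds only if every sub-pattern occurs somewhere
# in the text. The match itself is zero-width (it consumes no characters).
ALL_PRESENT = re.compile(r"^(?=.*cat)(?=.*dog)(?=.*bird)", re.DOTALL)

# Alternation: each scan position may match any one of the alternatives,
# so repeated searching yields every occurrence of every alternative.
EITHER = re.compile(r"cat|dog|bird")


def contains_all(text):
    """Return True if all three placeholder words appear in text."""
    return ALL_PRESENT.search(text) is not None


def find_each(text):
    """Return every occurrence of any placeholder word, in text order."""
    return EITHER.findall(text)
```

So the lookahead form suits a yes/no check that every pattern appears, whereas extracting each phone number from a string is closer to what the alternation form does.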

Expert Comment

by:aikimark
ID: 40471043
Try this
\d{2,4} \s (?:\d\d\d|\d\d\d\d\d\d) \s \d\d\d\d

Since \s includes the space character, what are you matching with "\s "?
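
The point about `\s ` can be checked with a small sketch. This again uses Python's `re` module as a stand-in, not BRRE; whether BRRE offers a free-spacing mode like Python's `re.VERBOSE` is not established in this thread. In Python, a space written in a pattern is a literal space unless the verbose flag is set, so ` \s ` demands three whitespace characters between the digit groups:

```python
import re

# The 10-digit pattern exactly as posted, with spaces around each \s.
SPACED = r"\d\d\d \s \d\d\d \s \d\d\d\d"

# Default mode: the spaces in the pattern are literal, so each gap must be
# "space, any whitespace character, space".
literal = re.compile(SPACED)

# Verbose (free-spacing) mode: unescaped whitespace in the pattern is
# ignored, so each gap is a single whitespace character.
verbose = re.compile(SPACED, re.VERBOSE)


def matches(pattern, text):
    """Return True if the compiled pattern is found anywhere in text."""
    return pattern.search(text) is not None
```

With these definitions, `matches(literal, "555 123 4567")` is false while `matches(verbose, "555 123 4567")` is true, which is why the spaces are worth removing if the engine treats them literally.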

Expert Comment

by:ozo
ID: 40471239
A match of one of the alternatives in a regex is a match of the regex, so if you can find all occurrences of a regex, you can find all occurrences of a regex containing alternatives.
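
As an illustration of this point, here is a minimal sketch of the three phone-number patterns from the question combined into one alternation and scanned in a single pass. It uses Python's `re` module, not BRRE, so the calls shown are Python's and say nothing about BRRE's interface. The group names (`ten`, `eleven`, `twelve`), the digit-boundary lookarounds and the sample text are assumptions added for the example, and the literal spaces from the original patterns are dropped:

```python
import re

# One pattern, three alternatives. The lookarounds (?<!\d) and (?!\d) keep a
# match from starting or ending in the middle of a longer run of digits.
PHONE = re.compile(
    r"(?<!\d)(?:"
    r"(?P<twelve>\d{2}\s\d{6}\s\d{4})"    # 2 + 6 + 4 = 12 digits
    r"|(?P<eleven>\d{4}\s\d{3}\s\d{4})"   # 4 + 3 + 4 = 11 digits
    r"|(?P<ten>\d{3}\s\d{3}\s\d{4})"      # 3 + 3 + 4 = 10 digits
    r")(?!\d)"
)


def find_numbers(text):
    """Return (alternative_name, matched_text) for every match, in order."""
    # finditer keeps searching after each match, so all occurrences of all
    # alternatives are reported from one left-to-right scan.
    return [(m.lastgroup, m.group()) for m in PHONE.finditer(text)]


if __name__ == "__main__":
    sample = "Call 555 123 4567 or 1555 123 4567, or 44 207946 0958."
    for name, number in find_numbers(sample):
        print(name, number)
```

The named groups are optional; they only make it easy to tell which alternative produced each match.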

Author Closing Comment

by:shawn857
ID: 40471889
Hey you were right Ozo, that DOES work and it finds ALL the matches... not just one or the other. Perfect!

Thanks to the others who contributed too... but Ozo had it right on the nose.

Cheers
    Shawn