shawn857

Help with "BRRE" Regex... search for more than 1 pattern at a time?

Hi Experts, I'm just getting my feet wet in RegEx and while googling around, found the BRRE RegEx Library for Delphi:

https://code.google.com/p/brre/

I had looked at a few others before that but once I took a look at the benchmark timing comparisons between BRRE and the others, I think that clinched things:

https://code.google.com/p/brre/wiki/Benchmark

My question is: Can BRRE search for multiple patterns in one pass of the data? It mentions that it has a parallel threaded sub-engine, so that leads me to believe that this may be possible.  If so, could anyone who is familiar with the BRRE library provide me with an example or two regarding usage?

Thanks!
    Shawn
ASKER CERTIFIED SOLUTION
ozo

shawn857

ASKER

Thanks for the reply, Ozo. Well, with what you suggest, wouldn't that find only ONE of the patterns, then stop? Using "|" is like an OR statement, isn't it? It just finds ONE of the acceptable alternatives then quits... or am I mistaken?
What I'd like to do (if possible) is find ALL occurrences of multiple regexes in the target text string... hopefully all in one pass. For example, I am really just searching for phone numbers. I would like to find all the 10-digit numbers, all the 11-digit numbers and all the 12-digit numbers in the target text string. Three very simple regexes like this:

\d\d\d \s \d\d\d \s \d\d\d\d              (i.e. 10 digits)
\d\d\d\d \s \d\d\d \s \d\d\d\d          (i.e. 11 digits)
\d\d \s \d\d\d\d\d\d \s \d\d\d\d        (i.e. 12 digits)

Is it possible with the BRRE engine (or any engine) to "run" all these regexes against a target text string in one go? In other words, so only one pass of the data is needed? Or do I have to run regex #1 against the text string, get the result... then run regex #2 against the same text string again, get the result... and then finally run regex #3 against the same text string?

Thanks!
    Shawn
kaufmed

You could try using lookaheads. Using multiple lookaheads should achieve matching all patterns in one go. You may even be able to use capture groups within the lookaheads if you need to extract items.

If you're not familiar with lookahead, I have an article on the topic:

https://www.experts-exchange.com/Programming/Languages/Regular_Expressions/A_4318-Regular-Expression-Lookaround-Demystified.html
shawn857

ASKER

Thanks Kaufmed... phew, that's heavy stuff. I am *just* getting my feet wet in regex and that's a little over my head. May I ask if you've ever used the BRRE engine? It appears that it *can* handle lookahead/lookbehind (I found variables/routines with those words in the source code)... I just don't know how I'd code those into my regular expression, or how the results would get returned using BRRE. I guess what I need is some good usage examples of BRRE... but the website for it doesn't provide many.

Thanks
    Shawn
ozo

If you can find ALL occurrences of a single regex, then you can find ALL occurrences of a regex containing OR clauses.
shawn857

ASKER

You can Ozo?? How? I thought it only found ONE of the alternatives?

Thanks
    Shawn
kaufmed

No, I haven't worked in Delphi. As far as I can see, the BRRE supports Perl-compatible regular expressions (PCRE), so lookahead should be in there. You basically do it like this:

(?=.*first thing)(?=.*second thing)(?=.*etc)


Each (?= ... ) is a lookahead. You'd probably want to anchor the pattern with a start of string ( ^ ):

^(?=.*first thing)(?=.*second thing)(?=.*etc)


Since you're just getting started with regex, I suggest the site:  www.regular-expressions.info. It lays everything out quite simply--even lookaheads.
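
To make the lookahead idea concrete, here is a minimal sketch using Python's `re` module rather than BRRE (whose API is not shown in this thread). The two sub-patterns and the name `contains_all` are illustrative assumptions. At least in Python, anchored lookaheads like these report whether every pattern is present somewhere in the text; they consume no characters, so the match itself is empty.

```python
import re

# Each (?=.*X) asserts that X occurs somewhere after the start of the string.
# The two sub-patterns are hypothetical examples, not taken from BRRE.
ALL_PRESENT = re.compile(
    r"^(?=.*\d{3}\s\d{3}\s\d{4})"   # a 10-digit shape must appear
    r"(?=.*\d{2}\s\d{6}\s\d{4})",   # and a 12-digit shape must appear
    re.DOTALL,                      # let .* cross line breaks
)

def contains_all(text):
    """True only when every lookahead finds its pattern in the text."""
    return ALL_PRESENT.search(text) is not None

print(contains_all("555 123 4567 and 44 123456 7890"))  # both shapes present
print(contains_all("555 123 4567 only"))                # 12-digit shape missing
```

This answers "does the text contain all of these?" in one pass. Listing every occurrence of each pattern is a different task.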
aikimark

Try this
\d{2,4} \s (?:\d\d\d|\d\d\d\d\d\d) \s \d\d\d\d

Since \s includes the space character, what are you matching with "\s "?
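
Whether the literal spaces around `\s` matter depends on the engine and its mode. The sketch below shows the behaviour in Python's `re` module only (an assumption for illustration; BRRE may differ): by default each space in the pattern must match a space in the text, so `" \s "` demands three whitespace characters, while in verbose mode the pattern's spaces are ignored.

```python
import re

PATTERN = r"\d\d\d \s \d\d\d \s \d\d\d\d"

# Default mode: the spaces in the pattern are literal characters.
literal = re.compile(PATTERN)
# Verbose mode: unescaped whitespace in the pattern is ignored.
verbose = re.compile(PATTERN, re.VERBOSE)

single_spaced = "555 123 4567"
triple_spaced = "555   123   4567"

print(bool(literal.search(single_spaced)))  # False: needs space + \s + space
print(bool(literal.search(triple_spaced)))  # True
print(bool(verbose.search(single_spaced)))  # True: only \s is required
print(bool(verbose.search(triple_spaced)))  # False: \s matches one character
```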
ozo

A match of one of the alternatives in a regex is a match of the regex,
so if you can find all occurrences of a regex, you can find all occurrences of a  regex containing alternatives.
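
This can be checked with a quick experiment. The sketch below uses Python's `re` module, not BRRE, so treat it as an illustration of how alternation behaves under a find-all loop rather than as BRRE usage. The single `\s` between digit groups, the `\b` boundaries, the name `find_phone_numbers`, and the sample text are all assumptions.

```python
import re

# The three phone-number shapes from the question, joined with "|".
# \b keeps a shorter shape from matching inside a longer run of digits.
PHONE = re.compile(
    r"\b(?:"
    r"\d{3}\s\d{3}\s\d{4}"    # 10 digits
    r"|\d{4}\s\d{3}\s\d{4}"   # 11 digits
    r"|\d{2}\s\d{6}\s\d{4}"   # 12 digits
    r")\b"
)

def find_phone_numbers(text):
    """Return every match of any alternative, in order of appearance."""
    return [m.group(0) for m in PHONE.finditer(text)]

sample = "Call 555 123 4567 or 1555 123 4567, fax 44 123456 7890."
print(find_phone_numbers(sample))
# ['555 123 4567', '1555 123 4567', '44 123456 7890']
```

The "|" picks one alternative per match attempt, but the find-all loop restarts the search after each match, so every occurrence of every alternative is reported in a single scan of the text.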
shawn857

ASKER

Hey you were right Ozo, that DOES work and it finds ALL the matches... not just one or the other. Perfect!

Thanks to the others who contributed too... but Ozo had it right on the nose.

Cheers
    Shawn