Can you determine Success/Failure of Back-Referenced Groups without matching the text?

Hi,

Is there a way to determine if a back-referenced group captured successfully, but WITHOUT trying to match the actual text?

For instance, an email address can be written two ways:
larry.smith@yahoo.com

AND

Larry Q. Smith III <larry.smith@yahoo.com>

What I'm hoping to do is put an Optional Group around the
"Larry Q. Smith III <" part and another Optional Group around the last character ">".

The problem is that I don't want to capture things like:
Larry Q. Smith III <larry.smith@yahoo.com

AND

larry.smith@yahoo.com>

--- I was hoping for a way to determine if the first group matched successfully, and if that's true, then the last > MUST be there.

I was hoping NOT to use alternation as this can be very expensive with backtracking.

I'd appreciate any help anyone has to offer.

Thank you!
Asked by: LDawggie

Terry Woods (IT Guru) commented:
> I was hoping NOT to use alternation as this can be very expensive with backtracking.

The first thing that comes to mind is: is this really a problem? Unless you are pushing the limits of performance, or have hit some other performance-related bottleneck or cost, the labour spent finding a faster solution can often cost more than just running a sub-optimal pattern.

However, I'll have a think about the actual question now...

Terry Woods (IT Guru) commented:
By the way, can you please share which language/tool you are going to run the regex on? This makes a difference as to whether it might be possible.

Terry Woods (IT Guru) commented:
This seems to work for me:
([^<]*<)?(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}(?(1)>|)

The (?(1)>|) means match > if the first captured subpattern matched, otherwise match nothing.
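
As a quick check of the conditional, here is a minimal sketch in Python, whose `re` module accepts the `(?(1)yes|no)` syntax. The thread never says which language is in use, so Python is an assumption, as is the helper name `is_valid`. It uses `fullmatch` so the whole string has to fit the pattern.

```python
import re

# Pattern quoted from the thread; (?(1)>|) demands ">" only when group 1 took part.
PATTERN = re.compile(r"([^<]*<)?(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}(?(1)>|)")


def is_valid(text):
    """Return True when the whole string is one of the two accepted forms."""
    # is_valid is a hypothetical helper name, not something from the thread.
    return PATTERN.fullmatch(text) is not None


samples = [
    "larry.smith@yahoo.com",
    "Larry Q. Smith III <larry.smith@yahoo.com>",
    "Larry Q. Smith III <larry.smith@yahoo.com",
    "larry.smith@yahoo.com>",
]

for sample in samples:
    print(f"{sample!r}: {is_valid(sample)}")
```

Anchoring matters here: an unanchored search can still find the bare address inside the two malformed strings, so the pattern needs `fullmatch` (or `^` and `$`, or similar boundaries) to reject them.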

Terry Woods (IT Guru) commented:
My pattern for the email address is this part:
(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}

You may have a better pattern for that bit, as I threw mine together in a hurry.

käµfm³d 👽 commented:
If your language supports it, you might throw the first part of Terry's pattern inside of an atomic group to mitigate backtracking:

(?>([^<]*<)?)(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}(?(1)>|)
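
A rough sketch of trying the atomic variant in Python, again an assumption since the thread does not name a language. Python's `re` module only accepts `(?>...)` from version 3.11 onward, so the sketch falls back to the non-atomic pattern on older interpreters. The names `PLAIN`, `ATOMIC`, and `accepts` are made up for illustration.

```python
import re

# Both patterns are quoted from the thread; only the atomic wrapper differs.
PLAIN = r"([^<]*<)?(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}(?(1)>|)"
ATOMIC = r"(?>([^<]*<)?)(\w+\.)*\w+@\w+(\.[a-z]{2,8}){1,2}(?(1)>|)"

try:
    # Atomic groups are supported by Python's re module from 3.11 onward.
    pattern = re.compile(ATOMIC)
except re.error:
    # Older interpreters reject (?>...); use the original pattern instead.
    pattern = re.compile(PLAIN)


def accepts(text):
    """Return True when the whole string matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


for sample in (
    "larry.smith@yahoo.com",
    "Larry Q. Smith III <larry.smith@yahoo.com>",
    "Larry Q. Smith III <larry.smith@yahoo.com",
    "larry.smith@yahoo.com>",
):
    print(f"{sample!r}: {accepts(sample)}")
```

For these four whole-string inputs the two patterns accept and reject the same strings. The atomic group only stops the engine from re-entering the optional prefix once it has been matched. Whether that saves a measurable amount of time depends on the engine and the input.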

LDawggie (Author) commented:
Thank you Terry for reminding me of the (?(1)then|else) syntax.
I must have missed that in the Regular Expressions book. :)

Thank you Kaufmed for your advice about the atomic grouping. That's a good idea!