Solved

regex pattern not quite right....

Posted on 2003-10-23
Last Modified: 2010-04-17
I have a block of text like this:
"
just a test advert phone <!-- param1: 999 param2: 3  -->&lt;<a href="javascript:;">Secure&nbsp;Number</a>&gt; or <!-- param1: 123  param2: 3  -->&lt;&nbsp;<a href="javascript:;">Secure&nbsp;Number</a>&gt;
"

I'm trying to get 2 matches from this text, capturing 2 parameters (param1 and param2) from each match.
Here's my pattern:
<!-- param1:\s*(?<phone>.*)param2:\s*(?<expiry>.*)\s*-->.*Secure&nbsp;Number(</a>)?&gt;

My first match isn't ending at the first &gt;; it ends at the last one. For example, the first match is:
<!-- param1: 999 param2: 3  -->&lt;<a href="javascript:;">Secure&nbsp;Number</a>&gt; or <!-- param1: 123  param2: 3  -->&lt;&nbsp;<a href="javascript:;">Secure&nbsp;Number</a>&gt;

when it should be:
<!-- param1: 999 param2: 3  -->&lt;<a href="javascript:;">Secure&nbsp;Number</a>&gt;

When I place a carriage return between the 2 parts it works fine.  How can I make the first match stop at the first &gt; ?
Thanks
Question by:joegass
8 Comments
 
LVL 9

Expert Comment

by:malharone
ID: 9609496
First of all, you'll find http://www.codeproject.com/dotnet/Expresso.asp useful.
LVL 9

Expert Comment

by:malharone
ID: 9609803
and second of all ....
(</?a>)? (<!-- \s* (param\d+: \s* (\d+) \s*)+ --\> &lt?;?)+ .*? Secure? &nbsp;Number(</?a>)? &gt; (\s* or \s*)?


hope this helps
LVL 2

Author Comment

by:joegass
ID: 9641321
Sorry for my delay in getting back to you!
Thanks for the link to Expresso.
I'm still not having any luck with that expression. I need to name 2 groups in my regex: one called phone, the other called expiry.

<!-- param1:\s*(?<phone>.*)param2:\s*(?<expiry>.*)\s*-->.*Secure&nbsp;Number(</a>)?&gt;

The one above works OK when it is the only match on the line, but if there isn't a line break between the two occurrences in the text, the match fails to end at the first &gt; and instead runs on to the 2nd one.

Thanks for your help
LVL 2

Author Comment

by:joegass
ID: 9641390
I think the crux of the problem is my use of .* in the middle of the expression

trying a very simple example
string = just a test advert phone Secure&nbsp;Number</a>1&gt; or Secure&nbsp;Number</a>2&gt;
pattern = (Secure&nbsp;Number){1,1}(</a>)?.*&gt;

Still returns a match of
Secure&nbsp;Number</a>1&gt; or Secure&nbsp;Number</a>2&gt;

When I'm expecting it to have "secure number" only once

I'm using a wildcard in the middle because the "<a href="javascript:;">" part of my first example may include other attributes, e.g. "<a href="javascript:;" onclick="window.open('/test/test/secureNumbers.htm','test','toolbar=no,location=no,status=no,menubar=no,scrollbars=yes,resizable=yes,width=400,height=450')" class="yacLink">"
These attributes are variable and optional.
It looks like I'm not being specific enough.
LVL 16

Accepted Solution

by:
_nn_ earned 1000 total points
ID: 9641463
From malharone's contribution, I infer that making the * metacharacter non-greedy by appending a ? is supported. So maybe the following would work:

<!-- param1:\s*(?<phone>.*?)\s*param2:\s*(?<expiry>.*?)\s*-->.*?Secure&nbsp;Number(</a>)?&gt;

I guess the reason the original one works when there's a line break is that, by default, the dot does not match end-of-line characters, so the pattern matcher was forced to find a match within the first line only.
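
For anyone who wants to try this outside .NET, here is a minimal sketch in Python. It assumes Python's re module, which spells named groups (?P<name>...) rather than (?<name>...); the variable names are only for illustration. It runs both the original greedy pattern and the non-greedy one against the sample text from the question:

```python
import re

# Sample text from the question (HTML entities kept as literal text).
text = (
    'just a test advert phone <!-- param1: 999 param2: 3  -->'
    '&lt;<a href="javascript:;">Secure&nbsp;Number</a>&gt; or '
    '<!-- param1: 123  param2: 3  -->'
    '&lt;&nbsp;<a href="javascript:;">Secure&nbsp;Number</a>&gt;'
)

# Original pattern: every .* is greedy.
greedy = re.compile(
    r'<!-- param1:\s*(?P<phone>.*)param2:\s*(?P<expiry>.*)\s*-->'
    r'.*Secure&nbsp;Number(</a>)?&gt;'
)

# Suggested pattern: every .* replaced by the non-greedy .*?
lazy = re.compile(
    r'<!-- param1:\s*(?P<phone>.*?)\s*param2:\s*(?P<expiry>.*?)\s*-->'
    r'.*?Secure&nbsp;Number(</a>)?&gt;'
)

greedy_matches = [m.group(0) for m in greedy.finditer(text)]
lazy_pairs = [(m.group('phone'), m.group('expiry')) for m in lazy.finditer(text)]

print(len(greedy_matches))  # one match that swallows both adverts
print(lazy_pairs)           # one (phone, expiry) pair per advert
```

The greedy pattern yields a single match running from the first comment to the last &gt;, while the non-greedy one yields two matches with phone values 999 and 123.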
LVL 2

Author Comment

by:joegass
ID: 9641847
That did the trick excellent - thank you
Not too sure if I follow this non-greedy match, but it works great

Thanks to malharone for your help too
LVL 16

Expert Comment

by:_nn_
ID: 9642051
>> Not too sure if I follow this non-greedy match, but it works great

The standard behavior of the * metacharacter is to try to "fit as much as it can". What it can consume depends on the preceding character (or, more precisely, class). I think examples will show this better:

regexp : "start(.*)stop"

string : "this is a start"
matched : (nothing)

string : "this is a start and this is a stop"
matched : " and this is a "
(quite normal)

string : "this is a start and this is a stop and there, just for fun, another stop"
matched : " and this is a stop and there, just for fun, another "
(the matcher took as much as it could)

string : "this is a start and this is a stop and there\n, just for fun, another stop"
matched : " and this is a "
(the matcher could not get past the \n end-of-line marker because the (.) class does not match an eol character)

Now we change to :

regexp : "start(.*?)stop"

string : "this is a start and this is a stop and there, just for fun, another stop"
matched : " and this is a "
(because we specified *? instead of * alone, the matcher stopped at the first occurrence of "stop")
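
The examples above can be reproduced with a short script. This is a sketch in Python's re module, chosen only for convenience; the helper name captured is made up for illustration, and the default dot behavior (no match on \n) is the same as described above:

```python
import re

greedy = re.compile(r'start(.*)stop')
lazy = re.compile(r'start(.*?)stop')

def captured(pattern, s):
    """Return the text captured by group 1, or None if there is no match."""
    m = pattern.search(s)
    return m.group(1) if m else None

one_line = 'this is a start and this is a stop and there, just for fun, another stop'
two_lines = 'this is a start and this is a stop and there\n, just for fun, another stop'

print(repr(captured(greedy, 'this is a start')))  # no "stop", so no match
print(repr(captured(greedy, one_line)))           # runs to the last "stop"
print(repr(captured(greedy, two_lines)))          # dot cannot cross the \n
print(repr(captured(lazy, one_line)))             # ends at the first "stop"
```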

Hope this explains.
LVL 2

Author Comment

by:joegass
ID: 9642115
Right - makes more sense when I think of it as you describe - "fit as much as it can"
I was unaware that adding a ? makes it stop at the first occurrence.
I'll add this to my (slowly) growing regex knowledge
Thanks very much for all your time