ghboom

regexp help

OK, I have been beating this one around for a few days and just can't get it.
I need a regexp that will match the line below,

<Item name="IP Filter Disallowed" type="string">221.141.0.194 221.141.*.*</Item>

and capture everything after string">
and before </Item>

my last try (with testing) was looking at the line
<Item name="IP Filter Disallowed" type="string">221.141.0.194</Item>

with using
$FXline = ~/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;

but $2 is always "" ;(

Thanks...

Perl_Diver

You seem to be trying to capture too many parts of the string in pattern memory. If all you need is the 221.141.0.194 part of the string, you can do something like this:

my $FXline = '<Item name="IP Filter Disallowed" type="string">221.141.0.194</Item>';
my ($match) = $FXline =~ /<Item name="IP Filter Disallowed" type="string">([\d.]+)<\/Item>/;
print $match;

Or your parentheses are screwing your regex up. Try:

$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(<\/Item>)/;

and see what $2 has then...
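
A minimal sketch of why $2 comes back empty for the test line: the original pattern never matches it at all. Alternation binds more loosely than concatenation, so `(\.[0-9]|[*]{1,3})` reads as "a dot followed by one digit, or one to three asterisks", and three repetitions of that cannot cover `.141.0.194`. Separately, the original attempt as posted has a space in `= ~`; Perl does not treat that as a binding but as an assignment of the bitwise complement of a match against `$_`. Assuming the space is a slip in the post, the sketch uses `=~`. The variable names here are illustrative only.

```perl
use strict;
use warnings;

my $line = '<Item name="IP Filter Disallowed" type="string">221.141.0.194</Item>';

# Original attempt: each repetition of the inner group is either a dot plus
# exactly ONE digit, or one to three asterisks, so "221.141.0.194" cannot match.
my $original = qr/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;

# Suggested replacement: four dot-separated runs of digits.
my $suggested = qr/(<Item name="IP Filter Disallowed" type="string">)([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(<\/Item>)/;

# Only read $2 after confirming that the match succeeded.
my $original_matched = ($line =~ $original) ? 1 : 0;
my $suggested_value  = ($line =~ $suggested) ? $2 : undef;

print "original matched: $original_matched\n";
print "suggested \$2: ", (defined $suggested_value ? $suggested_value : '(no match)'), "\n";
```

Checking the result of the match before reading $2 makes a failed match visible, instead of leaving an empty or stale capture to puzzle over.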
ghboom (ASKER)

Perl_Diver,
the reason I'm searching for the whole line is that it's part of a much bigger picture;
I only want results from THAT line ...

I just did

 ~/(<Item name="IP Filter Disallowed" type="string">)(.*)(<\/Item>)/)

and it finally worked :)

Thanks anyway! ;)

ozo

If $FXline can contain more than one item, it would be better to use a non-greedy match:
/(<Item name="IP Filter Disallowed" type="string">)(.*?)(<\/Item>)/
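
To illustrate the difference, here is a small sketch with a hypothetical input line holding two Item elements (the second element is made up for the example). The greedy `(.*)` runs on to the last `</Item>` on the line, while `(.*?)` stops at the first one.

```perl
use strict;
use warnings;

# Hypothetical input: two Item elements on the same line.
my $two_items = '<Item name="IP Filter Disallowed" type="string">221.141.0.194</Item>'
              . '<Item name="Other" type="string">x</Item>';

# Greedy: .* takes as much as it can, then backs off to the LAST </Item>.
my ($greedy) = $two_items =~ /<Item name="IP Filter Disallowed" type="string">(.*)<\/Item>/;

# Non-greedy: .*? takes as little as it can, stopping at the FIRST </Item>.
my ($lazy) = $two_items =~ /<Item name="IP Filter Disallowed" type="string">(.*?)<\/Item>/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```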
perl -MYAPE::Regex::Explain -e 'print YAPE::Regex::Explain->new(qr/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/)->explain'
The regular expression:

(?-imsx:(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(</Item>))

matches as follows:
 
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    <Item name="IP           '<Item name="IP Filter Disallowed"
    Filter Disallowed"       type="string">'
    type="string">
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    [0-9]{1,3}               any character of: '0' to '9' (between 1
                             and 3 times (matching the most amount
                             possible))
----------------------------------------------------------------------
    (                        group and capture to \3 (3 times):
----------------------------------------------------------------------
      \.                       '.'
----------------------------------------------------------------------
      [0-9]                    any character of: '0' to '9'
----------------------------------------------------------------------
     |                        OR
----------------------------------------------------------------------
      [*]{1,3}                 any character of: '*' (between 1 and 3
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    ){3}                     end of \3 (NOTE: because you're using a
                             quantifier on this capture, only the
                             LAST repetition of the captured pattern
                             will be stored in \3)
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  (                        group and capture to \4:
----------------------------------------------------------------------
    </Item>                  '</Item>'
----------------------------------------------------------------------
  )                        end of \4
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
Maybe you meant
([0-9]{1,3}([.\s]([0-9]{1,3}|[*]))*)
although
(.*?)
may suffice.
You don't need parentheses around
(<Item name="IP Filter Disallowed" type="string">)
and
(<\/Item>)
in order to match them.
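
Putting those two points together, a rough sketch against the two-address line from the question might look like this. The literal tags are matched without their own capture groups, so the value lands in the first capture. Splitting the result on whitespace is an extra step assumed here for illustration, not something the thread asks for.

```perl
use strict;
use warnings;

my $line = '<Item name="IP Filter Disallowed" type="string">221.141.0.194 221.141.*.*</Item>';

# Literal tags are matched without their own parentheses; list assignment
# keeps only the first (outermost) capture, which holds the whole value.
my ($value) = $line =~ /<Item name="IP Filter Disallowed" type="string">([0-9]{1,3}([.\s]([0-9]{1,3}|[*]))*)<\/Item>/;

# Assumed follow-up step: break the value into its individual entries.
my @entries = defined $value ? split(' ', $value) : ();

print "value: ", (defined $value ? $value : '(no match)'), "\n";
print "entry: $_\n" for @entries;
```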

ghboom (ASKER)

jmcq,
On 5/17 I replied that I had found my solution...
I guess that entitles me to a refund?

GHBoom

Well, the problem seems to have been the parentheses, as I pointed out. You just took what I suggested and replaced ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) with (.*)...

ghboom (ASKER)

mjcoyne,
with all due respect, (.*) was what I found worked without even seeing your response.
I found the program "The Regex Coach", used it, and found my solution...

With quite a bit less respect,
I came here for help. How would taking your code, changing it, and responding with my solution
benefit me in any way?
I didn't bother to refresh the page after reading Perl_Diver's solution.
Next time, think about all the possibilities before implying someone stole your idea and
reducing your reputation.

GHBoom


Your question was, essentially,

[Why] $2 is always ""

My solution gave you the answer to that question -- $2 was "" because you had your capturing parentheses screwed up.  Perl_Diver's response did not solve your problem, as you acknowledged in your response to him.

You went from "$FXline = ~/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;", in which $2 was null, to "$FXline = ~/(<Item name="IP Filter Disallowed" type="string">)(.*)(<\/Item>)/;", in which $2 works, because you've re-arranged your parentheses as I suggested.

I couldn't care less about the points -- give them to Perl_Diver or ozo, if you like -- and I have no worries about my reputation, thanks.  I just think asking for a refund, given the circumstances, is a bit much.

No worries, whatever the admin feels is appropriate is okay with me.