ghboom
asked on
regexp help
Ok, I have been beating this one around for a few days, and just can't get it.
I need a regexp that will match the line below,
<Item name="IP Filter Disallowed" type="string">221.141.0.19 4 221.141.*.*</Item>
and capture everything after string">
and before </Item>
my last try (with testing) was looking at the line
<Item name="IP Filter Disallowed" type="string">221.141.0.19 4</Item>
with using
$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;
but $2 is always "" ;(
Thanks...
Thanks
Or your parentheses are screwing your regex up. Try:
$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(<\/Item>)/;
and see what $2 has then...
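For anyone following along, here is a small sketch (not from the thread) of how to tell "the regex did not match at all" apart from "group 2 matched an empty string". The variable names $strict, $matched and $captured are made up for illustration; the sample line and the pattern are the ones from the question, with the stray spaces removed.

```perl
use strict;
use warnings;

# Sample line from the question (the trailing " 4" is part of the data).
my $FXline = '<Item name="IP Filter Disallowed" type="string">221.141.0.19 4</Item>';

# Pattern from the original question, without stray spaces.
my $strict = qr/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;

# Test the result of the match first; $2 is only meaningful after a success.
my $matched  = $FXline =~ $strict ? 1 : 0;
my $captured = $matched ? $2 : undef;

if ($matched) {
    print "matched, \$2 = '$captured'\n";
}
else {
    print "no match, so \$2 was never set by this regex\n";
}
```

On this sample line the pattern does not match: \.[0-9] accepts only a single digit after each dot, and nothing in the pattern accounts for the " 4" before </Item>.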
ASKER
perldiver,
the reason I'm searching for the whole line is that it's part of a much bigger picture,
I only want results from THAT line ...
I just did
$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)(.*)(<\/Item>)/;
and it finally worked :)
thanks anyways ! ;)
if $FXline can contain more than one item, it would be better to use
/(<Item name="IP Filter Disallowed" type="string">)(.*?)(<\/Item>)/
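To illustrate the difference, a minimal sketch assuming a line that holds two items. The second item ("Other") and the variable names are hypothetical, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical input: two items on one line (the second one is invented).
my $line = '<Item name="IP Filter Disallowed" type="string">221.141.0.19 4</Item>'
         . '<Item name="Other" type="string">x</Item>';

my $open = qr/<Item name="IP Filter Disallowed" type="string">/;

# Greedy: .* runs on to the LAST </Item> on the line.
my ($greedy) = $line =~ /$open(.*)<\/Item>/;

# Non-greedy: .*? stops at the FIRST </Item>.
my ($lazy) = $line =~ /$open(.*?)<\/Item>/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

With this input the greedy capture swallows the closing tag of the first item and most of the second item, while the non-greedy capture is just 221.141.0.19 4.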
perl -MYAPE::Regex::Explain -e 'print YAPE::Regex::Explain->new(qr/(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/)->explain'
The regular expression:
(?-imsx:(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(</Item>))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
(                        group and capture to \1:
----------------------------------------------------------------------
<Item name="IP           '<Item name="IP Filter Disallowed"
Filter Disallowed"       type="string">'
type="string">
----------------------------------------------------------------------
)                        end of \1
----------------------------------------------------------------------
(                        group and capture to \2:
----------------------------------------------------------------------
[0-9]{1,3}               any character of: '0' to '9' (between 1
                         and 3 times (matching the most amount
                         possible))
----------------------------------------------------------------------
(                        group and capture to \3 (3 times):
----------------------------------------------------------------------
\.                       '.'
----------------------------------------------------------------------
[0-9]                    any character of: '0' to '9'
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
[*]{1,3}                 any character of: '*' (between 1 and 3
                         times (matching the most amount
                         possible))
----------------------------------------------------------------------
){3}                     end of \3 (NOTE: because you're using a
                         quantifier on this capture, only the
                         LAST repetition of the captured pattern
                         will be stored in \3)
----------------------------------------------------------------------
)                        end of \2
----------------------------------------------------------------------
(                        group and capture to \4:
----------------------------------------------------------------------
</Item>                  '</Item>'
----------------------------------------------------------------------
)                        end of \4
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
maybe you meant
([0-9]{1,3}([.\s]([0-9]{1,3}|[*]))*)
although
(.*?)
may suffice
You don't need parentheses around
(<Item name="IP Filter Disallowed" type="string">)
and
(<\/Item>)
in order to match them
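A hedged sketch of that idea, run against the first line from the question. It captures only the value and matches the tags as plain text. The inner groups are written as non-capturing (?:...) here, which is my substitution rather than what was posted, so that the value lands in $1; the variable name $value is illustrative.

```perl
use strict;
use warnings;

# Line from the question.
my $FXline = '<Item name="IP Filter Disallowed" type="string">221.141.0.19 4 221.141.*.*</Item>';

# Only the wanted part is captured; the tags are matched without parentheses.
# (?:...) groups without capturing, so the single capture is returned as $1.
my ($value) = $FXline =~ /<Item name="IP Filter Disallowed" type="string">([0-9]{1,3}(?:[.\s](?:[0-9]{1,3}|[*]))*)<\/Item>/;

print defined $value ? "value: $value\n" : "no match\n";
```

In list context the match returns its captures, so $value is undefined when the line does not match, which makes the failure case easy to check.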
ASKER
jmcq
On 5/17 I replied that I had found my solution.
I guess that entitles me to a refund?
GHBoom
Well, the problem seems to have been the parentheses, as I pointed out. You just took what I suggested and replaced ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) with (.*)...
ASKER
mjcoyne ,
With all due respect, (.*) was what I found worked without even seeing your response.
I found the program "The Regex Coach", used it, and found my solution...
With quite a bit less respect,
I came here for help. How would taking your code, changing it, and responding with my solution
benefit me in any way?
I didn't bother to refresh the page after reading Perl_Diver's solution.
Next time think about all the possibilities before implying someone stole your idea and
reducing your reputation.
GHBoom
Your question was, essentially,
[Why] $2 is always ""
My solution gave you the answer to that question -- $2 was "" because you had your capturing parentheses screwed up. Perl_Diver's response did not solve your problem, as you acknowledged in your response to him.
You went from "$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)([0-9]{1,3}(\.[0-9]|[*]{1,3}){3})(<\/Item>)/;", in which $2 was null, to "$FXline =~ /(<Item name="IP Filter Disallowed" type="string">)(.*)(<\/Item>)/;", in which $2 works, because you've re-arranged your parentheses as I suggested.
I couldn't care less about the points -- give them to Perl_Diver or ozo, if you like -- and I have no worries about my reputation, thanks. I just think asking for a refund, given the circumstances, is a bit much.
No worries, whatever the admin feels is appropriate is okay with me.
ASKER CERTIFIED SOLUTION
my $FXline = '<Item name="IP Filter Disallowed" type="string">221.141.0.19
my ($match) = $FXline =~ /<Item name="IP Filter Disallowed" type="string">([\d.]+)<\/I
print $match;