tequilla
asked on
Regular Expression
Hello,
My problem is that the optional group (word1|word2|word3)? is not recalled when I later reference it using $1.
Here is a very simple example:
$context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";
$context =~ s/(simple|easy|plain|simplitic)?.*Always(.*)captured!/ print($1 . "\n")/iesg;
Now, I would expect this to print "simple" followed by ", the second up to the last word of this sentence should be " to the console.
However, this does not happen. I used $& to look at what gets matched, and it seems to be nothing at all.
If I take the question mark out, then the expression matches, but the problem is that I also want the following variation of the above sentence to match:
$context = "This is a very STUPID sentence. Always, the second up to the last word of this sentence should be captured!";
The reason is that I want to capture any expression which matches the regular expression:
Always(.*)variations!
Now, if this regular expression is preceded by one of the words in the list, then I would like to know about it and capture/print it out.
Do you know a regular expression to achieve this?
Thanks,
Tim
$1, $2, ..., $9 are set to the strings matched by the first, second, ... parenthesized expression, so in your code
$1 is set to the string matching (simple|easy|plain|simplitic)
and
$2 is set to the string matching (.*)
If you want to print both, do
print("$1 - $2\n")
Also, escape the exclamation mark with a backslash, as chris18 has done (\!)
ASKER
@chris18
The problem with your regular expression is that it does not match sentences which do NOT contain one of the words in the list.
That's why I tried (word1|word2|word3)? with the question mark at the end. However, this does not work.
@lbertacco
Sorry, there is a mistake in my example. In my real code I am of course using both $1 and $2, and \! rather than a bare !.
$context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";
$context =~ s/.*?[simple,easy,plain,simplitic]+.*?Always(.*?)captured\!/ print($1-$2 . "\n")/iesg;
for $context (
    "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!",
    "This is a very STUPID sentence. Always, the second up to the last word of this sentence should be captured!",
) {
    print "$1\n$2\n" if $context =~ m/(?:.*?(simple|easy|plain|simplitic).*?|)Always(.*?)captured!/;
}
print "$1$2\n" if $context =~ m/(?:(simple|easy|plain|simplitic).*|)Always(.*?)captured!/
ASKER
@chris18
The expression [simple,easy,plain,simplitic]+ does not only match the words in the list but others as well. It also requires one of the words to appear at least once, which is not what I want. Once or not at all would be OK.
@ozo
I don't want another construct. Just one expression, no additional if statements and so on.
I think the real question is: why does (word|word2|word3)? not work?
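One possible explanation, offered as a sketch rather than a confirmed diagnosis of the original code: Perl tries the match at the leftmost position first, and there an optional group is allowed to match nothing. The following .* then runs over "simple" without capturing it, so the match succeeds while $1 stays undefined. Putting the filler inside the optional group makes the engine try the word list at each starting position before it settles for "Always" alone. The helper name first_group is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports what group 1 captured for a given pattern.
sub first_group {
    my ($text, $re) = @_;
    return "(no match)" unless $text =~ $re;
    return defined $1 ? $1 : "(undef)";
}

my $context = "This is a very simple sentence. Always, the second up to "
            . "the last word of this sentence should be captured!";

# Optional group first: it matches the empty string at position 0,
# and .* then skips over "simple" without capturing it.
my $loose = qr/(simple|easy|plain|simplitic)?.*Always(.*)captured!/is;

# Optional group wraps the word AND the filler, so a match that starts
# at "simple" is found before one that starts at "Always".
my $tight = qr/(?:(simple|easy|plain|simplitic).*?)?Always(.*?)captured!/is;

print first_group($context, $loose), "\n";   # group 1 is undefined here
print first_group($context, $tight), "\n";   # group 1 is "simple"
```

With the second pattern a sentence that contains none of the listed words still matches, only with group 1 left undefined.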
"Once or not at all would be ok."
then this should be fine
[simple,easy,plain,simplitic]+
$context = "This is a very interesting sentence. Always, the second up to the last word of this sentence should be captured!";
$context =~ s/.*?(simple|easy|plain|simplitic|[a-z]+).*?Always(.*)captured!/ print($2 . "\n")/iesg;
print "$1$2\n" if $context =~ m/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/
ASKER
@chris18
[simple,easy,plain,simplitic]+ will for example also capture:
elpmis or
ysea or
im or
pldmc
and the + sign means at least once, not zero or once.
(simple|easy|plain|simplitic|[a-z]+) I have tried myself. The problem is that it now also prints out characters/words that are not in the list.
@ozo
I'm looking for a single regular expression /..../ which will do the job.
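To illustrate the point about the bracket expression: square brackets define a character class, that is, a set of single characters, so [simple,easy,plain,simplitic]+ matches any run of those letters and commas in any order. A small sketch comparing it with a grouped alternation follows; the helper name matches is made up, and the anchors ^ and $ are added here only to make the comparison strict.

```perl
use strict;
use warnings;

my $class = qr/^[simple,easy,plain,simplitic]+$/;   # a set of characters
my $alt   = qr/^(?:simple|easy|plain|simplitic)$/;  # a list of whole words

# Hypothetical helper: "yes" if the whole word matches the pattern.
sub matches {
    my ($word, $re) = @_;
    return $word =~ $re ? "yes" : "no";
}

for my $word (qw(simple elpmis ysea im)) {
    printf "%-7s class=%-3s alternation=%s\n",
        $word, matches($word, $class), matches($word, $alt);
}
```

Only "simple" is accepted by the alternation, while the character class accepts all four strings.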
$context =~ s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured\!/print("$1 $2\n")/iesg;
The print in s//print/e will replace "simple sentence. Always, the second up to the last word of this sentence should be captured!" with "1" (the return value of print, if the print succeeds), giving "This is a very 1".
Is that what you really want? I thought not, so I replaced the s with m, but the regular expression I gave will do the job you described.
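A minimal sketch of the effect described above, relying only on core Perl behaviour: with the /e modifier the replacement is evaluated as code, and print returns 1 on success, so that 1 is what ends up in the string. The variable $changed is introduced here so that $context itself stays untouched.

```perl
use strict;
use warnings;

my $context = "This is a very simple sentence. Always, the second up to "
            . "the last word of this sentence should be captured!";

# /e evaluates the replacement as Perl code; the value of that code
# (here the return value of print) is spliced into the copy $changed.
(my $changed = $context) =~
    s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/print("$1 $2\n")/ies;

print "after s///e: $changed\n";
```

If the goal is only to print the captures, a plain match with m// leaves the string unmodified.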
#could this be what you wanted to do?
$context =~ s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured\!/$1$2\n/isg;
print $context;
ASKER
@ozo, @lbertacco
This already looks better. But strangely it does not work either. I think it should, but it doesn't.
So let me try to explain the problem again in words. What I'm trying to
achieve can be summarized as follows:
1. I'm looking for a pattern (let's call it A), which I want to capture
and print out to standard output. It recurs several times in the
text.
2. If I find the pattern A in the text, then I want to know whether
or not a certain word (let's call it B) out of a word list (let's call
it C) precedes pattern A.
3. In the end I want the following result: If A is found, preceded by
a word B out of C, then print A;B;. If A is found, but none of the words
out of C precedes A, then just print A;;.
For example:
"This is an example, which hopefully helps me and you to solve my
problem. I would buy a used computer for 50 Dollars but I wouldn't buy
it for 1000 Dollars. I definitely would buy an apartment for 3000
Dollars or a Miro for 1500 Dollars but not for 5000000 Dollars. For 50
Dollars you can hire me as a perl programmer - but I guess I'm not
worth the Dollar:)"
I would like the following result:
50;computer;
1000;;
3000;apartment;
1500;Miro;
5000000;;
50;;
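A possible single-expression approach to the specification above, offered as a sketch under assumptions and not as the accepted solution: the word list C is assumed to be computer, apartment and Miro (the post does not spell it out), and pattern A is assumed to be a number followed by the word Dollars. The sub name find_prices is made up for this example.

```perl
use strict;
use warnings;

# Assumed word list C; the original post does not state it explicitly.
my $words = qr/computer|apartment|Miro/;

# Hypothetical helper: returns one "A;B;" string per price found.
sub find_prices {
    my ($text) = @_;
    my @rows;
    # The optional group holds a list word plus lazy filler. With /g each
    # search resumes after the previous price, so a word is only reported
    # when it lies between the previous price and the current one.
    while ($text =~ /(?:\b($words)\b.*?)?\b(\d+)\s+Dollars/gs) {
        my ($word, $amount) = ($1, $2);
        push @rows, $amount . ";" . (defined $word ? $word : "") . ";";
    }
    return @rows;
}

my $text = <<'END';
This is an example, which hopefully helps me and you to solve my
problem. I would buy a used computer for 50 Dollars but I wouldn't buy
it for 1000 Dollars. I definitely would buy an apartment for 3000
Dollars or a Miro for 1500 Dollars but not for 5000000 Dollars. For 50
Dollars you can hire me as a perl programmer - but I guess I'm not
worth the Dollar:)
END

print "$_\n" for find_prices($text);
```

The sketch ties a list word to the next price that follows it, however much text lies in between; whether that is close enough to "precedes" depends on the real data.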