Regular Expression

Hello,

My problem is that the text matched by the group (word1|word2|word3)? is not available when I later reference it using $1.

Here is a very simple example:

$context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

$context =~ s/(simple|easy|plain|simplitic)?.*Always(.*)captured!/ print($1 . "\n")/iesg;

Now, I would expect this to be printing out "simple" followed by ", the second up to the last word of this sentence should be " to the console.

However, this does not happen. I used $& to look at what gets matched and it seems nothing at all.

If I take the question mark out, then the expression matches, but the problem is that I also want the following variation of the above sentence to match:

$context = "This is a very STUPID sentence. Always, the second up to the last word of this sentence should be captured!";

The reason is that I want to capture any expression which matches the regular expression:

Always(.*)variations!

Now, if this regular expression is preceded by one of the words in the list, then I would like to know about it and capture/print it out.

Do you know a regular expression to achieve this?

Thanks,
Tim

Chris S

$context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

$context =~ s/.*?(simple|easy|plain|simplitic).*?Always(.*?)captured\!/ print($2 . "\n")/iesg;

lbertacco

$1, $2, ..., $9 are set to the text matched by the first, second, ... parenthesized group, so in your code
$1 is set to the string matching (simple|easy|plain|simplitic)
and
$2 is set to the string matching (.*)

If you want to print both, do
print("$1 - $2\n")

Also, escape the exclamation mark with a backslash, as chris18 has done (\!)
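
To make the group numbering concrete, here is a minimal sketch using the sentence from the question. It assumes the optional "?" is left off so that a listed word must be present; the @groups array is only a helper name introduced for this illustration.

```perl
use strict;
use warnings;

my $context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the word alternation, group 2 is the (.*) after "Always".
my @groups;
if ($context =~ /(simple|easy|plain|simplitic).*Always(.*)captured!/i) {
    @groups = ($1, $2);
    print "1: $1\n";
    print "2: $2\n";
}
```

Note that the ".*" between the two groups is not inside any parentheses, so it does not end up in $1 or $2.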

tequilla

ASKER

@chris18
The problem with your regular expression is that it does not match sentences which do NOT contain one of the words in the list.

That's why I tried (word1|word2|word3)? with the question mark at the end. However, this does not work.

@lbertacco
Sorry, there is a mistake in my example. In my real example I'm using $1 and $2 of course and also not a single ! but a \!.

Chris S

$context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

$context =~ s/.*?[simple,easy,plain,simplitic]+.*?Always(.*?)captured\!/ print($1-$2 . "\n")/iesg;

ozo

for $context (
    "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!",
    "This is a very STUPID sentence. Always, the second up to the last word of this sentence should be captured!",
) {
    print "$1\n$2\n" if $context =~ m/(?:.*?(simple|easy|plain|simplitic).*?|)Always(.*?)captured!/;
}

ozo

print "$1$2\n" if $context =~ m/(?:(simple|easy|plain|simplitic).*|)Always(.*?)captured!/;

tequilla

ASKER

@chris18

The expression [simple,easy,plain,simplitic]+ is a character class, so it matches not only the words in the list but any run of those letters. It also requires at least one such character to appear, which is not what I want. Once or not at all would be ok.

@ozo
I don't want another construct. Just one expression, without additional if statements and so on.

I think the question really is, why (word|word2|word3)? does not work?
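
One way to see what is probably happening, as a hedged sketch rather than a definitive diagnosis: the match is attempted at offset 0 first, where "This" is not one of the listed words, so the optional group is allowed to match nothing and the following ".*" swallows "simple". The overall match succeeds, but $1 stays undefined. The variable names below ($lead_word, $lead_start, $anchored_word) are made up for the illustration.

```perl
use strict;
use warnings;

my $context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

# Optional group first: the engine succeeds at offset 0 by skipping the group,
# so the first capture is undefined even though "simple" is in the sentence.
my ($lead_word) = $context =~ /(simple|easy|plain|simplitic)?.*Always(.*)captured!/i;
my $lead_start = $-[0];    # offset where the successful match began

# Making "word plus filler" optional as one unit forces a non-skipped match to
# begin at the word itself, because nothing else may precede "Always".
my ($anchored_word) = $context =~ /(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/is;

print "optional group first: ", (defined $lead_word ? $lead_word : 'undef'), "\n";
print "match started at offset $lead_start\n";
print "optional (word + filler): ", (defined $anchored_word ? $anchored_word : 'undef'), "\n";
```

The second pattern still matches a sentence without any listed word. In that case the match simply starts at "Always" and the first capture is undefined.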

Chris S

"Once or not at all would be ok."

then this should be fine

[simple,easy,plain,simplitic]+

Chris S

$context = "This is a very interesting sentence. Always, the second up to the last word of this sentence should be captured!";

$context =~ s/.*?(simple|easy|plain|simplitic|[a-z]+).*?Always(.*)captured!/ print($2 . "\n")/iesg;

ozo

print "$1$2\n" if $context =~ m/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/;

tequilla

ASKER

@chris18

[simple,easy,plain,simplitic]+ will for example also capture:

elpmis or
ysea or
im or
pldmc

and the + sign means at least once, not zero or once.

(simple|easy|plain|simplitic|[a-z]+) I have already tried myself. The problem is that it now also prints words that are not in the list.
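
A small sketch of the difference, assuming we anchor both patterns so that whole words are tested. $class_re, $alt_re and %result are names introduced here only for illustration.

```perl
use strict;
use warnings;

# Character class: any run of the listed characters (including the comma).
my $class_re = qr/^[simple,easy,plain,simplitic]+$/;
# Alternation: exactly one of the listed words.
my $alt_re   = qr/^(?:simple|easy|plain|simplitic)$/;

my %result;
for my $word (qw(simple elpmis ysea im easy)) {
    my $by_class = $word =~ $class_re ? 1 : 0;
    my $by_alt   = $word =~ $alt_re   ? 1 : 0;
    $result{$word} = [ $by_class, $by_alt ];
    printf "%-7s class=%d alternation=%d\n", $word, $by_class, $by_alt;
}
```

Every test word is accepted by the character class, while only "simple" and "easy" are accepted by the alternation.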

@ozo
I'm looking for a single regular expression /..../ which will do the job.

lbertacco

$context =~ s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured\!/print("$1 $2\n")/iesg;

ozo

The print in s//print/e will replace "simple sentence. Always, the second up to the last word of this sentence should be captured!" with "1" (the return value of print, if the print succeeds), giving "This is a very 1".
Is that what you really want? I thought not, so I replaced the s with m, but the regular expression I gave will do the job you described.
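
A minimal sketch of that side effect, working on a copy so the original string is kept for comparison. The names $substituted, $untouched, $word and $rest are hypothetical helpers, and the /g flag is left off because one match is enough to show the effect.

```perl
use strict;
use warnings;

my $context = "This is a very simple sentence. Always, the second up to the last word of this sentence should be captured!";

# With /e the replacement is the value of the expression; print returns 1
# on success, so the matched text is replaced by "1".
(my $substituted = $context) =~ s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/print("$1 $2\n")/ies;

# A plain match in list context returns the captures and changes nothing.
my $untouched = $context;
my ($word, $rest) = $untouched =~ /(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured!/is;

print "after s///e: $substituted\n";
print "after m//:   $untouched\n";
```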

ozo

#could this be what you wanted to do?
$context =~ s/(?:(simple|easy|plain|simplitic).*)?Always(.*?)captured\!/$1$2\n/isg;
print $context;

tequilla

ASKER

@ozo, @lbertacco

This already looks better. But strangely it does not work either. I think it should, but it doesn't.

So let me try to explain the problem again using words. What I'm trying to
achieve can be summarized as follows:

1. I'm looking for a pattern (let's call it A), which I want to capture
and print out to standard output. It recurs several times in the
text.

2. If I find the pattern A in the text, then I want to know whether
or not a certain word (let's call it B) out of a word list (let's call
it C) precedes pattern A.

3. In the end I want the following result: If A is found, preceded by
a word B out of C, then print A;B;. If A is found, but none of the words
out of C precedes A, then just print A;;.

For example:

"This is an example, which hopefully helps me and you to solve my
problem. I would buy a used computer for 50 Dollars but I wouldn't buy
it for 1000 Dollars. I definitely would buy an apartment for 3000
Dollars or a Miro for 1500 Dollars but not for 5000000 Dollars. For 50
Dollars you can hire me as a perl programmer - but I guess I'm not
worth the Dollar:)"

I would like the following result:

50;computer;
1000;;
3000;apartment;
1500;Miro;
5000000;;
50;;
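
One possible way to get this output is sketched below. It rests on several assumptions that are not stated in the thread: pattern A is taken to be "for <number> Dollars", the word list C is guessed as computer, apartment and Miro, and a word only counts as preceding A if it stands directly before "for". It is an illustration of the optional-group idea discussed above, not the accepted solution.

```perl
use strict;
use warnings;

my $text = "This is an example, which hopefully helps me and you to solve my "
         . "problem. I would buy a used computer for 50 Dollars but I wouldn't buy "
         . "it for 1000 Dollars. I definitely would buy an apartment for 3000 "
         . "Dollars or a Miro for 1500 Dollars but not for 5000000 Dollars. For 50 "
         . "Dollars you can hire me as a perl programmer - but I guess I'm not "
         . "worth the Dollar:)";

# Hypothetical word list C; the real list is not given in the thread.
my @list  = qw(computer apartment Miro);
my $words = join '|', map { quotemeta } @list;

my @rows;
# The listed word plus its trailing whitespace is optional as a unit, so a
# match can start either at the word or, failing that, at "for" itself.
while ($text =~ /(?:\b($words)\s+)?\bfor\s+(\d+)\s+Dollars/gi) {
    my $word = defined $1 ? $1 : '';
    push @rows, "$2;$word;";
}

print "$_\n" for @rows;
```

If the listed word may appear further away from the amount, the "\s+" after the word would need to become a bounded filler, and what the right bound is depends on the real text.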

ASKER CERTIFIED SOLUTION

ozo


SOLUTION

lbertacco
