Slimshaneey
asked on
Perl Regex to extract word pairs and tuples?
Hi all, I need some help fixing my regex, which clearly doesn't work.
Take the following string:
"This is a string >>>>> And another string"
I want to be able to take word pairs (I have another method that needs triple word groups) so that I end up with matches like:
This is
is a
a string
And another
another string
For the triple then,
This is a
is a string
And another string
Any ideas how I do that? I want to avoid using loops to process this though!
#avoiding explicit loops
print map{$_||"\n"} /(\w+\s)\W*(?=(\w+)())/g;
ASKER
Ozo - That doesn't seem to work for me. I keep ending up with 3 separate arrays containing only single-word results when I use preg_match_all in PHP with that pattern.
Those were Perl statements, and no arrays were involved.
What were you doing in PHP to create arrays?
#!/usr/bin/perl
#avoiding loops
$_="This is a string >>>>> And another string";
s/(\w+)\W+(?=(\w+))/$1 $2\n/g;
print;
$_="This is a string >>>>> And another string";
s/(\w+)\W+(?=(\w+)\W+(\w+))/$1 $2 $3\n/g;
print;
Sorry, I just noticed that you don't want
string And
or
a string And
or
string And another
In that case, I might create arrays in Perl with
$_="This is a string >>>>> And another string";
@pairs = /(?=(\w+\s+\w+))\w+/g;
@triples = /(?=(\w+\s+\w+\s+\w+))\w+/g;
but I don't know how you want to create arrays in PHP.
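The pair and triple patterns above differ only in how many words the lookahead captures, so they could be folded into one parameterised pattern. What follows is only a sketch: the helper name word_groups and its arguments are made up for illustration and do not come from the thread. It relies on the same trick as the two statements above: the lookahead captures the whole group without consuming it, and the trailing \w+ consumes a single word so the next match starts one word later. Because the words are joined with \s+ rather than \W+, a group cannot span the ">>>>>" separator.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return every overlapping run
# of $n words that are separated only by whitespace.
sub word_groups {
    my ($text, $n) = @_;
    my $extra = $n - 1;    # words needed after the first one
    # The lookahead captures $n words without consuming them; the trailing
    # \w+ then consumes one word so the next attempt starts at the next word.
    return $text =~ /(?=(\w+(?:\s+\w+){$extra}))\w+/g;
}

my $str = "This is a string >>>>> And another string";
print "$_\n" for word_groups($str, 2);
print "--\n";
print "$_\n" for word_groups($str, 3);
```

In list context a /g match with one capture group returns the list of captures, so the result can be assigned straight to an array such as @pairs or @triples.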
"I want to avoid using loops"
Why? A loop is probably easier to write, and it may be faster than using the regex engine. If you're doing this only a few thousand times, the difference is not worth studying, but if it's a frequently run algorithm it might be worth measuring instead of just assuming that the loop would take longer.
See http://www.laprbass.com/RAY_temp_slimshaney.php
<?php // RAY_temp_slimshaney.php
error_reporting(E_ALL);
echo "<pre>";
// COPIED / MODIFIED FROM THE POST AT EE
$str = "This is a string And another string Ding";
// PROCESS THE STRING AS AN ARRAY
$arr = explode(' ', $str);
// COLLECT THE WORD PAIRS HERE
$out = array();
// IF THERE IS STILL DATA IN THE ARRAY
while ($arr)
{
// TAKE THE FIRST ELEMENT OFF THE ARRAY
$sub = array_shift($arr);
// CONCATENATE THE NEXT ELEMENT
$sub .= ' ' . current($arr);
// SAVE THE WORD PAIR
$out[] = $sub;
}
// DISCARD THE LAST ELEMENT
array_pop($out);
// SHOW THE INPUT AND THE WORK PRODUCT
var_dump($str);
var_dump($out);
I think a similar pattern could find triple-word groups, too. It's not clear from the three-word example in the original post what the expected output should really be.
HTH, ~Ray
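Since the thread title asks about Perl, the loop idea might look roughly like this there. It is a sketch under assumptions, not a translation of the PHP above: the helper name window_groups is invented for illustration, and it assumes that any run of characters that is neither a word character nor whitespace (such as ">>>>>") splits the text into segments that groups must not cross. The same routine then covers pairs and triples by changing the window size.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): slide a window of $n words
# over each segment of the text.
sub window_groups {
    my ($text, $n) = @_;
    my @out;
    # Assumption: punctuation runs such as ">>>>>" separate segments.
    for my $segment (split /[^\w\s]+/, $text) {
        my @words = $segment =~ /\w+/g;
        # The range is empty when the segment has fewer than $n words.
        for my $i (0 .. @words - $n) {
            push @out, join ' ', @words[$i .. $i + $n - 1];
        }
    }
    return @out;
}

my $str = "This is a string >>>>> And another string";
print "$_\n" for window_groups($str, 2);
print "--\n";
print "$_\n" for window_groups($str, 3);
```

Whether this beats the regex versions would have to be measured on real input; the sketch only shows that the loop form stays short.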
ASKER
This worked exactly as required. I'd give bonus marks for the neatness of the single regex for 2- and 3-word combos if I could! Many thanks,
Shane
print "$1 $2\n" while /(\w+)\W+(?=(\w+))/g;
print "$1 $2 $3\n" while /(\w+)(?=\W+(\w+)\W+(\w+))/g;