  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 933

Perl Regex to extract word pairs and tuples?

Hi all, I need some help fixing my regex, which clearly doesn't work.

Take the following string:
"This is a string >>>>> And another string"

I want to be able to take word pairs (I have another method that needs triple-word groups) so that I end up with matches like:

This is
is a
a string
And another
another string

For the triples, then:
This is a
is a string
And another string

Any ideas on how to do that? I want to avoid using loops to process this, though!
Asked by: Slimshaneey
1 Solution
 
ozo commented:
$_="This is a string >>>>> And another string";
print "$1 $2\n" while /(\w+)\W+(?=(\w+))/g;
print "$1 $2 $3\n" while /(\w+)(?=\W+(\w+)\W+(\w+))/g;
 
ozo commented:
#avoiding explicit loops
print map{$_||"\n"} /(\w+\s)\W*(?=(\w+)())/g;
 
Slimshaneey (Author) commented:
Ozo, that doesn't seem to work for me. I keep ending up with 3 separate arrays containing only single-word results when I use preg_match_all in PHP with that pattern.
 
ozo commented:
Those were Perl statements, and no arrays were involved.
What were you doing in PHP to create arrays?

#!/usr/bin/perl
#avoiding loops
$_="This is a string >>>>> And another string";
s/(\w+)\W+(?=(\w+))/$1 $2\n/g;
print;

$_="This is a string >>>>> And another string";
s/(\w+)\W+(?=(\w+)\W+(\w+))/$1 $2 $3\n/g;
print;
 
ozo commented:
Sorry, I just noticed that you don't want

string And
or
a string And
or
string And another

In that case, I might create arrays in Perl with

$_="This is a string >>>>> And another string";
@pairs = /(?=(\w+\s+\w+))\w+/g;
@triples = /(?=(\w+\s+\w+\s+\w+))\w+/g;

but I don't know how you want to create arrays in PHP.
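
For PHP, a minimal sketch of the same lookahead idea with preg_match_all might look like the following. It assumes the default PREG_PATTERN_ORDER result layout, where $m[0] holds the consumed single words and $m[1] holds the captured groups. That layout may also explain the "separate arrays of single words" seen earlier. The variable names $m, $pairs and $triples are illustrative only.

```php
<?php
// Sketch: the Perl lookahead patterns above, ported to preg_match_all.
$string = "This is a string >>>>> And another string";

// Group 1 sits inside a zero-width lookahead, so captured groups may overlap.
// The trailing \w+ consumes one word, so the next attempt starts at the following word.
preg_match_all('/(?=(\w+\s+\w+))\w+/', $string, $m);
$pairs = $m[1];   // $m[0] would be the single consumed words

preg_match_all('/(?=(\w+\s+\w+\s+\w+))\w+/', $string, $m);
$triples = $m[1];

print_r($pairs);
print_r($triples);
```

Because the words must be separated by whitespace only (\s+), groups such as "string And" that would span the ">>>>>" separator are not captured.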
 
Terry Woods (IT Guru) commented:
PHP Solution:

<?php

$string = "This is a string >>>>> And another string";
$sizes_to_try = array(2,3);

foreach ($sizes_to_try as $word_group_size) {
  print "Trying word group size $word_group_size\n";
  preg_match_all("/\b(?:\b\w+(?:\s+|(?!\s+\w)))(?=((?:\b\w+(?:\s+|(?!\s+\w))){".($word_group_size-1)."}))/",$string,$matches);
  foreach ($matches[0] as $num=>$value) {
    print "{$value}{$matches[1][$num]}\n";
  }
}


Output:
Trying word group size 2
This is
is a
a string
And another
another string
Trying word group size 3
This is a
is a string
And another string

 
Ray Paseur commented:
"I want to avoid using loops"
Why?  It's probably easier that way, and it may be faster than using the REGEX engine.  If you're doing this only a few thousand times, it's not worth studying, but if it's a frequent algorithm it might be worthy of investigation instead of just assuming that the loop would take longer.

See http://www.laprbass.com/RAY_temp_slimshaney.php
<?php // RAY_temp_slimshaney.php
error_reporting(E_ALL);
echo "<pre>";

// COPIED / MODIFIED FROM THE POST AT EE
$str = "This is a string And another string Ding";

// PROCESS THE STRING AS AN ARRAY
$arr = explode(' ', $str);

// IF THERE IS STILL DATA IN THE ARRAY
while ($arr)
{
    // TAKE THE FIRST ELEMENT OFF THE ARRAY
    $sub   = array_shift($arr);

    // CONCATENATE THE NEXT ELEMENT
    $sub  .= ' ' . current($arr);

    // SAVE THE WORD PAIR
    $out[] = $sub;
}

// DISCARD THE LAST ELEMENT
array_pop($out);

// SHOW THE INPUT AND THE WORK PRODUCT
var_dump($str);
var_dump($out);

I think a similar pattern could find triple-word groups, too.  It's not clear from the three-word example in the original post what the expected output should really be.
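
One possible generalisation of the loop approach to groups of any size is sketched below. The helper name word_groups and its parameters are hypothetical (not from the posts above), the scalar type hints assume PHP 7 or later, and it assumes that any run of characters other than word characters and whitespace, such as ">>>>>", should break a group.

```php
<?php
// Hypothetical helper: every run of $n consecutive words in $str,
// never crossing a non-word separator such as ">>>>>".
function word_groups(string $str, int $n): array
{
    $out = [];
    // Split into segments wherever something other than word characters or whitespace appears.
    foreach (preg_split('/[^\w\s]+/', $str) as $segment) {
        // Break the segment into words, ignoring empty pieces.
        $words = preg_split('/\s+/', $segment, -1, PREG_SPLIT_NO_EMPTY);
        // Slide a window of $n words across the segment.
        for ($i = 0; $i + $n <= count($words); $i++) {
            $out[] = implode(' ', array_slice($words, $i, $n));
        }
    }
    return $out;
}

$original = "This is a string >>>>> And another string";
print_r(word_groups($original, 2));
print_r(word_groups($original, 3));
```

Under those assumptions, the sliding window yields the pairs and triples listed in the original question without a final array_pop, since the window never runs past the end of a segment.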

HTH, ~Ray
 
Slimshaneey (Author) commented:
This worked exactly as required. I'd give bonus marks for the neatness of the single regex for 2- and 3-word combos if I could! Many thanks
Shane