Solved

PHP RegEx extraction

Posted on 2013-11-11
Last Modified: 2013-11-25
Given the following text snippet, I need some help in extracting the parameters:

$text = <<<EOT
[someothertag:no:thanks]
[mytag:charlie:doug]
[mytag:julie:ivy]
EOT;

I want to extract the fields from the mytag lines resulting in an array something like:

array(
 [0] => array('charlie', 'doug'),
 [1] => array('julie', 'ivy')
)

I thought preg_match might do this, but I have not had any joy. Can anyone help?

Thanks
BT
Question by:brothertom
7 Comments
 
LVL 22

Expert Comment

by:Ivo Stoykov
ID: 39638182
try this:

<?php
$text = <<<EOT
[someothertag:no:thanks]
[mytag:charlie:doug]
[mytag:julie:ivy]
EOT;

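// Assumes the tags sit back to back, separated only by line breaks.
$one = explode("][", str_replace(array("\r\n", "\n", "\r"), "", $text));
$arr = array();
foreach ($one as $s) {
  // Strip the remaining outer brackets, then split into fields.
  $t = explode(":", str_replace(array("[", "]"), "", $s));
  // Keep only mytag entries and drop the tag name itself.
  if ($t[0] === "mytag") {
    $arr[] = array_slice($t, 1);
  }
}
print_r($arr);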

?>


HTH

Ivo Stoykov
 

Author Comment

by:brothertom
ID: 39638218
Ah, sorry, I should have mentioned: the [..] tags can appear anywhere in the text, so they will rarely sit next to each other as ][.

It's actually a snippet from a WordPress page, so something like this (nonsense example):

$text = <<<EOT
<h1>Page title</h1>
[someothertag:no:thanks]
<p>
[mytag:charlie:doug]
some more random text
[mytag:julie:ivy]
and some trailing text
EOT;
 
LVL 12

Accepted Solution

by:
zappafan2k2 earned 300 total points
ID: 39638556
preg_match_all() should work just fine.  You didn't show us what you've tried, so I can't comment on why it didn't work for you.
$text = <<<EOT
<h1>Page title</h1>
[someothertag:no:thanks]
<p>
[mytag:charlie:doug]
some more random text
[mytag:julie:ivy]
and some trailing text
EOT;

preg_match_all('/\[([^\]]+)\]/', $text, $matches);
print_r($matches);


yields
Array
(
    [0] => Array
        (
            [0] => [someothertag:no:thanks]
            [1] => [mytag:charlie:doug]
            [2] => [mytag:julie:ivy]
        )

    [1] => Array
        (
            [0] => someothertag:no:thanks
            [1] => mytag:charlie:doug
            [2] => mytag:julie:ivy
        )

)


So you will want to look at $matches[1].  From there, you can use preg_split() to pull the tags out.
$data = array();
foreach($matches[1] as $tags) {
    $tag = preg_split('/:/', $tags);  
    $mytag = array_shift($tag); // if you will always have mytag: first
    $data[] = $tag;
}
print_r($data);


yields
Array
(
    [0] => Array
        (
            [0] => no
            [1] => thanks
        )

    [1] => Array
        (
            [0] => charlie
            [1] => doug
        )

    [2] => Array
        (
            [0] => julie
            [1] => ivy
        )

)
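
Note that the pattern above captures every bracketed tag, so the someothertag fields end up in the result as well. If only the mytag entries are wanted, one option is to put the tag name into the pattern itself. The following is a minimal sketch rather than a definitive version: the function name extract_tag_fields() and its $tag parameter are made up for illustration and do not come from this thread.

```php
<?php
// Hypothetical helper: the name and the $tag parameter are illustrative only.
function extract_tag_fields($text, $tag = 'mytag')
{
    // preg_quote() guards against regex metacharacters in the tag name.
    $pattern = '/\[' . preg_quote($tag, '/') . ':([^\]]*)\]/';
    preg_match_all($pattern, $text, $matches);

    // $matches[1] holds the text between "[mytag:" and "]" for each hit.
    $data = array();
    foreach ($matches[1] as $fields) {
        $data[] = explode(':', $fields);
    }
    return $data;
}

$text = <<<EOT
<h1>Page title</h1>
[someothertag:no:thanks]
<p>
[mytag:charlie:doug]
some more random text
[mytag:julie:ivy]
and some trailing text
EOT;

print_r(extract_tag_fields($text));
```

For the sample text this should return only the two mytag entries, array('charlie', 'doug') and array('julie', 'ivy'). Since the delimiter is a fixed string, explode() is enough for the second step and preg_split() is not strictly needed.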



 
LVL 110

Assisted Solution

by:Ray Paseur
Ray Paseur earned 100 total points
ID: 39638653
As is true of almost every programming question ever asked, the quality and variety of responses are directly related to the quality and variety of the test data.

Please see http://www.laprbass.com/RAY_temp_brothertom.php

No regular expressions are needed at all.  It's simple string/array processing!

<?php // RAY_temp_brothertom.php
error_reporting(E_ALL);

// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28290580.html
$text = <<<EOT
<h1>Page title</h1>
[someothertag:no:thanks]
<p>
[mytag:charlie:doug]
some more random text
[mytag:julie:ivy]
and some trailing text
EOT;


// ISOLATE THE ELEMENTS BY USING THE OPENING BRACKET AND THE TAG NAME
$arr = explode('[mytag:', $text);
unset($arr[0]);

// ITERATE OVER THE ISOLATED ELEMENTS
$out = array();
foreach ($arr as $str)
{
    // SKIP A TAG THAT IS NEVER CLOSED
    $end = strpos($str, ']');
    if ($end === FALSE) continue;

    // KEEP ONLY THE PART TO THE LEFT OF THE CLOSING BRACKET
    $str = substr($str, 0, $end);

    // PRODUCE THE ARRAY OF NAMES
    $out[] = explode(':', $str);
}

// SHOW THE WORK PRODUCT
echo '<pre>';
print_r($out);
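
To illustrate the point about test data, here is a small sketch that runs the explode() approach against a few awkward inputs. Everything in it is an assumption made for illustration: the wrapper name mytag_fields(), the choice to split on '[mytag:' including the opening bracket, the decision to ignore unterminated tags, and the test cases themselves.

```php
<?php
// Hypothetical wrapper around the explode() approach, for testing only.
function mytag_fields($text)
{
    $out = array();
    $parts = explode('[mytag:', $text);
    array_shift($parts);                 // discard text before the first tag
    foreach ($parts as $part) {
        $end = strpos($part, ']');
        if ($end === false) {
            continue;                    // unterminated tag: ignore it
        }
        $out[] = explode(':', substr($part, 0, $end));
    }
    return $out;
}

// Each case pairs an input with the result we expect from it.
$cases = array(
    'no tags'        => array('plain text', array()),
    'other tag only' => array('[someothertag:no:thanks]', array()),
    'two on a line'  => array('[mytag:a:b] and [mytag:c:d]',
                              array(array('a', 'b'), array('c', 'd'))),
    'similar name'   => array('[notmytag:x:y]', array()),
    'unterminated'   => array('[mytag:a:b', array()),
);

foreach ($cases as $name => $case) {
    list($input, $expected) = $case;
    $status = (mytag_fields($input) === $expected) ? 'ok' : 'FAIL';
    echo $name, ': ', $status, PHP_EOL;
}
```

Cases such as a similarly named tag or a missing closing bracket are easy to overlook when the only test input is the three-line sample from the question.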


Best regards, ~Ray
 
LVL 82

Assisted Solution

by:hielo
hielo earned 100 total points
ID: 39639042
Here you go:
$text = <<<EOT
<h1>Page title</h1>
[someothertag:no:thanks]
<p>
[mytag:charlie:doug]
some more random text
[mytag:julie:ivy]
and some trailing text
EOT;

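// \x5B and \x5D are "[" and "]"; \x5C. lets a backslash escape the next character.
// Group 1 is the first field, group 2 the second.
preg_match_all('#\x5Bmytag:(?:((?:\x5C.|[^:])*):((?:\x5C.|[^\x5D])*))\x5D#', $text, $matches);
// Pair the fields up as array(array('charlie', 'doug'), array('julie', 'ivy')).
// array_combine() would instead key by the first field and collapse duplicates.
$matches = array_map(null, $matches[1], $matches[2]);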

echo '<pre>',print_r($matches,true),'</pre>';
exit;
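
The pattern above is dense, so here is a sketch of a similar pattern written with the x (extended) modifier, which allows whitespace and comments inside the regex. It is not hielo's exact expression: the function name extract_escaped_pairs() is hypothetical, and field 1 additionally refuses an unescaped "]" so that a one-field tag cannot run on into later text. The backslash alternative appears intended to let a field contain an escaped delimiter such as \: and that is what the second sample tag exercises.

```php
<?php
// Hypothetical helper (not from the thread): extracts two-field mytag entries,
// letting a backslash escape the next character inside a field.
function extract_escaped_pairs($text)
{
    // A nowdoc keeps the backslashes exactly as written; the x modifier
    // permits whitespace and # comments inside the pattern.
    $pattern = <<<'RE'
~
\[mytag:                   # literal opening bracket and tag name
( (?: \\. | [^:\]] )* )    # field 1: escaped characters, or anything but a colon or ]
:                          # separator between the two fields
( (?: \\. | [^\]]  )* )    # field 2: escaped characters, or anything but ]
\]                         # literal closing bracket
~x
RE;
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    // With PREG_SET_ORDER each element of $matches is one complete match.
    $pairs = array();
    foreach ($matches as $m) {
        $pairs[] = array($m[1], $m[2]);
    }
    return $pairs;
}

$sample = 'x [mytag:charlie:doug] y [mytag:a\:b:c]';
print_r(extract_escaped_pairs($sample));
```

The escaped field is returned with its backslash still in place (a\:b), so a caller that wants the plain text would need to unescape it afterwards, for example with stripslashes().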


 

Author Closing Comment

by:brothertom
ID: 39676000
Thank you all - also, thanks Ray for a very interesting article on test data.
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39676278
Thanks for the points and thanks for using EE, ~Ray