AlivewithTechnology

asked on

Need a regular expression for preg_match

Using PHP I want to search through a string and build an array of results of text that is between {% and %}

For example, I have a string that's a template of a page and will contain things like {%mainBody%} and {%header%}

I want to be able to run through the string and return an array containing just the names, such as mainBody and header.

I have no prior experience with regular expressions, and although I've searched Google I'm having no luck finding what I need.
kebabs

Crude but it works. Replace $array and $html with your variable names.
preg_match('/{%([^%]+)%}/e', '$array[\'\\1\']', $html);


That's supposed to be preg_replace... sorry.
preg_replace('/{%([^%]+)%}/e', '$array[\'\\1\']', $html);
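
Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the line above only runs on older versions. A minimal sketch of the same substitution using preg_replace_callback() follows. The contents of $array and $html are made-up sample data, and leaving unknown tags untouched is an assumption of this sketch, not something the original line does.

```php
<?php
// Hypothetical sample data: $array maps tag names to their replacement text.
$array = ['mainBody' => '<p>Body text</p>', 'header' => '<h1>Title</h1>'];
$html  = '<div>{%header%}</div><div>{%mainBody%}</div>{%unknown%}';

// Replace each {%name%} with $array['name']; leave unknown tags untouched.
$result = preg_replace_callback(
    '/{%([^%]+)%}/',
    function ($m) use ($array) {
        // $m[0] is the full tag, $m[1] the name between {% and %}.
        return array_key_exists($m[1], $array) ? $array[$m[1]] : $m[0];
    },
    $html
);

echo $result, "\n";
```

The callback receives the same capture group that '\\1' referred to in the /e version, so no code has to be built from a string.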


The attached script outputs ...

array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(9) "{% and %}"
    [1]=>
    string(12) "{%mainBody%}"
    [2]=>
    string(10) "{%header%}"
  }
}
<?php
preg_match_all
	(
	'`{%(?!.%}).*?%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches
	);
var_dump($a_Matches);
?>


If there are only supposed to be letters (no numbers, spaces, etc), then the new output is ...

array(1) {
  [0]=>
  array(4) {
    [0]=>
    string(12) "{%mainBody%}"
    [1]=>
    string(10) "{%header%}"
    [2]=>
    string(12) "{%mainBody%}"
    [3]=>
    string(10) "{%header%}"
  }
}

As I'm looking at THIS page as my source, every time I add a {%xxxx%} it will be found by the code, so really, just rely on the regex.
<?php
preg_match_all
	(
	'`{%(?!.%})[a-z]*?%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches
	);
var_dump($a_Matches);
?>


That's right isn't it?

And use the preg_replace to make the changes:

e.g. {%mainBody%} -> $array['mainBody']
No.

"Using PHP I want to search through a string and build an array of results of text that is between {% and %}"

"Search" and "build an array of results".

That's preg_match_all()'s territory.

preg_match will only find the first entry.
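
To illustrate the difference, here is a small sketch that runs both functions over the same string. The template string and the simplified pattern are assumptions made for the example only.

```php
<?php
// Hypothetical template string, used only for illustration.
$template = '<title>{%metaTitle%}</title><body>{%mainBody%}</body>';
$pattern  = '/{%(.*?)%}/s';

// preg_match() stops after the first match.
preg_match($pattern, $template, $first);
print_r($first);   // index 0 holds the full match, index 1 the captured name

// preg_match_all() keeps scanning and collects every match.
preg_match_all($pattern, $template, $all);
print_r($all[1]);  // the captured names only
```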

Oops. Just reread the q.

Outputs ...

array(3) {
  [0]=>
  string(8) "mainBody"
  [1]=>
  string(6) "header"
  [8]=>
  string(4) "xxxx"
}


<?php
preg_match_all
	(
	'`{%(?!.%})([a-z]*?)%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches,
	PREG_PATTERN_ORDER
	);
$a_Results = array_unique($a_Matches[1]);
var_dump($a_Results);
?>


Then what is wrong with the output in your 2 posts before?

You can parenthesise the [a-z]*? part to isolate the text in between in the results if you want.
So, $a_Results now contains the tags that exist in the template.
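
The gap in the keys of the output above ([8] straight after [1]) is there because array_unique() keeps the original key of each first occurrence. If consecutive keys are wanted, array_values() can renumber them. A short sketch, with a made-up list of captured names standing in for $a_Matches[1]:

```php
<?php
// Hypothetical capture-group results, as preg_match_all() might return them in $a_Matches[1].
$names = ['mainBody', 'header', 'mainBody', 'header', 'xxxx'];

// array_unique() keeps the first occurrence of each value and its original key.
$unique = array_unique($names);

// array_values() renumbers the keys from 0.
$a_Results = array_values($unique);

var_dump($unique, $a_Results);
```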

Ignore that last post, we posted about the same time ;)
kebabs, yeah, I missed that, but my fix came in just before your comment on the lack of capturing  ().

Oops! Hey. I'm only an expert, not god!
Yes (referring to post about $a_Results)
Ok, so all sorted now?
Umm... RQuadling, I was replying to you as if you were the one who asked the question. Sorry for any confusion!
What? My answer looks like I'm a newbie? Gee! So much for my ZCE then!
AlivewithTechnology

ASKER

So far this code is the closest to what I'm looking for, but it includes the brackets etc. I can remove them using str_replace if need be, though.
I tried RQuadling's solution to remove the brackets, but it only seemed to return one result.

Here is what's returned when I run the code below:


Array
(
    [0] => Array
        (
            [0] => {%metaTitle%}
            [1] => {%1mainBody%}
            [2] => {%2newsBody%}
            [3] => {%3footer%}
        )

)

<?php
$filename="templates/template.tpl"; 
$output=""; 
$file = fopen($filename, "r"); 
while(!feof($file)) { 
 
    //read file line by line into variable 
  $output = $output . fgets($file, 4096); 
  
} 
fclose ($file); 
 
preg_match_all('`{%(?!.%}).*?%}`sim',$output,$a_Matches);
 
echo "<pre>";
print_r($a_Matches);
echo "</pre>";
?>

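
One likely reason the letters-only pattern returned a single result here: three of the four tags in this template start with a digit, and [a-z] does not match digits. The sketch below assumes tag names are made of letters, digits and underscores (\w). The template contents are inlined as a hypothetical string instead of being read from templates/template.tpl.

```php
<?php
// Hypothetical template contents using the tag names from the output above.
$output = "<title>{%metaTitle%}</title>\n{%1mainBody%}\n{%2newsBody%}\n{%3footer%}";

// Letters only: tags starting with a digit are skipped.
preg_match_all('`{%([a-z]*?)%}`i', $output, $lettersOnly);

// Letters, digits and underscore: all four tags are captured without the {% %}.
preg_match_all('`{%(\w+)%}`', $output, $wordChars);

print_r($lettersOnly[1]);
print_r($wordChars[1]);
```

In the real script, $output could also be filled with a single file_get_contents($filename) call instead of the fgets() loop.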

ASKER CERTIFIED SOLUTION
ddrudik

Did you not see this ...

Output is only the tags. No {% %}

array(4) {
  [0]=>
  string(8) "mainBody"
  [1]=>
  string(6) "header"
  [8]=>
  string(4) "xxxx"
  [10]=>
  string(9) "metaTitle"
}

Based upon the current content of THIS page.
<?php
preg_match_all
        (
        '`{%(?!.%})([a-z]*?)%}`sim',
        file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
        $a_Matches,
        PREG_PATTERN_ORDER
        );
$a_Results = array_unique($a_Matches[1]);
var_dump($a_Results);
?>


You need to always read through all the comments.

kebabs and I went back and forth a bit on this one.
Running ddrudik's code on this page outputs ...

<pre>Array
(
    [0] => Array
        (
            [0] =>  and
            [1] => mainBody
            [2] => header
            [3] =>  and
            [4] => mainBody
            [5] => header
            [6] => (?!.
            [7] => mainBody
            [8] => header
            [9] => mainBody
            [10] => header
            [11] => xxxx
            [12] => (?!.
            [13] => mainBody
            [14] =>  and
            [15] => (?!.
            [16] => metaTitle
            [17] => 1mainBody
            [18] => 2newsBody
            [19] => 3footer
            [20] => (?!.
            [21] => mainBody
            [22] => header
            [23] =>
            [24] => (?!.
        )

)
</pre>

AlivewithTechnology, thanks for the question and the points.
The answer given by ddrudik doesn't match the question. Unless I'm missing something.
Actually it does, and you have shown this with the actual output from the pattern. AlivewithTechnology wanted to match all text blocks between {% and %}, which is what my pattern does. The seemingly odd matches you received when processing this page are due to the various pattern constructs and examples that contain text within {% and %}.
How come, then, my answer matches only the text which is REALLY between {% and %}?


Ah. I used an additional filter of only letters for the tag. But it could be anything.
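
The difference between the two approaches can be sketched on a short string that mentions the delimiters in prose as well as in real tags. The sample text and both simplified patterns are assumptions for illustration, not the exact patterns posted in this thread.

```php
<?php
// Hypothetical page text that mentions the delimiters in prose as well as real tags.
$text = 'text that is between {% and %}, e.g. {%mainBody%} and {%header%}';

// Anything between the delimiters: the prose fragment " and " is captured too.
preg_match_all('`{%(.*?)%}`s', $text, $anything);

// Letters only: the prose fragment contains spaces, so it is rejected.
preg_match_all('`{%([a-z]+)%}`i', $text, $letters);

print_r($anything[1]);
print_r($letters[1]);
```

Which one fits depends on what counts as a valid tag name in the template: the permissive version accepts anything, the strict one only what the character class allows.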