Solved

Need a regular expression for preg_match

Posted on 2007-12-06
Last Modified: 2013-12-12
Using PHP I want to search through a string and build an array of results of text that is between {% and %}

For example, I have a string that's a template of a page and will contain things like {%mainBody%} and {%header%}

I want to be able to run through the string and return to an array things like just mainBody and header.

I have no prior experience with regular expressions, and although I've searched Google I'm having no luck finding what I need.
Question by:AlivewithTechnology
 
LVL 8

Expert Comment

by:kebabs
ID: 20419162
Crude but it works. Replace $array and $html with your variable names.
preg_match('/{%([^%]+)%}/e', '$array[\'\\1\']', $html);

 
LVL 8

Expert Comment

by:kebabs
ID: 20419179
That's supposed to be preg_replace... sorry.
preg_replace('/{%([^%]+)%}/e', '$array[\'\\1\']', $html);
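The /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP versions the same substitution is usually written with preg_replace_callback(). The following is only a sketch: the contents of $array and $html are made-up sample data, and leaving unknown tags untouched is an assumption, not something stated in the thread.

```php
<?php
// Hypothetical sample data: tag name => replacement text.
$array = array('mainBody' => 'Hello world', 'header' => 'My Site');
$html  = '<h1>{%header%}</h1><p>{%mainBody%}</p><p>{%unknown%}</p>';

// The callback receives the match array: [0] is the whole tag, [1] the name.
$result = preg_replace_callback(
    '/\{%([^%]+)%\}/',
    function ($m) use ($array) {
        // Assumption: tags without a replacement are left as they are.
        return isset($array[$m[1]]) ? $array[$m[1]] : $m[0];
    },
    $html
);

echo $result, "\n";
```

Note that the return value has to be assigned; neither preg_replace() nor preg_replace_callback() modifies $html in place.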

 
LVL 40

Expert Comment

by:RQuadling
ID: 20419198
The attached script outputs ...

array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(9) "{% and %}"
    [1]=>
    string(12) "{%mainBody%}"
    [2]=>
    string(10) "{%header%}"
  }
}
<?php
preg_match_all(
	'`{%(?!.%}).*?%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches
);

var_dump($a_Matches);
?>

 
LVL 40

Expert Comment

by:RQuadling
ID: 20419227
If there are only supposed to be letters (no numbers, spaces, etc), then the new output is ...

array(1) {
  [0]=>
  array(4) {
    [0]=>
    string(12) "{%mainBody%}"
    [1]=>
    string(10) "{%header%}"
    [2]=>
    string(12) "{%mainBody%}"
    [3]=>
    string(10) "{%header%}"
  }
}

As I'm looking at THIS page as my source, every time I add a {%xxxx%} it will be found by the code, so really, just rely on the regex.
<?php
preg_match_all(
	'`{%(?!.%})[a-z]*?%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches
);

var_dump($a_Matches);
?>

 
LVL 8

Expert Comment

by:kebabs
ID: 20419232
That's right isn't it?

And use the preg_replace to make the changes:

e.g. {%mainBody%} -> $array['mainBody']
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419251
No.

"Using PHP I want to search through a string and build an array of results of text that is between {% and %}"

"Search" and "build an array of results".

That's preg_match_all()'s territory.

preg_match will only find the first entry.
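The difference between the two functions can be seen in a small sketch. The template string and the variable names here are made up for illustration.

```php
<?php
// Hypothetical template text with two tags.
$template = 'Top: {%header%} Middle: {%mainBody%}';
$pattern  = '/\{%(.*?)%\}/s';

// preg_match() stops after the first match.
// $first[0] is the whole tag, $first[1] the captured name.
preg_match($pattern, $template, $first);

// preg_match_all() keeps scanning and collects every match;
// it returns the number of full matches found.
$count = preg_match_all($pattern, $template, $all);

var_dump($first[1]); // the first tag name only
var_dump($all[1]);   // every tag name, in order of appearance
```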

 
LVL 40

Expert Comment

by:RQuadling
ID: 20419267
Oops. Just reread the q.

Outputs ...

array(3) {
  [0]=>
  string(8) "mainBody"
  [1]=>
  string(6) "header"
  [8]=>
  string(4) "xxxx"
}


<?php
preg_match_all(
	'`{%(?!.%})([a-z]*?)%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches,
	PREG_PATTERN_ORDER
);

$a_Results = array_unique($a_Matches[1]);

var_dump($a_Results);
?>

 
LVL 8

Expert Comment

by:kebabs
ID: 20419269
Then what is wrong with the output in your 2 posts before?

You can parenthesise the  [a-z]*? part to isolate the text between in the results if you want.
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419277
So, $a_Results now contains the tags that exist in the template.
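One detail worth knowing: array_unique() preserves the original keys, which is why the dumped result can show gaps such as a key of 8. If consecutive keys are wanted, array_values() reindexes the array. A small sketch with made-up input standing in for $a_Matches[1]:

```php
<?php
// Hypothetical capture-group results with repeats, like $a_Matches[1].
$names = array('mainBody', 'header', 'mainBody', 'header', 'xxxx');

// The first occurrence of each value is kept, along with its original key.
$unique = array_unique($names);

// Reindex so the keys run 0, 1, 2.
$reindexed = array_values($unique);

print_r($unique);
print_r($reindexed);
```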

 
LVL 8

Expert Comment

by:kebabs
ID: 20419288
Ignore that last post, we posted about the same time ;)
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419289
kebabs, yeah, I missed that, but my fix came in just before your comment on the lack of capturing  ().

Oops! Hey. I'm only an expert, not god!
 
LVL 8

Expert Comment

by:kebabs
ID: 20419295
Yes (referring to post about $a_Results)
 
LVL 8

Expert Comment

by:kebabs
ID: 20419305
Ok, so all sorted now?

 
LVL 8

Expert Comment

by:kebabs
ID: 20419315
Umm... RQuadling, I was replying to you as if you were the one who asked the question. Sorry for any confusion!
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419332
Yep!

 
LVL 40

Expert Comment

by:RQuadling
ID: 20419336
What? My answer looks like I'm a newbie? Gee! So much for my ZCE then!
 

Author Comment

by:AlivewithTechnology
ID: 20419339
So far this code is the closest to what I'm looking for, but it includes the brackets etc. I can remove them using str_replace if need be.
I tried RQuadling's solution to remove the brackets, but it only seemed to return one result.

Here is what's returned when I run the code below;


Array
(
    [0] => Array
        (
            [0] => {%metaTitle%}
            [1] => {%1mainBody%}
            [2] => {%2newsBody%}
            [3] => {%3footer%}
        )

)

<?php
$filename="templates/template.tpl";
$output="";
$file = fopen($filename, "r");
while(!feof($file)) {
  //read file line by line into variable
  $output = $output . fgets($file, 4096);
}
fclose ($file);

preg_match_all('`{%(?!.%}).*?%}`sim',$output,$a_Matches);

echo "<pre>";
print_r($a_Matches);
echo "</pre>";
?>
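The str_replace idea mentioned in this post could look roughly like the sketch below. To keep it self-contained, $output is a made-up string standing in for the template file; with a real file, file_get_contents('templates/template.tpl') would read it in one call instead of the fgets loop.

```php
<?php
// Stand-in for the template file contents (hypothetical sample text).
$output = "<title>{%metaTitle%}</title>\n{%1mainBody%}\n{%2newsBody%}\n{%3footer%}";

// Same pattern as in the post: full matches still include the delimiters.
preg_match_all('`{%(?!.%}).*?%}`sim', $output, $a_Matches);

// str_replace() accepts an array subject, so both delimiters
// can be removed from every full match in one call.
$names = str_replace(array('{%', '%}'), '', $a_Matches[0]);

print_r($names);
```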

 
LVL 27

Accepted Solution

by:ddrudik (earned 125 total points)
ID: 20419538
<?php
$string = <<<EOF
For example, I have a string that's a template of a page and will contain things like {%mainBody%} and {%header%}

I want to be able to run through the string and return to an array things like just mainBody and header.
EOF;
$pattern = '/(?<=\{%)[^%]*(?=%\})/';
preg_match_all($pattern, $string, $array);
echo '<pre>', print_r($array, true), '</pre>';
?>
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419566
Did you not see this ...

Output is only the tags. No {% %}

array(4) {
  [0]=>
  string(8) "mainBody"
  [1]=>
  string(6) "header"
  [8]=>
  string(4) "xxxx"
  [10]=>
  string(9) "metaTitle"
}

Based upon the current content of THIS page.
<?php
preg_match_all(
	'`{%(?!.%})([a-z]*?)%}`sim',
	file_get_contents('http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/Q_23005845.html?cid=295'),
	$a_Matches,
	PREG_PATTERN_ORDER
);

$a_Results = array_unique($a_Matches[1]);

var_dump($a_Results);
?>

 
LVL 40

Expert Comment

by:RQuadling
ID: 20419582
You need to always read through all the comments.

kebabs and I went back and forth a bit on this one.
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419599
Running ddrudik's code on this page outputs ...

<pre>Array
(
    [0] => Array
        (
            [0] =>  and
            [1] => mainBody
            [2] => header
            [3] =>  and
            [4] => mainBody
            [5] => header
            [6] => (?!.
            [7] => mainBody
            [8] => header
            [9] => mainBody
            [10] => header
            [11] => xxxx
            [12] => (?!.
            [13] => mainBody
            [14] =>  and
            [15] => (?!.
            [16] => metaTitle
            [17] => 1mainBody
            [18] => 2newsBody
            [19] => 3footer
            [20] => (?!.
            [21] => mainBody
            [22] => header
            [23] =>
            [24] => (?!.
        )

)
</pre>

 
LVL 27

Expert Comment

by:ddrudik
ID: 20419615
AlivewithTechnology, thanks for the question and the points.
 
LVL 40

Expert Comment

by:RQuadling
ID: 20419627
The answer given by ddrudik doesn't match the question. Unless I'm missing something.
 
LVL 27

Expert Comment

by:ddrudik
ID: 20419700
Actually it does, and you have shown this with the actual output from the pattern. AlivewithTechnology wanted to match all text blocks between {% and %}, which is what my pattern does. The seemingly odd matches you received when processing this page are due to the various pattern constructs and examples that contain text within {% and %}.
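To illustrate the point, here is a sketch comparing the lookaround pattern from the accepted answer with an equivalent capture-group pattern. The sample sentence is made up; it deliberately contains the literal text "{% and %}" so that the " and " match shows up, as it did when the pattern was run against this page.

```php
<?php
// Hypothetical sample text containing a literal "{% and %}" plus two real tags.
$text = 'Tags sit between {% and %}, for example {%mainBody%} and {%header%}.';

// Lookaround version: the delimiters are asserted but not consumed,
// so the full match (index 0) is already the bare text.
preg_match_all('/(?<=\{%)[^%]*(?=%\})/', $text, $lookaround);

// Capture-group version: the delimiters are consumed and the
// text between them is returned as group 1.
preg_match_all('/\{%([^%]*)%\}/', $text, $captured);

print_r($lookaround[0]);
print_r($captured[1]);
```

Both calls return the same three strings here, including " and ", because anything that sits between the two delimiters counts as a match.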
 
LVL 40

Expert Comment

by:RQuadling
ID: 20420231
How come then my answer matches only the text which is REALLY between {% and %}?


 
LVL 40

Expert Comment

by:RQuadling
ID: 20420258
Ah. I used an additional filter of only letters for the tag. But it could be anything.
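The effect of that letters-only filter can be sketched against the tag names AlivewithTechnology posted. The template string is a made-up stand-in, and the [^%]+ alternative is just one possible way to accept any tag text. It would also explain why the letters-only version seemed to return a single result for that template.

```php
<?php
// Hypothetical template using the tag names from the author's output.
$template = '{%metaTitle%} {%1mainBody%} {%2newsBody%} {%3footer%}';

// Letters only: tags that contain digits cannot match [a-z]*?.
preg_match_all('`{%(?!.%})([a-z]*?)%}`sim', $template, $lettersOnly);

// Any run of characters other than "%": every tag is captured.
preg_match_all('`{%([^%]+)%}`', $template, $anything);

print_r($lettersOnly[1]);
print_r($anything[1]);
```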
