Solved

regular expression to match all occurrences in a string

Posted on 2011-09-06
Last Modified: 2012-05-12
Hello Experts,
I have a requirement where I read a file into a variable (in PHP, using the file_get_contents function).
The file contains multiple tags like
<h3 attr="e"><target act="filename"></target></h3> along with other tags (it is not strict XML).

I need a regular expression (in PHP) that matches the tag above and returns "filename", or an array of all such tags.

I tried using preg_match_all, but in vain.
Please help.
Question by:ansudhindra

Assisted Solution

by:Beverley Portlock
Beverley Portlock earned 250 total points
ID: 36487712
Try this

<?php

$test = '<h3 attr="e"><target act="filename">test</target></h3><h3 attr="e"><target act="filename">myfile.ext</target></h3>';

preg_match_all( '#<target.*?"filename">([^<]*?)</target#s', $test, $matches );

echo "<pre>";
print_r( $matches[1] );
echo "</pre>";

Which, using the test data above, generates

Array
(
    [0] => test
    [1] => myfile.ext
)
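
For readers unfamiliar with how preg_match_all fills its third argument, here is a small sketch that reuses the test string above. The variable names $count, $fullMatches and $captured are illustrative only and do not come from the original post.

```php
<?php
$test = '<h3 attr="e"><target act="filename">test</target></h3><h3 attr="e"><target act="filename">myfile.ext</target></h3>';

// preg_match_all returns the number of full matches (or false on error).
$count = preg_match_all('#<target.*?"filename">([^<]*?)</target#s', $test, $matches);

// With the default PREG_PATTERN_ORDER flag:
$fullMatches = $matches[0];  // text matched by the whole pattern
$captured    = $matches[1];  // text matched by the first (...) group

echo $count, "\n";
print_r($fullMatches);
print_r($captured);
```

Printing $matches[1] rather than $matches therefore shows only the captured tag contents, which is why the output above is a flat array.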

Author Comment

by:ansudhindra
ID: 36487757
Hi bportlock, thanks for your reply.
Your code is close to what I need.
What I need is the value of the "act" attribute of the "target" tag, not the tag contents, and the "target" tag should come immediately after an "h3" tag.

Expert Comment

by:Beverley Portlock
ID: 36487900
OK, how is this?

<?php

$test = '<h3 attr="e"><target act="filename1">test</target></h3>
                      <target act="filename2">myfile.ext</target>
         <h3><target act="filename3">myfile.ext</target></h3>';

preg_match_all( '#<h3.*?><target.*?act="([^"]*?)">[^<]*?</target>#s', $test, $matches );

echo "<pre>";
print_r( $matches[1] );
echo "</pre>";

Note that the middle test case has no h3 tag, so it is skipped in the output:

Array
(
    [0] => filename1
    [1] => filename3
)
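
One caveat, offered as a sketch rather than a correction: with the s modifier, ".*?" can run across tag boundaries, so input such as <h3>Title</h3><target act="stray"> (where the h3 has already closed) can still produce a match. A tighter variant, assuming attribute values are double-quoted, replaces ".*?" with "[^>]*" so each part of the pattern stays inside a single tag. The function name extract_act_values and the extra "stray" test line are hypothetical and not from the thread.

```php
<?php
// Hypothetical helper: returns the act="" values of <target> tags that
// open immediately after an opening <h3> tag.
function extract_act_values($html)
{
    // [^>]* cannot cross a ">" character, so neither tag match can spill
    // into a following tag the way ".*?" can under the s modifier.
    $pattern = '#<h3[^>]*><target[^>]*\sact="([^"]*)"[^>]*>#i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$test = '<h3 attr="e"><target act="filename1">test</target></h3>
         <target act="filename2">myfile.ext</target>
         <h3><target act="filename3">myfile.ext</target></h3>
         <h3>Title</h3><target act="stray">x</target>';

print_r(extract_act_values($test));
```

This variant only looks at the opening tags, so it does not care what the target element contains or whether it is closed.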

Accepted Solution

by:
Ray Paseur earned 250 total points
ID: 36489120
Personally, I find it easier to understand REGEX if I write it out on several lines with comments, like this.  Have a close look at the variants on the act= attribute between lines 7 and 12.  This should be permissive enough to work for almost anything including HTML5 notation.
http://www.laprbass.com/RAY_temp_ansudhindra.php
Outputs:
Array
(
    [filename1] => myfile1.php
    [filename3] => myfile3.PNG
    [filename4] => this is myfile4
)

Best regards, ~Ray
<?php // RAY_temp_ansudhindra.php
error_reporting(E_ALL);
echo "<pre>";


// TEST DATA FOR THE QUESTION AT EE
$test = <<<ENDSTRING
<h3 attr="e"><target act="filename1">myfile1.php</target></h3>
             <target act="filename2">myfile2.html will not be found because the <h3> is in the wrong place</target>
<h3><target act='filename3'>myfile3.PNG</target></h3>
<h3><target act=filename4 term="foo">this is myfile4</target ></h3>
ENDSTRING;

// CONSTRUCT A REGEX
$regex
= '#'                // REGEX DELIMITER
. '\<h3.*?\>'        // THE <h3> TAG WITH WICKETS ESCAPED
. '<target.*?'       // THE target TAG WITH OPTIONAL ATTRIBUTES
. ' act='            // THE act= ATTRIBUTE
. '["\']{0,1}'       // THE QUOTE OR APOSTROPHE - OPTIONAL
. '(.*?)'            // GROUP: THE CONTENTS OF THE act ATTRIBUTE
. '["\' ]{1}'        // THE END OF THE act ATTRIBUTE WITH DOUBLE, SINGLE OR NO QUOTES
. '(.*?)'            // GROUP: WHATEVER FOLLOWS THE act ATTRIBUTE TO THE END OF THE target TAG, IF ANY
. '[>]{1}'           // THE END OF THE target TAG WITH EXACTLY ONE WICKET
. '(.*?)'            // GROUP: THE TEXT MARKED UP BY THE target TAG
. '</target\s*\>'    // THE CLOSING TARGET TAG, ALLOWING WHITESPACE BEFORE THE WICKET
. '#'                // REGEX DELIMITER
. 's'                // TREAT THE STRING AS A SINGLE LINE
. 'i'                // TREAT THE STRING AS CASE-INSENSITIVE
;

// USE THE REGEX
preg_match_all($regex, $test, $matches);

// ACTIVATE THIS TO SEE ALL OF THE MATCHED INFORMATION
// var_dump($matches);

// MAKE AN ARRAY OF KEY => VALUE PAIRS USING THE FIRST AND THIRD GROUPS
$arr = array();      // START EMPTY SO print_r() IS SAFE WHEN NOTHING MATCHES
foreach ($matches[1] as $num => $filename)
{
    $arr[$filename] = $matches[3][$num];
}

// SHOW THE WORK PRODUCT (EXPECTED TO FIND filename1, filename3 and filename4)
print_r($arr);
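
A related way to get a commented, multi-line regex is PCRE's x (extended) modifier, which ignores unescaped whitespace in the pattern and treats "#" as the start of a comment. The sketch below is an alternative layout, not Ray's code: it assumes "~" as the delimiter (so "#" stays free for comments), uses a branch-reset group (?|...) so that all three quoting styles capture into group 1, and swaps ".*?" for "[^>]*" inside tags.

```php
<?php
// Same test data as in the script above.
$test = <<<ENDSTRING
<h3 attr="e"><target act="filename1">myfile1.php</target></h3>
             <target act="filename2">myfile2.html will not be found because the <h3> is in the wrong place</target>
<h3><target act='filename3'>myfile3.PNG</target></h3>
<h3><target act=filename4 term="foo">this is myfile4</target ></h3>
ENDSTRING;

$regex = '~
    <h3[^>]*>            # opening h3 tag with any attributes
    <target\b[^>]*?      # target tag, skipping attributes that precede act
    \sact=               # the act attribute name
    (?|                  # branch reset: every alternative captures into group 1
        "([^"]*)"        #   double-quoted value
      | \'([^\']*)\'     #   single-quoted value
      | ([^\s>]+)        #   unquoted value
    )
    [^>]*>               # rest of the opening target tag
    (.*?)                # group 2: text inside the target tag
    </target\s*>         # closing tag, tolerating whitespace before >
~xsi';

preg_match_all($regex, $test, $matches);

// Key each captured act value to the text of its target tag.
$arr = array();
foreach ($matches[1] as $num => $filename) {
    $arr[$filename] = $matches[2][$num];
}

print_r($arr);
```

Because the comments live inside the pattern string, they must not contain the delimiter or an unescaped apostrophe.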

Expert Comment

by:Beverley Portlock
ID: 36489240
Ray said: "Personally, I find it easier to understand REGEX if I write it out on several lines with comments"

I find that just makes it even less comprehensible - something that I never thought was possible with regexes...

;-)

Expert Comment

by:Ray Paseur
ID: 36489280
@bportlock: Yes, REGEX is an excursion through the looking glass into a land where the entire language is made up of almost nothing but punctuation.  Who would think of such a thing?  Oh, a 1950s mathematician.  Figures.
http://en.wikipedia.org/wiki/Regular_expression

Best to all, over and out, ~Ray

Expert Comment

by:tel2
ID: 36499561
Nice work, Ray.  Well laid out (even if your comments are SHOUTING at me).

BTW, do you know why people (including you, I see), generally use "//", as opposed to the shorter "#", for comments in PHP?

Expert Comment

by:Beverley Portlock
ID: 36501407
"BTW, do you know why people (including you, I see), generally use "//", as opposed to the shorter "#", for comments in PHP?"

Speaking for myself, I came to PHP via C++ and Java and just carried over the habit of using //.

Author Closing Comment

by:ansudhindra
ID: 36501428
awesome answers... thanks guys.....

Expert Comment

by:Ray Paseur
ID: 36503314
Thanks for the points.  @tel2: No real reason for // vs # except habit.  The value of having SHOUTING COMMENTS is twofold:  It makes them easier to see when I glance at my code (and I can search with case-sensitive inspections).  And it tells novice programmers how IMPORTANT COMMENTS CAN BE!

Expert Comment

by:tel2
ID: 36506974
Thanks Ray.
Well you're not alone in your habit, coz I don't recall ever seeing PHP code with "#" for comments, and I wonder how that habit started, if "#" is a valid (and more concise) alternative.  Unless people find "//" easier to spot, of course.  Or was "#" a more recent addition to PHP, than "//"?
I don't know much PHP, but Perl and shell scripts use "#", so that's what I tend to use in PHP.
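
On the comment-style question: PHP accepts "//", "#" and "/* ... */" comments. The minimal sketch below, with a made-up variable name, shows them side by side. One caveat: from PHP 8.0 onward, "#[" opens an attribute, so a "#" comment should not start with "[".

```php
<?php
// C++-style single-line comment.
# Shell/Perl-style single-line comment; PHP treats it the same way.
/* C-style block comment,
   which can span lines. */

$styles = array('//', '#', '/* */');  // a trailing comment
$count  = count($styles);             # also a trailing comment

echo $count, "\n";
```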