regular expression to match all occurrences in a string

Posted on 2011-09-06
Last Modified: 2012-05-12
Hello Experts,
I have a requirement where I read a file into a variable (in PHP, using the file_get_contents function).
The file contains multiple tags like
<h3 attr="e"><target act="filename"></target></h3> along with other tags (it is not strict XML).

I need a regular expression (in PHP) that matches the tag above and returns "filename", or an array of all such tags.

I tried using preg_match_all, but in vain.
Please help.
Question by:ansudhindra
11 Comments
 
LVL 34

Assisted Solution

by:Beverley Portlock
Beverley Portlock earned 250 total points
ID: 36487712
Try this

<?php

$test = '<h3 attr="e"><target act="filename">test</target></h3><h3 attr="e"><target act="filename">myfile.ext</target></h3>';

preg_match_all( '#<target.*?"filename">([^<]*?)</target#s', $test, $matches );

echo "<pre>";
print_r( $matches[1] );
echo "</pre>";


Which, using the test data above, generates

Array
(
    [0] => test
    [1] => myfile.ext
)
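
For readers unsure why the answer prints $matches[1], here is a minimal sketch of how preg_match_all arranges its results. It reuses the test data and pattern from the post above; the variable names ($pattern, $byGroup, $bySet) are only illustrative and are not from the original answer.

```php
<?php
// Same test data and pattern as in the answer above.
$test = '<h3 attr="e"><target act="filename">test</target></h3>'
      . '<h3 attr="e"><target act="filename">myfile.ext</target></h3>';
$pattern = '#<target.*?"filename">([^<]*?)</target#s';

// Default order (PREG_PATTERN_ORDER): $byGroup[0] holds every full match,
// $byGroup[1] holds every capture from the first group.
$count = preg_match_all($pattern, $test, $byGroup);

// PREG_SET_ORDER: one sub-array per match; [0] is the full match, [1] the first group.
preg_match_all($pattern, $test, $bySet, PREG_SET_ORDER);

// preg_match_all() returns the number of full matches, or false on error.
if ($count === false) {
    echo "regex error\n";
} else {
    echo $count, " matches\n";
    foreach ($bySet as $set) {
        echo $set[1], "\n";
    }
}
```

PREG_SET_ORDER tends to be more convenient when each match has several groups that belong together, since one loop iteration sees all of them.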
0
 
LVL 13

Author Comment

by:ansudhindra
ID: 36487757
Hi bportlock, thanks for your reply.
Your code is close to what I need.
What I need is the value of the "act" attribute of the "target" tag, not the tag contents, and the "target" tag should come after an "h3" tag.
0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 36487900
OK, how is this?

<?php

$test = '<h3 attr="e"><target act="filename1">test</target></h3>
                      <target act="filename2">myfile.ext</target>
         <h3><target act="filename3">myfile.ext</target></h3>';

preg_match_all( '#<h3.*?><target.*?act="([^"]*?)">[^<]*?</target>#s', $test, $matches );

echo "<pre>";
print_r( $matches[1] );
echo "</pre>";


Note that the middle target has no enclosing h3 tag, so it is skipped in the output:

Array
(
    [0] => filename1
    [1] => filename3
)
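
One caveat worth sketching: with the s modifier, ".*?" in "<h3.*?>" may run past the end of the h3 tag, so the pattern above can also match a target that merely appears somewhere after an h3. The variant below is a hypothetical tightening, not part of the original answer. It assumes double-quoted act values only and allows optional whitespace between the two tags; the $stray input is made up for illustration.

```php
<?php
// Pattern from the post above, unchanged.
$loose = '#<h3.*?><target.*?act="([^"]*?)">[^<]*?</target>#s';

// Hypothetical tightened variant: [^>]* cannot run past the end of a tag,
// so <target> must directly follow the opening <h3 ...> (whitespace allowed).
$tight = '#<h3\b[^>]*>\s*<target\b[^>]*?\bact="([^"]*)"[^>]*>[^<]*</target>#i';

// Made-up input: the <target> sits in a <p> that follows an already closed <h3>.
$stray = '<h3>Title</h3><p><target act="stray">x</target></p>';

preg_match_all($loose, $stray, $looseMatches);
preg_match_all($tight, $stray, $tightMatches);

print_r($looseMatches[1]); // the loose pattern captures "stray"
print_r($tightMatches[1]); // the tightened pattern captures nothing

// On the test data from the post, the tightened variant finds the same two values.
$test = '<h3 attr="e"><target act="filename1">test</target></h3>
                      <target act="filename2">myfile.ext</target>
         <h3><target act="filename3">myfile.ext</target></h3>';
preg_match_all($tight, $test, $matches);
print_r($matches[1]);
```

Whether the stricter behaviour is wanted depends on how "after the h3 tag" is meant in the question; the sketch reads it as "immediately after".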
0
 
LVL 109

Accepted Solution

by:Ray Paseur
Ray Paseur earned 250 total points
ID: 36489120
Personally, I find it easier to understand REGEX if I write it out on several lines with comments, like this.  Have a close look at the variants on the act= attribute in the test data (the heredoc assigned to $test).  This should be permissive enough to work for almost anything including HTML5 notation.
http://www.laprbass.com/RAY_temp_ansudhindra.php
Outputs:
Array
(
    [filename1] => myfile1.php
    [filename3] => myfile3.PNG
    [filename4] => this is myfile4
)

Best regards, ~Ray
<?php // RAY_temp_ansudhindra.php
error_reporting(E_ALL);
echo "<pre>";


// TEST DATA FOR THE QUESTION AT EE
$test = <<<ENDSTRING
<h3 attr="e"><target act="filename1">myfile1.php</target></h3>
             <target act="filename2">myfile2.html will not be found because the <h3> is in the wrong place</target>
<h3><target act='filename3'>myfile3.PNG</target></h3>
<h3><target act=filename4 term="foo">this is myfile4</target ></h3>
ENDSTRING;

// CONSTRUCT A REGEX
$regex
= '#'                // REGEX DELIMITER
. '\<h3.*?\>'        // THE <h3> TAG WITH WICKETS ESCAPED
. '<target.*?'       // THE target TAG WITH OPTIONAL ATTRIBUTES
. ' act='            // THE act= ATTRIBUTE
. '["\']{0,1}'       // THE QUOTE OR APOSTROPHE - OPTIONAL
. '(.*?)'            // GROUP: THE CONTENTS OF THE act ATTRIBUTE
. '["\' ]{1}'        // THE END OF THE act ATTRIBUTE WITH DOUBLE, SINGLE OR NO QUOTES
. '(.*?)'            // GROUP: WHATEVER FOLLOWS THE act ATTRIBUTE TO THE END OF THE target TAG, IF ANY
. '[>]{1}'           // THE END OF THE target TAG WITH EXACTLY ONE WICKET
. '(.*?)'            // GROUP: THE TEXT MARKED UP BY THE target TAG
. '</target\>??'     // THE CLOSING TARGET TAG
. '#'                // REGEX DELIMITER
. 's'                // TREAT THE STRING AS A SINGLE LINE
. 'i'                // TREAT THE STRING AS CASE-INSENSITIVE
;

// USE THE REGEX
preg_match_all($regex, $test, $matches);

// ACTIVATE THIS TO SEE ALL OF THE MATCHED INFORMATION
// var_dump($matches);

// MAKE AN ARRAY OF KEY => VALUE PAIRS USING THE FIRST AND THIRD GROUPS
$arr = array();      // INITIALIZE SO print_r() WORKS EVEN IF NOTHING MATCHED
foreach ($matches[1] as $num => $filename)
{
    $arr[$filename] = $matches[3][$num];
}

// SHOW THE WORK PRODUCT (EXPECTED TO FIND filename1, filename3 and filename4)
print_r($arr);
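
As an alternative to concatenating commented string fragments, PCRE's x modifier allows whitespace and # comments inside the pattern itself. The following is a rough sketch under stated assumptions, not a drop-in replacement for the code above: the pattern is rewritten and stricter (it uses [^>]* instead of .*?), it has two capture groups instead of three, and the ~ delimiter is chosen so that # stays free for comments. The names $regex and $found are illustrative.

```php
<?php
$test = <<<ENDSTRING
<h3 attr="e"><target act="filename1">myfile1.php</target></h3>
<h3><target act='filename3'>myfile3.PNG</target></h3>
<h3><target act=filename4 term="foo">this is myfile4</target ></h3>
ENDSTRING;

// Nowdoc, so that backslashes and quotes reach PCRE untouched.
// With the x modifier, unescaped whitespace is ignored and # starts a comment.
$regex = <<<'REGEX'
~
<h3 \b [^>]* >         # the opening h3 tag
\s*
<target \b [^>]*? \s   # the target tag, up to the space before act
act=
["']?                  # optional opening quote
([^"'\s>]*)            # group 1: the value of act
["']?                  # optional closing quote
[^>]* >                # rest of the opening target tag
(.*?)                  # group 2: the text inside the target tag
</target \s* >         # closing tag, tolerating a space before the wicket
~xsi
REGEX;

// Build key => value pairs from groups 1 and 2.
$found = array();
if (preg_match_all($regex, $test, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $found[$m[1]] = $m[2];
    }
}
print_r($found);
```

Two things to watch with this style: the delimiter character must not appear inside the comments, and a literal space has to be written as \s (or escaped) because plain spaces are ignored.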

0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 36489240
Ray said: "Personally, I find it easier to understand REGEX if I write it out on several lines with comments"

I find that just makes it even less comprehensible - something that I never thought was possible with regexes...

;-)

0
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 36489280
@bportlock: Yes, REGEX is an excursion through the looking glass into a land where the entire language is made up of almost nothing but punctuation.  Who would think of such a thing?  Oh, a 1950's mathematician.  Figures.
http://en.wikipedia.org/wiki/Regular_expression

Best to all, over and out, ~Ray
0
 
LVL 12

Expert Comment

by:tel2
ID: 36499561
Nice work, Ray.  Well laid out (even if your comments are SHOUTING at me).

BTW, do you know why people (including you, I see), generally use "//", as opposed to the shorter "#", for comments in PHP?
0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 36501407
"BTW, do you know why people (including you, I see), generally use "//", as opposed to the shorter "#", for comments in PHP?"

Speaking for myself, I came to PHP via C++ and Java and just carried the habit of using //

0
 
LVL 13

Author Closing Comment

by:ansudhindra
ID: 36501428
awesome answers... thanks guys.....
0
 
LVL 109

Expert Comment

by:Ray Paseur
ID: 36503314
Thanks for the points.  @tel2: No real reason for // vs # except habit.  The value of having SHOUTING COMMENTS is twofold:  It makes them easier to see when I glance at my code (and I can search with case-sensitive inspections).  And it tells novice programmers how IMPORTANT COMMENTS CAN BE!
0
 
LVL 12

Expert Comment

by:tel2
ID: 36506974
Thanks Ray.
Well you're not alone in your habit, coz I don't recall ever seeing PHP code with "#" for comments, and I wonder how that habit started, if "#" is a valid (and more concise) alternative.  Unless people find "//" easier to spot, of course.  Or was "#" a more recent addition to PHP, than "//"?
I don't know much PHP, but Perl and shell scripts use "#", so that's what I tend to use in PHP.
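
For reference, a tiny sketch showing that the comment styles discussed here are interchangeable for single-line comments. The variable name is made up for illustration.

```php
<?php
// C++-style single-line comment
#  shell-style single-line comment
/* C-style block comment */

$total = 1 + 2; // trailing comment
$total += 3;    # trailing comment, same effect

echo $total, "\n";
```

One difference to be aware of: as of PHP 8.0, a line starting with #[ is parsed as an attribute rather than a comment, so a # comment should not begin with a square bracket.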
0
