xybx
asked on
Get text between elements with regex
OK, this ought to be a piece of cake, but for whatever reason, it's become a big deal. I am simply trying to return text between two elements.
For example, suppose I have a form, and it contains
<text>
whatever I want here
maybe there are line breaks
</text>
Now, using PHP, I ought to be able to grab what is in the <text> tags and do other stuff with it.
I've tried madness like preg_match_all("/(<text>blah<\/text>)/", $form_input, $matches) and then looping through it, to no avail. Note that it won't even match on '<t'. I'm [obviously?] new to PHP, and I have read the docs for this, but regardless of my variations, nothing has worked.
thanks for the help
Something very close to what you want was discussed here: https://www.experts-exchange.com/questions/21817029/Extracting-the-text-between-two-given-tags-or-patterns.html
If I understood correctly, you want to parse something posted through your form. The posted data will then be in a variable, for example:
<?php
$_POST['message'] = "<text>
whatever I want here
maybe there are line breaks
</text>";
?>
So you can try to use Roonaan's variant:
<?php
function getTagText($textString, $tagName)
{
$tag_begin = $tagName;
$tag_end = (3 === func_num_args()) ? func_get_arg(2) : $tagName;
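$open = '<'.$tag_begin.'>';
$pos = stripos($textString, $open);
if($pos === false) return ''; // opening tag not found
$start = $pos+strlen($open); // content begins right after the opening tag
$end = stripos($textString, '</'.$tag_end.'>', $start);
if($end !== false) return substr($textString, $start, $end-$start);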
return '';
}
var_dump(getTagText($_POST['message'], 'text'));
?>
Or, if you want to extract the text from all tags, for example when someone posts data with more than one tag set:
<?php
$_POST['message'] = "<text>
whatever I want here
maybe there are line breaks
</text><text> another text
goes here</text>";
?>
You can modify my variant so that it looks like this:
<?php
function getTagText($textString, $tagName)
{
$tag_begin = $tagName;
$tag_end = (3 === func_num_args()) ? func_get_arg(2) : $tagName;
preg_match_all("/<{$tag_begin}[^>]*>(?P<text>.*)<\/{$tag_end}>/Usi", $textString, $matches, PREG_PATTERN_ORDER);
return $matches["text"];
}
var_dump(getTagText($_POST['message'], 'text'));
?>
PS: The one thing to be aware of is that it does not handle nested tags, something like "<text> some text <text>cascade goes...</text> here</text>". So keep this in mind, or modify the function so it can handle that case. It's not hard to do.
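For the nested case mentioned in the PS, here is a minimal sketch of one possible approach, not the poster's own code. It assumes plain tags without attributes, and getOuterTagText is a hypothetical name made up for this example. It walks over every opening and closing tag, tracks the nesting depth, and returns the contents of each outermost element with the inner tags left in place:

```php
<?php
// Hypothetical helper (not from the thread): returns the contents of each
// outermost <$tagName> element, leaving nested <$tagName> elements in the text.
// Assumption: tags carry no attributes.
function getOuterTagText($textString, $tagName)
{
    $tag = preg_quote($tagName, '/');
    // Collect every opening and closing tag together with its byte offset.
    preg_match_all("/<(\/?){$tag}>/i", $textString, $tags, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $results = array();
    $depth = 0;   // how many tags are currently open
    $start = 0;   // offset where the current outermost content begins
    foreach ($tags as $t) {
        $isClose = ($t[1][0] === '/');
        $offset = $t[0][1];
        $length = strlen($t[0][0]);
        if (!$isClose) {
            if ($depth === 0) {
                $start = $offset + $length; // content starts after the outer opening tag
            }
            $depth++;
        } elseif ($depth > 0) {
            $depth--;
            if ($depth === 0) {
                // Back at the top level: everything since $start is one result.
                $results[] = substr($textString, $start, $offset - $start);
            }
        }
        // A closing tag with no matching opening tag is ignored.
    }
    return $results;
}

$nested = "<text> some text <text>cascade goes...</text> here</text>";
var_dump(getOuterTagText($nested, 'text'));
```

For the sample string this yields a single element, " some text <text>cascade goes...</text> here". An opening tag that is never closed produces no result.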
use
preg_match("|<text>(.*)</text>|Uis", $input, $matches);
print_r($matches);
To make that not greedy add the ? after the .*:
preg_match("|<text>(.*?)</text>|Uis", $input, $matches);
Re BogoJoker:
The /U flag already makes the quantifiers ungreedy. The extra ? makes the search greedy again (double negative = positive).
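A small sketch to see the double negative in action. The sample string and variable names are made up for this demonstration:

```php
<?php
// Made-up input with two separate <text> elements.
$input = "<text>one</text><text>two</text>";

// /U alone: .* is ungreedy and stops at the first closing tag.
preg_match("|<text>(.*)</text>|Uis", $input, $lazy);

// /U plus ?: the two cancel out, so .*? is greedy and runs to the last closing tag.
preg_match("|<text>(.*?)</text>|Uis", $input, $greedy);

// No /U: the usual spelling, where .*? is ungreedy.
preg_match("|<text>(.*?)</text>|is", $input, $plain);

echo $lazy[1], "\n";   // one
echo $greedy[1], "\n"; // one</text><text>two
echo $plain[1], "\n";  // one
```

So it's either /U with a plain .* or no /U with .*?, but not both together.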
Oh, didn't know that, hehe thanks =)
ASKER
Thanks everyone for the help. I ended up solving this one like I needed..
Here's what I ended up with after too much work.
function ReadBetweenTags($tag,$strInput) {
global $orig;
$test = $orig;
$pattern = "|^([\w\W\r\n]*?)<".$tag.">([\w\W\r\n]*?)</".$tag.">([\w\W\r\n]*)$|Ui";
while(preg_match($pattern, $test, $result)) {
$before = $result[1];
$match = $result[2];
$after = $result[3];
$test="$before<pre class=\"text\">".HTMLOutput($match)."</pre>$after";
}
return $test;
}
I'll just divvy up the points among those who I felt helped the most.
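For anyone reading later: the replace loop above could probably also be written as a single preg_replace_callback call. This is only a sketch under assumptions. HTMLOutput() is not shown in the thread, so the version here is a hypothetical stand-in that just calls htmlspecialchars, and wrapTagContents is a made-up name. It works on the string passed in rather than on the global $orig:

```php
<?php
// Hypothetical stand-in for the asker's HTMLOutput(), which is not shown in the thread.
function HTMLOutput($s)
{
    return htmlspecialchars($s);
}

// Replaces every <$tag>...</$tag> pair with a <pre class="text"> block
// containing the escaped inner text.
function wrapTagContents($tag, $strInput)
{
    $t = preg_quote($tag, '|');
    return preg_replace_callback(
        "|<{$t}>(.*?)</{$t}>|is", // no /U here, so .*? is ungreedy
        function ($m) {
            return '<pre class="text">' . HTMLOutput($m[1]) . '</pre>';
        },
        $strInput
    );
}

echo wrapTagContents('text', "before <text>a < b</text> after");
// before <pre class="text">a &lt; b</pre> after
```

Like the regex answers above, it does not handle nested tags.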
Try
$form_input = '3453535<text>234234534542
preg_match_all("|<text>(.*
print_r ($matches); // should display the text between the <text>'s.