hankknight
asked on
PHP/REGEX: Automatically close HTML elements
Using PHP, how can I automatically close all HTML tags that need to be closed?
<pre><?php
$html = '
<div>
<div>
<p>
<strong>Hello
</p>
</div>
';
$html = closeHTML($html);
echo htmlentities($html);
function closeHTML($html) {
    // Close all open HTML tags that need to be closed
    return $html;
}
?></pre>
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Regular expressions are NOT going to be a good tool to use for this scenario = )
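For what it's worth, one regex-free route is to let an HTML parser repair the markup. Below is a minimal sketch, assuming PHP's DOM extension (libxml) is enabled; it is a separate extension from Tidy. The function name `closeWithDom` is made up for illustration, and where exactly the missing closing tags end up is decided by libxml, not by this code.

```php
<?php
// Hypothetical helper: parse the broken fragment into a DOM tree and
// serialize it back, so every element comes out with a closing tag.
function closeWithDom(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings about the malformed markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so the <body> element can be found again afterwards.
    // Assumption: the input is plain ASCII; other input may need an encoding declaration.
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // nothing was parsed, so hand the input back untouched
    }

    // Serialize only the children of <body>, i.e. the repaired fragment.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo htmlentities(closeWithDom("<div>\n<div>\n<p>\n<strong>Hello\n</p>\n</div>\n"));
```

The parser may also normalize whitespace or move elements around, so this only fits if a "reasonable" repair is acceptable rather than a byte-for-byte copy of the input.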
Oh Ray_Paseur, you're such a kidder = )
@kaufmed +1 :-D
ASKER
OK, I managed to get something working, but only partly. If there is a single unclosed tag, my code fixes it; if there are six unclosed tags, it only fixes the first one.
I understand that the code it creates may not be valid; however, I cannot use the Tidy extension for this. So even if it closes a <strong> tag in the wrong place, that is fine.
<pre><?php
$html = '
<div id="abc">
<div>
<p>
<strong>Hello
</p>
</div>
';
echo htmlentities($html);
echo '<hr />';
echo htmlentities(closeTags($html));
function closeTags($html) {
    // Collect the names of opening tags, skipping void elements and self-closed tags
    preg_match_all('#<(?!(?:meta|img|br|hr|input)\b)([a-z][a-z0-9]*)(?: .*)?(?<![/ ])>#iU', $html, $result);
    $openedtags = $result[1];

    // Collect the names of closing tags
    preg_match_all('#</([a-z][a-z0-9]*)>#iU', $html, $result);
    $closedtags = $result[1];

    // Walk the opening tags from innermost to outermost
    $openedtags = array_reverse($openedtags);
    foreach ($openedtags as $tag) {
        $key = array_search($tag, $closedtags);
        if ($key === false) {
            // No closing tag left for this opener: append one at the end
            $html .= '</'.$tag.'>';
        } else {
            // Pair this opener with one closing tag and consume it
            unset($closedtags[$key]);
        }
    }
    return $html;
}
?></pre>
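A note on the approach above: counting opened and closed tags cannot tell where a tag should have been closed, so every missing closer ends up at the very end of the string. A possible alternative, sketched here with a made-up function name `closeTagsWithStack`, is to walk the tags in order and keep a stack of open elements. It still tokenizes with a regex, so it is an approximation and not a real HTML parser: attribute values containing `>`, `<script>` bodies and the like are not handled, and dropping stray closing tags is just one possible choice.

```php
<?php
// Hypothetical helper: close unclosed tags using a stack of open elements.
function closeTagsWithStack(string $html): string
{
    // Void elements never get a closing tag.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'];

    // Split into tags and text; the captured tags stay in the result.
    $parts = preg_split('#(<[^>]*>)#', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $stack = [];
    $out = '';

    foreach ($parts as $part) {
        if (preg_match('#^<([a-z][a-z0-9]*)\b[^>]*>$#i', $part, $m)) {
            // Opening tag: remember it unless it is void or self-closed.
            $name = strtolower($m[1]);
            if (!in_array($name, $void, true) && substr($part, -2) !== '/>') {
                $stack[] = $name;
            }
            $out .= $part;
        } elseif (preg_match('#^</([a-z][a-z0-9]*)\s*>$#i', $part, $m)) {
            $name = strtolower($m[1]);
            if (!in_array($name, $stack, true)) {
                continue; // stray closing tag with no matching opener: drop it
            }
            // Close everything opened after the matching element first.
            while (($top = array_pop($stack)) !== $name) {
                $out .= '</' . $top . '>';
            }
            $out .= $part;
        } else {
            $out .= $part; // text, comments, doctype: pass through unchanged
        }
    }

    // Whatever is still open at the end gets closed, innermost first.
    while ($stack) {
        $out .= '</' . array_pop($stack) . '>';
    }
    return $out;
}

echo htmlentities(closeTagsWithStack('<div><p><strong>Hello</p>'));
// prints the escaped form of: <div><p><strong>Hello</strong></p></div>
```

With the stack, the missing `</strong>` lands right before `</p>` instead of at the end of the document.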
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Perhaps I should have said, "here's why using regex alone for this is a bad idea..." = )
ASKER
Thank you all for your insights. Would it be a better idea to REMOVE the inner-most offending tags which are not closed?