remove attribute tag

Hi

I am trying to strip every attribute from <P> and <DIV> tags except align=*.

Here is the code:
<?php

/// THIS WORKS
$data = '===<p er=gg a="b" align=center x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";

/// THIS WORKS
$data = '===<p er=gg a="b" align="center" x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";

/// THIS DOES NOT WORK
$data = '===<p er=gg a="b" x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";


?>
</body>
</html>
Asked by: bogmar
1 Solution
 
Umesh (MySQL Principle Technical Support Engineer) commented:
Try this..

<?php

/// THIS WORKS
$data = '===<p er=gg a="b" align=center x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";

/// THIS WORKS
$data = '===<p er=gg a="b" align="center" x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";

/// THIS DID NOT WORK BEFORE (note the * added after the align group)
$data = '===<p er=gg a="b" x="y" er=gg>===';
$data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)*[^>]*>/i", "<$1$2>", $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";


?>

Umesh (MySQL Principle Technical Support Engineer) commented:
The earlier pattern assumed that the align attribute would always be present in the <p> or <div> tag. I have added a * after ( align=[^\s>]+) so that the group becomes optional.


Hope this Helps!

bogmar (Author) commented:
Thanks but you cheated a little bit. The idea is to make it work for any $data variable.
 
hernst42 commented:
This is very difficult to do with a single regex, because if you use
/<(p|div)[^>]*( align=[^\s>]+)?[^>]*>/i
for all data, the first [^>]* eats every character up to the closing >, including the align attribute, so the optional group captures nothing. It should stop when it finds the keyword align. Such things can be done with regular expressions and conditions, but I haven't found out exactly how to use them.
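
To make the greedy-match problem concrete, here is a small sketch that is not from the thread. The helper name captured_align is made up for illustration. It runs the optional-group pattern, a lazy variant, and the original mandatory-group pattern against the first test tag and reports what the align group captured.

```php
<?php
// Hypothetical helper (not from the thread): returns what group 2
// (the align group) captured, or '' when the group did not take part.
function captured_align(string $pattern, string $subject): string
{
    if (!preg_match($pattern, $subject, $m)) {
        return '(no match)';
    }
    // preg_match omits trailing groups that did not participate in the match.
    return $m[2] ?? '';
}

$tag = '<p er=gg a="b" align=center x="y" er=gg>';

// Greedy prefix with an optional align group: [^>]* swallows the attribute.
$greedy = '/<(p|div)[^>]*( align=[^\s>]+)?[^>]*>/i';
// Lazy prefix: the optional group is tried at " er=gg", fails, and is skipped.
$lazy = '/<(p|div)[^>]*?( align=[^\s>]+)?[^>]*>/i';
// Mandatory align group: the engine has to backtrack until the group matches.
$required = '/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i';

$patterns = ['greedy' => $greedy, 'lazy' => $lazy, 'required' => $required];
foreach ($patterns as $name => $pattern) {
    printf("%-8s captured: '%s'\n", $name, captured_align($pattern, $tag));
}
```

Only the mandatory group captures " align=center". Both optional variants match the tag but leave the group empty, which is why the replacement drops the attribute.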

hernst42 commented:
An easy way would be to split this up into two replacements. One is the replacement you use at the moment. The other, which has to run first, handles p and div tags that do not contain align, something like:
/<(p|div)[^>]*>/i

bogmar (Author) commented:
let me know if you find a solution.
I will accept any working solution that will work for ANY $data variable.

bogmar (Author) commented:
I am sorry hernst42 but I don’t think that I understood correctly.
Can you please provide the code?

eeBlueShadow commented:
untested:

$data = preg_replace("/<(p|div)[^>]*?( align=(['\"])\w+\\3)[^>]*>/i", "<$1$2>", $data);

eeBlueShadow commented:
oops, I tested it and it doesn't work, lemme keep working on it.

hernst42 commented:
I thought of two replacements like:

$regex = array("/<(p|div)[^>]*>/i", "/<(p|div)([^>]*)?( align=[^\s>]+)[^>]*>/iu");
$replace = array("<$1>", "<$1$3>");

$data = preg_replace($regex, $replace, $data );
echo "<textarea cols=80 rows=5>$data</textarea>\n";

could work, but the first pattern also strips the align attribute :-( So it seems to me that you either need conditions in the regex or have to go a completely different way to remove all attributes except align.
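
One way to get such a "condition" into a single pattern is a lookahead: it scans ahead inside the tag for an align attribute and captures it without consuming anything, so the trailing [^>]* can still eat the whole tag. The following is only a sketch under a few assumptions, not a solution posted in the thread. The function name keep_only_align is made up, attribute values are assumed not to contain ">" or the text " align=", and the \b is an addition so that tags such as <pre> are left alone.

```php
<?php
// Hypothetical helper: strips every attribute from <p> and <div> tags
// except align, using one pattern with an optional capture inside a lookahead.
function keep_only_align(string $html): string
{
    $pattern = '/<(p|div)\b'
        // Lookahead that always succeeds; group 2 is set only when an
        // align attribute exists somewhere before the closing ">".
        . '(?=(?:[^>]*?(\salign=(?:"[^"]*"|\'[^\']*\'|[^\s>]+)))?)'
        // Consume the rest of the tag, whatever it contains.
        . '[^>]*>/i';

    // An unset group 2 is replaced by an empty string.
    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '<$1$2>', $html) ?? $html;
}

$samples = [
    '===<p er=gg a="b" align=center x="y" er=gg>===',
    '===<p er=gg a="b" align="center" x="y" er=gg>===',
    '===<p er=gg a="b" x="y" er=gg>===',
];
foreach ($samples as $sample) {
    echo keep_only_align($sample), "\n";
}
```

Because the optional group sits inside the lookahead, the engine searches the whole tag for align before giving up, instead of skipping the group at the first attribute.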

Umesh (MySQL Principle Technical Support Engineer) commented:
Check out this..

$data = '===<p er=gg a="b" x="y">===';

// eregi() was removed in PHP 7; stripos() does the same case-insensitive check.
if (stripos($data, "align=") !== false)
{
    $data = preg_replace("/<(p|div)[^>]*( align=[^\s>]+)[^>]*>/i", "<$1$2>", $data );
}else{

    $data = preg_replace("/<(p|div)[^>]*>/i", "<$1>", $data );
}


echo "<textarea cols=80 rows=5>$data</textarea>\n";

hernst42 commented:
OK, doing it with a single regex is nearly impossible. The following code will work, even when tags with and without align are mixed in the same string.

// Rebuild one tag from its name and raw attribute string, keeping only align.
function sanitize($id, $args) {
    if (preg_match('/(align=[^\s>]*)/i', $args, $m)) {
        return "<$id " . $m[1] . ">";
    }
    return "<$id>";
}

$data = '===<p er=gg a="b" align=center x="y" er=gg>===i ===<p er=gg a="b" x="y" er=gg>===';
// The /e modifier was removed in PHP 7. preg_replace_callback does the same job
// without eval, so the backslash un-escaping that /e required is no longer needed.
$data = preg_replace_callback("/<(p|div)([^>]*)>/i", function ($m) {
    return sanitize($m[1], $m[2]);
}, $data);
echo "<textarea cols=80 rows=5>$data</textarea>\n";
