Using PHP's preg_replace() without replacing code between [code] and [/code]

Hi, I have the following code:

$sTxt = <<<EOT
Here is the code

[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
EOT;

$aEvents = array
(
    'onActivate',
    'onClick',
    'onMouseDown',
    'onMouseEnter',
    'onMouseLeave',
    'onMouseMove',
    'onMouseOut',
    'onMouseOver',
    'onMouseUp',
);

foreach ($aEvents as $sRemove)
{
    $sTxt = preg_replace('#'.$sRemove.'=#i', 'title=', $sTxt);
}



The purpose of the code above is to neutralize malicious code and prevent injection in data entered via a textarea. However, I am using BBCode-style [code][/code] blocks, and everything inside a code block should be preserved so it can be displayed later using htmlentities(), just like forum scripts do. How can I replace the malicious code only where it is outside the [code][/code] blocks?


Thank you.
Asked by santocki
 
ddrudik Commented:
Adding this solution to my recommended solution to your previous question:
<?php
$sTxt=<<<EOL
Here is the code
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate='parent.location="mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com"'>
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate=javascript.dosomething()>
</FORM>
[/code]
<hr>
EOL;
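function mystriptags($string){
  preg_match_all('~\[code\].*?\[/code\]~is',$string,$codeblocks); // collect every [code]...[/code] block
  $string=preg_replace('~\[code\].*?\[/code\]~is',chr(1),$string); // swap each block for a chr(1) placeholder
  $string=strip_tags($string); // strip HTML tags from the text outside the blocks
  foreach($codeblocks[0] as $codeblock){ // put each block back, minus the listed attributes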
    $string=preg_replace('/\x1/',preg_replace('/ *(?:on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
  }
  return $string;
}
echo mystriptags($sTxt);
?>
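
One caveat with putting the blocks back through preg_replace(): its replacement argument treats sequences such as $1 or \1 as backreferences, so a [code] block that happens to contain them could be altered on the way back in. Below is a minimal sketch of a restore step that avoids this, assuming PHP 5.3+ for the closure. The name restore_blocks() is hypothetical and not part of the solution above.

```php
<?php
// Hypothetical helper: put saved blocks back in place of chr(1) placeholders.
function restore_blocks($string, array $blocks)
{
    $i = 0;
    return preg_replace_callback('/\x01/', function ($m) use (&$i, $blocks) {
        // The returned text is inserted literally; "$1" and "\1" are not expanded.
        return isset($blocks[$i]) ? $blocks[$i++] : $m[0];
    }, $string);
}

// Usage: the block keeps its "$1" and "\1" text intact.
$blocks = array('[code]price is $1 or \\1[/code]');
echo restore_blocks('before ' . chr(1) . ' after', $blocks), "\n";
```

Each placeholder takes the next saved block in order, which mirrors the limit of 1 used in the loop above.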

 
zhuba Commented:
Use split() (or some other regex function) to separate the text into the areas inside [code]...[/code] and the areas outside them, and then apply the filters only to the sections outside.
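
A minimal sketch of that idea, using preg_split() because split() was deprecated in PHP 5.3 and removed in PHP 7. The function name filter_outside_code() is hypothetical, and the \b and \s* in the pattern are small assumptions added on top of the original 'onClick=' style match.

```php
<?php
// Hypothetical helper: rename event-handler attributes outside [code] blocks only.
function filter_outside_code($text, array $events)
{
    // Escape each word for use in a regex, then join the words into one alternation.
    $alternation = implode('|', array_map(function ($w) {
        return preg_quote($w, '#');
    }, $events));

    // PREG_SPLIT_DELIM_CAPTURE also returns the captured [code]...[/code] blocks,
    // so the pieces alternate: outside text, block, outside text, block, ...
    $parts = preg_split('#(\[code\].*?\[/code\])#is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) { // even index: text outside any [code] block
            $parts[$i] = preg_replace('#\b(?:' . $alternation . ')\s*=#i', 'title=', $part);
        }
    }
    return implode('', $parts);
}

// Usage: only the attributes outside the block are renamed.
$sample = '<a onClick="x()">a</a> [code]<a onClick="y()">b</a>[/code] <b onmouseover=z()>c</b>';
echo filter_outside_code($sample, array('onClick', 'onMouseOver')), "\n";
```

A [code] tag with no closing [/code] is not matched by the split pattern, so that text is treated as outside and filtered.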
 
santocki (Author) Commented:
Dear ddrudik

This is a wonderful solution! Just one more question: I would also like to treat the words "class" and "javascript" as malicious code. In that case, how can I add these two words to the filter list? It looks like you are using a combination of "on" and "Activate", "Click", etc., the JavaScript event names. I am not that good with regular expressions yet, lol.
 
santocki (Author) Commented:
Or better, what would be an easier way to add single words? I have the complete words in an array like this:

        $aEvents = array
        (
            'onActivate',
            'onAfterPrint',
            'onBeforePrint',
            'onAfterUpdate',
            'onBeforeUpdate',
            'onErrorUpdate',
            'onAbort',
            'onBeforeDeactivate',
            'onDeactivate',
            'onBeforeCopy',
            'onBeforeCut',
            'onBeforeEditFocus',
            'onBeforePaste',
            'onBeforeUnload',
            'onBlur',
            'onBounce',
            'onChange',
            'onClick',
            'onControlSelect',
            'onCopy',
            'onCut',
            'onDblClick',
            'onDrag',
            'onDragEnter',
            'onDragLeave',
            'onDragOver',
            'onDragStart',
            'onDrop',
            'onFilterChange',
            'onDragDrop',
            'onError',
            'onFilterChange',
            'onFinish',
            'onFocus',
            'onHelp',
            'onKeyDown',
            'onKeyPress',
            'onKeyUp',
            'onLoad',
            'OnLoseCapture',
            'onMouseDown',
            'onMouseEnter',
            'onMouseLeave',
            'onMouseMove',
            'onMouseOut',
            'onMouseOver',
            'onMouseUp',
            'onMove',
            'onPaste',
            'onPropertyChange',
            'onReadyStateChange',
            'onReset',
            'onResize',
            'onResizeEnd',
            'onResizeStart',
            'onScroll',
            'onSelectStart',
            'onSelect',
            'onSelectionChange',
            'onStart',
            'onStop',
            'onSubmit',
            'onUnload',
            'class',
            'javascript'
        );


It would be easier for me to add them one by one instead of using the ?: groups, which can be quite confusing to me, lol.
 
ddrudik Commented:
Here's how I constructed the pattern.  I went to my regex tester site:
http://www.myregextester.com

I took your list:
onActivate
onAfterPrint
onBeforePrint
onAfterUpdate
onBeforeUpdate
onErrorUpdate
onAbort
onBeforeDeactivate
onDeactivate
onBeforeCopy
onBeforeCut
onBeforeEditFocus
onBeforePaste
onBeforeUnload
onBlur
onBounce
onChange
onClick
onControlSelect
onCopy
onCut
onDblClick
onDrag
onDragEnter
onDragLeave
onDragOver
onDragStart
onDrop
onFilterChange
onDragDrop
onError
onFilterChange
onFinish
onFocus
onHelp
onKeyDown
onKeyPress
onKeyUp
onLoad
OnLoseCapture
onMouseDown
onMouseEnter
onMouseLeave
onMouseMove
onMouseOut
onMouseOver
onMouseUp
onMove
onPaste
onPropertyChange
onReadyStateChange
onReset
onResize
onResizeEnd
onResizeStart
onScroll
onSelectStart
onSelect
onSelectionChange
onStart
onStop
onSubmit
onUnload
class
javascript

I clicked on the "Tools" select list at the site, chose "Word List", pasted your list into the dialog, and clicked "Generate Pattern".

I got this pattern returned:
^(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))$

^ and $ anchor the pattern to the start and end of the entire string, which we don't need for your solution, so I left them off the pattern in the new line 29 (the preg_replace() call inside the foreach loop of mystriptags()):
    $string=preg_replace('/\x1/',preg_replace('/ *(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
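
If maintaining the words one by one in an array is preferred, the alternation can also be generated at runtime instead of being pasted in. This is only a sketch under that assumption: build_attr_pattern() is a hypothetical name, the closures need PHP 5.3+, and the result is a plain alternation, not the prefix-optimized form the site's tool produces.

```php
<?php
// Hypothetical helper: build the attribute-stripping pattern from a plain word list.
function build_attr_pattern(array $words)
{
    $words = array_unique($words); // the list above contains 'onFilterChange' twice
    // Longest first, so a shorter word is never tried before a longer one it prefixes.
    usort($words, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $alt = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words));
    // Same tail as the pattern above: a quoted value, or an unquoted run that ends
    // at whitespace or ">".
    return '/ *(?:' . $alt . ')=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';
}

// Usage: strip the listed attributes from a tag.
$pattern = build_attr_pattern(array('onDrag', 'onDragStart', 'class', 'javascript'));
echo preg_replace($pattern, '', '<p class="x" onDragStart=go() id="k">hi</p>'), "\n";
```

New words then only need to be appended to the array passed in.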

 
ddrudik Commented:
The issue with using a non-optimized pattern is that you might overmatch or undermatch depending on the order of the alternatives in the pattern.
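
A small illustration of the ordering effect, as a sketch only. The helper first_match() is hypothetical, and the words are taken from the list above. Note that when the pattern requires "=" right after the group, as the solution does, the engine backtracks to the longer word, so the effect shows up mainly when nothing follows the alternation.

```php
<?php
// Hypothetical helper: return the text matched by $pattern in $subject, or null.
function first_match($pattern, $subject)
{
    return preg_match($pattern, $subject, $m) ? $m[0] : null;
}

// Shorter word listed first: the engine stops at the first alternative that matches.
echo first_match('/onDrag|onDragStart/', 'onDragStart=x'), "\n";     // onDrag

// Longer word listed first: the full name is matched.
echo first_match('/onDragStart|onDrag/', 'onDragStart=x'), "\n";     // onDragStart

// "=" required after the group: the engine backtracks, so the order stops mattering here.
echo first_match('/(?:onDrag|onDragStart)=/', 'onDragStart=x'), "\n"; // onDragStart=
```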

Thanks for the question and the points.