Solved

Making PHP's preg_replace() skip code between [code] and [/code]

Posted on 2008-10-26
Last Modified: 2012-05-05
Hi, I have the following code:

$sTxt = <<<EOT
Here is the code

[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
EOT;

$aEvents = array
(
    'onActivate',
    'onClick',
    'onMouseDown',
    'onMouseEnter',
    'onMouseLeave',
    'onMouseMove',
    'onMouseOut',
    'onMouseOver',
    'onMouseUp',
);

foreach ($aEvents as $sRemove)
{
    $sTxt = preg_replace('#'.$sRemove.'=#i', 'title=', $sTxt);
}



The purpose of the code above is to neutralize malicious code and prevent injection in data entered via a textarea. However, I am using BBCode-style [code][/code] blocks, and everything inside a code block should be preserved so that it can be displayed later using htmlentities(), just as forum scripts do. How can I replace the malicious code only where it appears outside the [code][/code] blocks?


Thank you.
Question by:santocki
 
LVL 7

Expert Comment

by:zhuba
ID: 22809389
Use split() (or another regex function such as preg_split()) to separate the text into the parts inside [code]...[/code] and the parts outside, and then apply the filters only to the parts outside.
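For illustration, here is a minimal sketch of that splitting idea. It assumes preg_split() rather than split() (which was deprecated in PHP 5.3 and removed in PHP 7), and the function name filterOutsideCode() is hypothetical, not something from the thread. The replacement 'title=' and the pattern shape follow the code in the question.

```php
<?php
// Hypothetical helper: rename event-handler attributes only outside [code] blocks.
function filterOutsideCode(string $text, array $events): string
{
    // Build one case-insensitive alternation from the event names,
    // mirroring the '#'.$sRemove.'=#i' pattern used in the question.
    $quoted = array_map(function ($e) { return preg_quote($e, '#'); }, $events);
    $eventPattern = '#(?:' . implode('|', $quoted) . ')=#i';

    // The capturing group plus PREG_SPLIT_DELIM_CAPTURE keeps the [code] blocks
    // in the result: even indexes hold outside text, odd indexes hold code blocks.
    $parts = preg_split('#(\[code\].*?\[/code\])#is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Outside a code block: apply the filter.
            $parts[$i] = preg_replace($eventPattern, 'title=', $part);
        }
    }

    // Code blocks were never touched, so they can be shown later via htmlentities().
    return implode('', $parts);
}
```

Called as filterOutsideCode($sTxt, $aEvents), this should leave the onClick inside the [code] block as it is while still rewriting handlers elsewhere in the text.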
 
LVL 27

Accepted Solution

by:
ddrudik earned 500 total points
ID: 22809583
Adding this solution to my recommended solution to your previous question:
<?php
$sTxt=<<<EOL
Here is the code
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate='parent.location="mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com"'>
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate=javascript.dosomething()>
</FORM>
[/code]
<hr>
EOL;
function mystriptags($string){
  preg_match_all('~\[code\].*?\[/code\]~is',$string,$codeblocks);
  $string=preg_replace('~\[code\].*?\[/code\]~is',chr(1),$string);
  $string=strip_tags($string);
  foreach($codeblocks[0] as $codeblock){
    $string=preg_replace('/\x1/',preg_replace('/ *(?:on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
  }
  return $string;
}
echo mystriptags($sTxt);
?>

 

Author Comment

by:santocki
ID: 22809633
Dear ddrudik

This is a wonderful solution! Just one more question: I would also like to add the words "class" and "javascript" as evil codes. In this case, how can I add these two words to the filter list? It looks like you are using a combination of "on" with "Activate", "Click", etc. (the JavaScript event names). I am not that good with regular expressions yet, lol.

 

Author Comment

by:santocki
ID: 22809641
Or better, what would be an easier way to add single words? I have the complete words in an array like this:

        $aEvents = array
        (
            'onActivate',
            'onAfterPrint',
            'onBeforePrint',
            'onAfterUpdate',
            'onBeforeUpdate',
            'onErrorUpdate',
            'onAbort',
            'onBeforeDeactivate',
            'onDeactivate',
            'onBeforeCopy',
            'onBeforeCut',
            'onBeforeEditFocus',
            'onBeforePaste',
            'onBeforeUnload',
            'onBlur',
            'onBounce',
            'onChange',
            'onClick',
            'onControlSelect',
            'onCopy',
            'onCut',
            'onDblClick',
            'onDrag',
            'onDragEnter',
            'onDragLeave',
            'onDragOver',
            'onDragStart',
            'onDrop',
            'onFilterChange',
            'onDragDrop',
            'onError',
            'onFilterChange',
            'onFinish',
            'onFocus',
            'onHelp',
            'onKeyDown',
            'onKeyPress',
            'onKeyUp',
            'onLoad',
            'OnLoseCapture',
            'onMouseDown',
            'onMouseEnter',
            'onMouseLeave',
            'onMouseMove',
            'onMouseOut',
            'onMouseOver',
            'onMouseUp',
            'onMove',
            'onPaste',
            'onPropertyChange',
            'onReadyStateChange',
            'onReset',
            'onResize',
            'onResizeEnd',
            'onResizeStart',
            'onScroll',
            'onSelectStart',
            'onSelect',
            'onSelectionChange',
            'onStart',
            'onStop',
            'onSubmit',
            'onUnload',
            'class',
            'javascript'
        );


It would be easier for me to add them one by one instead of using the nested (?:...) groups, which can be quite confusing to me, lol.
 
LVL 27

Expert Comment

by:ddrudik
ID: 22809765
Here's how I constructed the pattern.  I went to my regex tester site:
http://www.myregextester.com

I took your list:
onActivate
onAfterPrint
onBeforePrint
onAfterUpdate
onBeforeUpdate
onErrorUpdate
onAbort
onBeforeDeactivate
onDeactivate
onBeforeCopy
onBeforeCut
onBeforeEditFocus
onBeforePaste
onBeforeUnload
onBlur
onBounce
onChange
onClick
onControlSelect
onCopy
onCut
onDblClick
onDrag
onDragEnter
onDragLeave
onDragOver
onDragStart
onDrop
onFilterChange
onDragDrop
onError
onFilterChange
onFinish
onFocus
onHelp
onKeyDown
onKeyPress
onKeyUp
onLoad
OnLoseCapture
onMouseDown
onMouseEnter
onMouseLeave
onMouseMove
onMouseOut
onMouseOver
onMouseUp
onMove
onPaste
onPropertyChange
onReadyStateChange
onReset
onResize
onResizeEnd
onResizeStart
onScroll
onSelectStart
onSelect
onSelectionChange
onStart
onStop
onSubmit
onUnload
class
javascript

I opened the "Tools" select list at the site, chose "Word List", pasted your list into the dialog, and clicked "Generate Pattern".

I got this pattern returned:
^(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))$

^ and $ denote the start and end of the entire string, which we don't need for your solution, so I left them off the pattern. Here is the replacement for the preg_replace() line inside the foreach loop:
    $string=preg_replace('/\x1/',preg_replace('/ *(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);

 
LVL 27

Expert Comment

by:ddrudik
ID: 22809776
The issue with using a non-optimized pattern would be that you might overmatch or undermatch based on the ordering of the alternations in the patterns.

Thanks for the question and the points.
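For anyone who would rather keep the plain word array than maintain a generated pattern, the following is a rough sketch of building the alternation in code. The helper name buildAlternation() and the shortened sample list are assumptions for illustration only. Sorting longest-first is one simple way to address the ordering concern above: on its own, /onDrag|onDragEnter/ stops at "onDrag" when given "onDragEnter", while the reversed order matches the whole word. The value-matching tail of the pattern is copied from the accepted solution.

```php
<?php
// Hypothetical helper: turn a list of attribute names into one regex alternation.
function buildAlternation(array $words): string
{
    // Drop duplicates such as the repeated 'onFilterChange'.
    $words = array_unique($words);
    // Longest first, so "onDragEnter" is tried before its prefix "onDrag".
    usort($words, function ($a, $b) { return strlen($b) - strlen($a); });
    // Escape anything that is special in a regex (and the '/' delimiter).
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    return '(?:' . implode('|', $quoted) . ')';
}

// Shortened sample list; the full $aEvents array from the thread works the same way.
$aEvents = array('onDrag', 'onDragEnter', 'onSelect', 'onSelectStart', 'class', 'javascript');
$names = buildAlternation($aEvents);

// Same value-matching tail as the accepted solution: quoted or unquoted attribute value.
$pattern = '/ *' . $names . '=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';

$html = '<input type="button" onDragEnter="evil()" class=big value="ok">';
echo preg_replace($pattern, '', $html), "\n";
// prints: <input type="button" value="ok">
```

The resulting pattern is less compact than the generated one, but new words only need to be appended to the array.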