Solved

Making PHP's preg_replace() skip code between [code] and [/code]

Posted on 2008-10-26
Last Modified: 2012-05-05
Hi, I have the following code:

$sTxt = <<<EOL
Here is the code

[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
EOL;

$aEvents = array
(
    'onActivate',
    'onClick',
    'onMouseDown',
    'onMouseEnter',
    'onMouseLeave',
    'onMouseMove',
    'onMouseOut',
    'onMouseOver',
    'onMouseUp',
);

foreach ($aEvents as $sRemove)
{
    $sTxt = preg_replace('#'.$sRemove.'=#i', 'title=', $sTxt);
}



The purpose of the code above is to neutralize dangerous markup and prevent script injection in data entered via a textarea. However, I am using BBCode-style [code][/code] tags, and everything inside a code block should be preserved so it can be displayed later using htmlentities(), just as forum scripts do. How can I replace the dangerous attributes only outside the [code][/code] blocks?


Thank you.
Question by:santocki
LVL 7

Expert Comment

by:zhuba
ID: 22809389
Use split() (or some other regex function) to separate the text into the sections inside [code]...[/code] and the sections outside them, then apply the filters only to the sections outside.
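A minimal sketch of this split-and-filter idea, assuming PHP 7 or later. Since split() was removed in PHP 7.0, the sketch uses preg_split() instead. The function name filterOutsideCode() is hypothetical and does not appear anywhere in this thread; the replacement with 'title=' follows the original question.

```php
<?php
// Hypothetical helper (not from the thread): rewrite event-handler attributes
// to "title=" everywhere except inside [code]...[/code] blocks.
function filterOutsideCode(string $text, array $events): string
{
    // Build one alternation from the event names, e.g. onClick|onMouseOver.
    $names = implode('|', array_map(function ($e) {
        return preg_quote($e, '#');
    }, $events));

    // PREG_SPLIT_DELIM_CAPTURE keeps the captured [code] blocks in the result,
    // so even indexes hold outside text and odd indexes hold code blocks.
    $parts = preg_split('#(\[code\].*?\[/code\])#is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Outside a code block: apply the filter from the question.
            $parts[$i] = preg_replace('#(?:' . $names . ')=#i', 'title=', $part);
        }
    }

    // Code blocks were never touched, so they come back byte for byte.
    return implode('', $parts);
}
```

Calling filterOutsideCode($sTxt, $aEvents) with the variables from the question would leave the [code] block intact while rewriting any matching attribute in the surrounding text.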
 
LVL 27

Accepted Solution

by:
ddrudik earned 500 total points
ID: 22809583
This builds on the solution I recommended for your previous question:
<?php
$sTxt=<<<EOL
Here is the code
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate='parent.location="mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com"'>
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate=javascript.dosomething()>
</FORM>
[/code]
<hr>
EOL;
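function mystriptags($string){
  preg_match_all('~\[code\].*?\[/code\]~is',$string,$codeblocks); // collect every [code]...[/code] block
  $string=preg_replace('~\[code\].*?\[/code\]~is',chr(1),$string); // swap each block for a \x01 placeholder
  $string=strip_tags($string); // strip HTML tags from the text outside the blocks
  foreach($codeblocks[0] as $codeblock){ // restore the blocks in order, minus their event-handler attributes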
    $string=preg_replace('/\x1/',preg_replace('/ *(?:on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
  }
  return $string;
}
echo mystriptags($sTxt);
?>

 

Author Comment

by:santocki
ID: 22809633
Dear ddrudik

This is a wonderful solution! One more question: I would also like to treat the words "class" and "javascript" as dangerous. How can I add these two words to the filter list? It looks like you are combining "on" with "Activate", "Click", and the other JavaScript event names. I am not that good with regular expressions yet, lol.
 

Author Comment

by:santocki
ID: 22809641
Or better, is there an easier way to add single words? I have the complete words in an array like this:

        $aEvents = array
        (
            'onActivate',
            'onAfterPrint',
            'onBeforePrint',
            'onAfterUpdate',
            'onBeforeUpdate',
            'onErrorUpdate',
            'onAbort',
            'onBeforeDeactivate',
            'onDeactivate',
            'onBeforeCopy',
            'onBeforeCut',
            'onBeforeEditFocus',
            'onBeforePaste',
            'onBeforeUnload',
            'onBlur',
            'onBounce',
            'onChange',
            'onClick',
            'onControlSelect',
            'onCopy',
            'onCut',
            'onDblClick',
            'onDrag',
            'onDragEnter',
            'onDragLeave',
            'onDragOver',
            'onDragStart',
            'onDrop',
            'onFilterChange',
            'onDragDrop',
            'onError',
            'onFilterChange',
            'onFinish',
            'onFocus',
            'onHelp',
            'onKeyDown',
            'onKeyPress',
            'onKeyUp',
            'onLoad',
            'OnLoseCapture',
            'onMouseDown',
            'onMouseEnter',
            'onMouseLeave',
            'onMouseMove',
            'onMouseOut',
            'onMouseOver',
            'onMouseUp',
            'onMove',
            'onPaste',
            'onPropertyChange',
            'onReadyStateChange',
            'onReset',
            'onResize',
            'onResizeEnd',
            'onResizeStart',
            'onScroll',
            'onSelectStart',
            'onSelect',
            'onSelectionChange',
            'onStart',
            'onStop',
            'onSubmit',
            'onUnload',
            'class',
            'javascript'
        );


It would be easier for me to add words one by one instead of using the (?: ) groups, which are quite confusing to me, lol.
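One way to keep a flat, one-word-per-entry list is to build the alternation at run time. The following is only a sketch, assuming PHP 7 or later: buildAttributePattern() is a hypothetical name, and the value-matching tail of the pattern is copied from the accepted solution above. Names are sorted longest first so that, for example, "onSelectStart" is tried before "onSelect".

```php
<?php
// Hypothetical helper (not from the thread): turn a flat list of attribute
// names into a single pattern that matches name=value.
function buildAttributePattern(array $names): string
{
    // Drop duplicates such as the repeated 'onFilterChange'.
    $names = array_unique($names);

    // Longest names first, so a longer name is tried before its own prefix.
    usort($names, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each name in case it contains regex metacharacters.
    $alternation = implode('|', array_map(function ($n) {
        return preg_quote($n, '/');
    }, $names));

    // Same tail as the accepted solution: a quoted value, or an unquoted
    // run of characters that stops at whitespace or ">".
    return '/ *(?:' . $alternation . ')=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';
}
```

The result could then be used in place of the hand-written pattern, for example preg_replace(buildAttributePattern($aEvents), '', $codeblock).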
 
LVL 27

Expert Comment

by:ddrudik
ID: 22809765
Here's how I constructed the pattern.  I went to my regex tester site:
http://www.myregextester.com

I took your list:
onActivate
onAfterPrint
onBeforePrint
onAfterUpdate
onBeforeUpdate
onErrorUpdate
onAbort
onBeforeDeactivate
onDeactivate
onBeforeCopy
onBeforeCut
onBeforeEditFocus
onBeforePaste
onBeforeUnload
onBlur
onBounce
onChange
onClick
onControlSelect
onCopy
onCut
onDblClick
onDrag
onDragEnter
onDragLeave
onDragOver
onDragStart
onDrop
onFilterChange
onDragDrop
onError
onFilterChange
onFinish
onFocus
onHelp
onKeyDown
onKeyPress
onKeyUp
onLoad
OnLoseCapture
onMouseDown
onMouseEnter
onMouseLeave
onMouseMove
onMouseOut
onMouseOver
onMouseUp
onMove
onPaste
onPropertyChange
onReadyStateChange
onReset
onResize
onResizeEnd
onResizeStart
onScroll
onSelectStart
onSelect
onSelectionChange
onStart
onStop
onSubmit
onUnload
class
javascript

I opened the "Tools" select list on the site, chose "Word List", pasted your list into the dialog, and clicked "Generate Pattern".

I got this pattern returned:
^(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))$

^ and $ denote the start and end of the entire string, which we don't need for your solution, so I left them off the pattern in the new line 29 (the preg_replace() call inside the foreach loop):
    $string=preg_replace('/\x1/',preg_replace('/ *(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);

 
LVL 27

Expert Comment

by:ddrudik
ID: 22809776
The risk with a hand-built, non-optimized pattern is that it may overmatch or undermatch depending on the order of the alternatives in the alternation.
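A small illustration of the ordering effect, using two names from the list. This is a sketch rather than part of the solution; the variable names are made up for the example. PCRE tries alternatives left to right, so a shorter name listed first can win over a longer one unless something after the alternation (such as the "=" in the pattern above) forces the engine to backtrack.

```php
<?php
// Shorter name listed first: the match stops at "onDrag" (undermatch).
preg_match('/onDrag|onDragEnter/', 'onDragEnter', $shortFirst);

// Longer name listed first: the whole name is matched.
preg_match('/onDragEnter|onDrag/', 'onDragEnter', $longFirst);

// A trailing "=" makes the engine backtrack into the longer alternative,
// so here the order no longer changes the result.
preg_match('/(?:onDrag|onDragEnter)=/', 'onDragEnter=', $withSuffix);

echo $shortFirst[0], "\n"; // onDrag
echo $longFirst[0], "\n";  // onDragEnter
echo $withSuffix[0], "\n"; // onDragEnter=
```

In other words, ordering matters most when the alternation is used on its own; when every name must be followed by "=", a flat list is less likely to undermatch.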

Thanks for the question and the points.