Solved

PHP's function preg_replace() to skip replacing codes between [code] and [/code]

Posted on 2008-10-26
Last Modified: 2012-05-05
Hi, I have the following code:

$sTxt = <<<EOL
Here is the code

[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
EOL;

$aEvents = array
(
    'onActivate',
    'onClick',
    'onMouseDown',
    'onMouseEnter',
    'onMouseLeave',
    'onMouseMove',
    'onMouseOut',
    'onMouseOver',
    'onMouseUp',
);

foreach ($aEvents as $sRemove)
{
    $sTxt = preg_replace('#'.$sRemove.'=#i', 'title=', $sTxt);
}



The purpose of the code above is to neutralize evil code and prevent injection in data entered via a textarea. But I am using bbcode-style [code][/code] tags, and all the code within a code block should be preserved so it can be displayed later using htmlentities(), just like forum scripts do. How can I substitute only the evil code that is outside the [code][/code] blocks?


Thank you.
Question by:santocki
6 Comments
 
LVL 7

Expert Comment

by:zhuba
ID: 22809389
Use split() (or some other regex function) to separate the text into the areas inside [code]...[/code] and the areas outside them, and then apply the filters only to the corresponding sections.
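The splitting idea above can be sketched as follows. This is only an illustration under a few assumptions: it uses preg_split() rather than split() (split() was removed in PHP 7), the helper name filter_outside_code() is hypothetical, and the replacement simply reuses the 'title=' substitution from the question.

```php
<?php
// Hypothetical helper (not from the thread): rewrite event attributes
// only in the text that lies outside [code]...[/code] blocks.
function filter_outside_code(string $text, array $events): string
{
    // Quote each name so it is matched literally, then join into one alternation.
    $names = implode('|', array_map(function ($e) {
        return preg_quote($e, '#');
    }, $events));

    // PREG_SPLIT_DELIM_CAPTURE keeps the captured code blocks in the result,
    // so even indexes hold outside text and odd indexes hold code blocks.
    $parts = preg_split('#(\[code\].*?\[/code\])#is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Same substitution as in the question, applied to outside text only.
            $parts[$i] = preg_replace('#(?:' . $names . ')=#i', 'title=', $part);
        }
    }

    return implode('', $parts);
}
```

Calling filter_outside_code($sTxt, $aEvents) with the variables from the question should leave the sample [code] block untouched, since the only onClick= in that text sits inside the block.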
 
LVL 27

Accepted Solution

by:
ddrudik earned 500 total points
ID: 22809583
Adding this solution to my recommended solution to your previous question:
<?php
$sTxt=<<<EOL
Here is the code
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate='parent.location="mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com"'>
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate=javascript.dosomething()>
</FORM>
[/code]
<hr>
EOL;
function mystriptags($string){
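  preg_match_all('~\[code\].*?\[/code\]~is',$string,$codeblocks); // collect every [code]...[/code] block
  $string=preg_replace('~\[code\].*?\[/code\]~is',chr(1),$string); // swap each block for a chr(1) placeholder
  $string=strip_tags($string); // strip HTML tags from the text outside the blocks
  foreach($codeblocks[0] as $codeblock){ // restore the blocks in order, one placeholder at a time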
    $string=preg_replace('/\x1/',preg_replace('/ *(?:on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
  }
  return $string;
}
echo mystriptags($sTxt);
?>
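One caveat with the restore step: preg_replace() treats sequences such as $1 or \1 in its replacement argument as backreferences, so a code block that happens to contain them may come back altered. A possible variant, sketched below, restores the blocks with preg_replace_callback(), whose return value is inserted literally. The function name mystriptags_cb() and the $attrPattern parameter are hypothetical, and the sketch assumes the input never contains chr(1) itself.

```php
<?php
// Hypothetical variant of mystriptags(); $attrPattern is the attribute-removal
// regex, passed in so the list of names can change without editing the function.
function mystriptags_cb(string $string, string $attrPattern): string
{
    $blockPattern = '~\[code\].*?\[/code\]~is';

    // Remember the code blocks, replace each with a chr(1) placeholder,
    // then strip tags from everything that is left.
    preg_match_all($blockPattern, $string, $m);
    $blocks = $m[0];
    $string = strip_tags(preg_replace($blockPattern, chr(1), $string));

    // The callback result is used as-is, so "$1" or "\1" inside a code
    // block is not interpreted as a backreference.
    return preg_replace_callback('/\x01/', function () use (&$blocks, $attrPattern) {
        return preg_replace($attrPattern, '', (string) array_shift($blocks));
    }, $string);
}
```

The pattern from the answer can be passed unchanged as the second argument.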

 

Author Comment

by:santocki
ID: 22809633
Dear ddrudik

This is a wonderful solution! Just another question: I would also like to add the words "class" and "javascript" as evil codes. In this case, how can I add these two words to the filter list? It looks like you are using a combination of "on" and "Activate", "Click", etc., the JavaScript event names. I am not that good with regular expressions yet, lol.
 

Author Comment

by:santocki
ID: 22809641
Or better, what would be an easier way to add single words? I have the complete words in an array like this:

        $aEvents = array
        (
            'onActivate',
            'onAfterPrint',
            'onBeforePrint',
            'onAfterUpdate',
            'onBeforeUpdate',
            'onErrorUpdate',
            'onAbort',
            'onBeforeDeactivate',
            'onDeactivate',
            'onBeforeCopy',
            'onBeforeCut',
            'onBeforeEditFocus',
            'onBeforePaste',
            'onBeforeUnload',
            'onBlur',
            'onBounce',
            'onChange',
            'onClick',
            'onControlSelect',
            'onCopy',
            'onCut',
            'onDblClick',
            'onDrag',
            'onDragEnter',
            'onDragLeave',
            'onDragOver',
            'onDragStart',
            'onDrop',
            'onFilterChange',
            'onDragDrop',
            'onError',
            'onFilterChange',
            'onFinish',
            'onFocus',
            'onHelp',
            'onKeyDown',
            'onKeyPress',
            'onKeyUp',
            'onLoad',
            'OnLoseCapture',
            'onMouseDown',
            'onMouseEnter',
            'onMouseLeave',
            'onMouseMove',
            'onMouseOut',
            'onMouseOver',
            'onMouseUp',
            'onMove',
            'onPaste',
            'onPropertyChange',
            'onReadyStateChange',
            'onReset',
            'onResize',
            'onResizeEnd',
            'onResizeStart',
            'onScroll',
            'onSelectStart',
            'onSelect',
            'onSelectionChange',
            'onStart',
            'onStop',
            'onSubmit',
            'onUnload',
            'class',
            'javascript'
        );


It would be easier for me to add words one by one instead of using the (?: ... ) groups, which can be quite confusing to me, lol.
 
LVL 27

Expert Comment

by:ddrudik
ID: 22809765
Here's how I constructed the pattern.  I went to my regex tester site:
http://www.myregextester.com

I took your list:
onActivate
onAfterPrint
onBeforePrint
onAfterUpdate
onBeforeUpdate
onErrorUpdate
onAbort
onBeforeDeactivate
onDeactivate
onBeforeCopy
onBeforeCut
onBeforeEditFocus
onBeforePaste
onBeforeUnload
onBlur
onBounce
onChange
onClick
onControlSelect
onCopy
onCut
onDblClick
onDrag
onDragEnter
onDragLeave
onDragOver
onDragStart
onDrop
onFilterChange
onDragDrop
onError
onFilterChange
onFinish
onFocus
onHelp
onKeyDown
onKeyPress
onKeyUp
onLoad
OnLoseCapture
onMouseDown
onMouseEnter
onMouseLeave
onMouseMove
onMouseOut
onMouseOver
onMouseUp
onMove
onPaste
onPropertyChange
onReadyStateChange
onReset
onResize
onResizeEnd
onResizeStart
onScroll
onSelectStart
onSelect
onSelectionChange
onStart
onStop
onSubmit
onUnload
class
javascript

I clicked on the "Tools" select list at the site, chose "Word List", pasted your list into the dialog, and clicked "Generate Pattern".

I got this pattern returned:
^(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))$

^ and $ denote the start and end of the entire string, which we don't need for your solution, so I left them off the pattern in the new line 29 (the preg_replace() call inside the foreach loop):
    $string=preg_replace('/\x1/',preg_replace('/ *(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
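As a small illustration of why the anchors are dropped, the sketch below compares an anchored and an unanchored pattern. The two-name alternation is a shortened stand-in for the generated pattern, and the variable names are made up for the example.

```php
<?php
// Shortened stand-ins for the generated pattern, with and without anchors.
$anchored   = '/^(?:onClick|onLoad)$/i';
$unanchored = '/(?:onClick|onLoad)/i';
$subject    = '<INPUT onClick="go()">';

var_dump(preg_match($anchored, 'onClick'));   // int(1): the whole string is the name
var_dump(preg_match($anchored, $subject));    // int(0): there is other text around the name
var_dump(preg_match($unanchored, $subject));  // int(1): the name is found inside the tag
```

The anchored form only matches when the subject is nothing but the word itself, which suits a word-list tester but not a search inside a larger block of HTML.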

 
LVL 27

Expert Comment

by:ddrudik
ID: 22809776
The issue with using a non-optimized pattern would be that you might overmatch or undermatch based on the ordering of the alternations in the patterns.
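For a word list that is kept as a PHP array, one possible alternative to pasting it into a pattern generator is to build the alternation in code. The sketch below is an assumption-laden illustration, not the method used above: build_attr_pattern() is a hypothetical helper, each name is passed through preg_quote(), and the names are sorted longest first as a precaution so that a short name such as onDrag is never tried before onDragEnter. The value-matching tail is copied from the pattern in the accepted solution.

```php
<?php
// Hypothetical helper: build the attribute-removal pattern from a plain array.
function build_attr_pattern(array $names): string
{
    // Drop duplicates (the list above contains onFilterChange twice).
    $names = array_unique($names);

    // Longest names first, so "onDragEnter" is tried before "onDrag".
    usort($names, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Quote each name so it is matched literally, then join with "|".
    $alternation = implode('|', array_map(function ($n) {
        return preg_quote($n, '/');
    }, $names));

    // Same tail as the pattern in the answer: a quoted or an unquoted value.
    return '/ *(?:' . $alternation . ')=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';
}

$pattern = build_attr_pattern(array('onClick', 'onDrag', 'onDragEnter', 'class'));
echo preg_replace($pattern, '', '<INPUT TYPE="button" onClick="go()" class=big>');
// prints: <INPUT TYPE="button">
```

A generated pattern like this is longer and less optimized than the hand-built one, but new words can be added by appending to the array.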

Thanks for the question and the points.