Solved

PHP's function preg_replace() to skip replacing codes between [code] and [/code]

Posted on 2008-10-26
Last Modified: 2012-05-05
Hi, I have the following code:

$sTxt = <<<EOL
Here is the code

[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
EOL;

$aEvents = array
(
    'onActivate',
    'onClick',
    'onMouseDown',
    'onMouseEnter',
    'onMouseLeave',
    'onMouseMove',
    'onMouseOut',
    'onMouseOver',
    'onMouseUp',
);

foreach ($aEvents as $sRemove)
{
    $sTxt = preg_replace('#'.$sRemove.'=#i', 'title=', $sTxt);
}



The purpose of the code above is to neutralize dangerous attributes and so prevent script injection in data entered via a textarea. However, I am using BBCode-style [code][/code] tags, and everything inside a code block should be preserved so that it can be displayed later using htmlentities(), just as forum scripts do. How can I replace the dangerous attributes only outside the [code][/code] blocks?


Thank you.
Question by:santocki

Expert Comment

by:zhuba
ID: 22809389
Use split() (or some other regex function) to separate the text into the parts inside [code]...[/code] and the parts outside, and then apply the filters only to the parts outside.
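A minimal sketch of this splitting idea, assuming preg_split() in place of split() (split() was deprecated in PHP 5.3 and removed in PHP 7.0). The function name filterOutsideCode() and the shortened event list are hypothetical and only for illustration. With PREG_SPLIT_DELIM_CAPTURE, the text outside the blocks lands at even indexes and the [code] blocks at odd indexes, so the filter can be limited to the even ones:

```php
<?php
// Hypothetical helper: applies the asker's "replace with title=" filter
// only to text outside [code]...[/code] blocks.
function filterOutsideCode(string $text, array $events): string
{
    // The capturing group makes preg_split() return the delimiters too:
    // even indexes hold outside text, odd indexes hold whole [code] blocks.
    $parts = preg_split('~(\[code\].*?\[/code\])~is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Build one alternation from the word list, escaping each word.
    $words = array_map(function ($w) { return preg_quote($w, '~'); }, $events);
    $pattern = '~(?:' . implode('|', $words) . ')=~i';

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) { // outside a code block
            $parts[$i] = preg_replace($pattern, 'title=', $part);
        }
    }
    return implode('', $parts);
}

$sTxt = 'A <b onClick="x()">bold</b> word [code]<b onClick="x()">kept</b>[/code] end';
echo filterOutsideCode($sTxt, array('onActivate', 'onClick', 'onMouseOver'));
// prints: A <b title="x()">bold</b> word [code]<b onClick="x()">kept</b>[/code] end
```

Empty pieces are kept on purpose (no PREG_SPLIT_NO_EMPTY), so the even/odd rule still holds when the text starts with a [code] block or two blocks are adjacent.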

Accepted Solution

by:
ddrudik earned 500 total points
ID: 22809583
Adding this solution to my recommended solution to your previous question:
<?php
$sTxt=<<<EOL
Here is the code
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onClick="parent.location='mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com'">
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate='parent.location="mailto:webmaster@domain.com?subject=Email from Domain.com&cc=webmaster@domain.com"'>
</FORM>
[/code]
<hr>
[code]
<FORM>
<INPUT TYPE="button" VALUE="Click Here to Email Webmaster" onActivate=javascript.dosomething()>
</FORM>
[/code]
<hr>
EOL;
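function mystriptags($string){
  preg_match_all('~\[code\].*?\[/code\]~is',$string,$codeblocks); // collect every [code]...[/code] block
  $string=preg_replace('~\[code\].*?\[/code\]~is',chr(1),$string); // swap each block for a \x01 placeholder
  $string=strip_tags($string); // strip HTML tags from the text outside the blocks
  foreach($codeblocks[0] as $codeblock){ // restore the blocks in order, minus their event attributes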
    $string=preg_replace('/\x1/',preg_replace('/ *(?:on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
  }
  return $string;
}
echo mystriptags($sTxt);
?>
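One caveat with the code above: each code block is passed as the replacement argument of the outer preg_replace(), where sequences such as $1 or \1 are read as backreferences, so a block containing them could come back altered. A hedged sketch of one way around this, assuming preg_replace_callback() (whose callback return value is inserted literally). The name mystriptagsCallback() is hypothetical, and the event list is the short one from the answer:

```php
<?php
// Hypothetical variant of mystriptags(): same steps, but the blocks are
// restored through a callback, so "$1" or "\1" inside a block stays as typed.
function mystriptagsCallback(string $string): string
{
    $blockPattern = '~\[code\].*?\[/code\]~is';
    // Short event list taken from the answer above.
    $attrPattern = '/ *on(?:Activate|Click|Mouse(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up))'
                 . '=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';

    // Collect the blocks, then swap each one for a \x01 placeholder.
    preg_match_all($blockPattern, $string, $m);
    $blocks = $m[0];
    $string = preg_replace($blockPattern, chr(1), $string);

    // Strip HTML tags from everything outside the blocks.
    $string = strip_tags($string);

    // Put the blocks back in order, with their event attributes removed.
    $i = 0;
    return preg_replace_callback('/\x01/', function ($match) use ($blocks, $attrPattern, &$i) {
        $block = $blocks[$i] ?? ''; // a stray \x01 in the input yields an empty string
        $i++;
        return preg_replace($attrPattern, '', $block);
    }, $string);
}

echo mystriptagsCallback('<i>Price</i> [code]<a onClick="go()">$1 \1</a>[/code]');
// prints: Price [code]<a>$1 \1</a>[/code]
```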

 

Author Comment

by:santocki
ID: 22809633
Dear ddrudik

This is a wonderful solution! Just one more question: I would also like to add the words "class" and "javascript" as evil codes. In this case, how can I add these two words to the filter list? It looks like you are using a combination of "on" with "Activate", "Click", etc., the JavaScript event words. I am not that good with regular expressions yet, lol.

 

Author Comment

by:santocki
ID: 22809641
Or better, what would be an easier way to add single words? I have the complete words in an array like this:

        $aEvents = array
        (
            'onActivate',
            'onAfterPrint',
            'onBeforePrint',
            'onAfterUpdate',
            'onBeforeUpdate',
            'onErrorUpdate',
            'onAbort',
            'onBeforeDeactivate',
            'onDeactivate',
            'onBeforeCopy',
            'onBeforeCut',
            'onBeforeEditFocus',
            'onBeforePaste',
            'onBeforeUnload',
            'onBlur',
            'onBounce',
            'onChange',
            'onClick',
            'onControlSelect',
            'onCopy',
            'onCut',
            'onDblClick',
            'onDrag',
            'onDragEnter',
            'onDragLeave',
            'onDragOver',
            'onDragStart',
            'onDrop',
            'onFilterChange',
            'onDragDrop',
            'onError',
            'onFilterChange',
            'onFinish',
            'onFocus',
            'onHelp',
            'onKeyDown',
            'onKeyPress',
            'onKeyUp',
            'onLoad',
            'OnLoseCapture',
            'onMouseDown',
            'onMouseEnter',
            'onMouseLeave',
            'onMouseMove',
            'onMouseOut',
            'onMouseOver',
            'onMouseUp',
            'onMove',
            'onPaste',
            'onPropertyChange',
            'onReadyStateChange',
            'onReset',
            'onResize',
            'onResizeEnd',
            'onResizeStart',
            'onScroll',
            'onSelectStart',
            'onSelect',
            'onSelectionChange',
            'onStart',
            'onStop',
            'onSubmit',
            'onUnload',
            'class',
            'javascript'
        );


It would be easier for me to add words one by one instead of using the ?: conditions, which are quite confusing to me, lol.

Expert Comment

by:ddrudik
ID: 22809765
Here's how I constructed the pattern.  I went to my regex tester site:
http://www.myregextester.com

I took your list:
onActivate
onAfterPrint
onBeforePrint
onAfterUpdate
onBeforeUpdate
onErrorUpdate
onAbort
onBeforeDeactivate
onDeactivate
onBeforeCopy
onBeforeCut
onBeforeEditFocus
onBeforePaste
onBeforeUnload
onBlur
onBounce
onChange
onClick
onControlSelect
onCopy
onCut
onDblClick
onDrag
onDragEnter
onDragLeave
onDragOver
onDragStart
onDrop
onFilterChange
onDragDrop
onError
onFilterChange
onFinish
onFocus
onHelp
onKeyDown
onKeyPress
onKeyUp
onLoad
OnLoseCapture
onMouseDown
onMouseEnter
onMouseLeave
onMouseMove
onMouseOut
onMouseOver
onMouseUp
onMove
onPaste
onPropertyChange
onReadyStateChange
onReset
onResize
onResizeEnd
onResizeStart
onScroll
onSelectStart
onSelect
onSelectionChange
onStart
onStop
onSubmit
onUnload
class
javascript

I clicked on the "Tools" select list at the site, chose "Word List", pasted your list into the dialog, and clicked "Generate Pattern".

I got this pattern returned:
^(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))$

^ and $ are used to denote the start and end of the entire string, something we don't need for your solution, so I left those off of the pattern in the new line 29:
    $string=preg_replace('/\x1/',preg_replace('/ *(?:OnLoseCapture|class|javascript|on(?:A(?:bort|ctivate|fter(?:Print|Update))|B(?:efore(?:C(?:opy|ut)|Deactivate|EditFocus|P(?:aste|rint)|U(?:nload|pdate))|lur|ounce)|C(?:hange|lick|o(?:ntrolSelect|py)|ut)|D(?:blClick|eactivate|r(?:ag(?:Drop|Enter|Leave|Over|Start)?|op))|Error(?:Update)?|F(?:i(?:lterChange|nish)|ocus)|Help|Key(?:Down|Press|Up)|Load|Mo(?:use(?:Down|Enter|Leave|Move|O(?:ut|ver)|Up)|ve)|P(?:ast|ropertyChang)e|Re(?:adyStateChange|s(?:et|ize(?:End|Start)?))|S(?:croll|elect(?:Start|ionChange)?|t(?:art|op)|ubmit)|Unload))=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is','',$codeblock),$string,1);
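For the request to add words one at a time, a hedged sketch of building the alternation straight from the array, assuming implode() and preg_quote(). The function name buildAttrPattern() is hypothetical. The result is not prefix-factored like the generated pattern, so it may be slower on long lists, but new words can simply be appended to the array:

```php
<?php
// Hypothetical helper: turns a plain word list into the attribute-removal
// pattern used in mystriptags(), so words can be added one by one.
function buildAttrPattern(array $words): string
{
    $words = array_unique($words); // the list contains duplicates
    // Longest words first, so 'onSelectStart' is tried before 'onSelect'.
    usort($words, function ($a, $b) { return strlen($b) - strlen($a); });
    // Escape each word so it is matched literally inside the pattern.
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    return '/ *(?:' . implode('|', $quoted) . ')=(?:(["\'])(?:(?!\1|>).)+\1|(?:(?!>)\S)+)/is';
}

$aEvents = array('onClick', 'onSelect', 'onSelectStart', 'class', 'javascript');
$pattern = buildAttrPattern($aEvents);
echo preg_replace($pattern, '', '<p class="x" onSelectStart=\'f()\' id="keep">');
// prints: <p id="keep">
```

Note that every word in the list must be followed by "=" to match, so the entry "javascript" only catches an attribute named javascript; it does not catch a javascript: URL inside another attribute.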


Expert Comment

by:ddrudik
ID: 22809776
The issue with using a non-optimized pattern is that you might overmatch or undermatch, depending on the order of the alternatives in the pattern.
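To illustrate the ordering point: PCRE tries alternatives from left to right and keeps the first one that lets the overall match succeed. A small sketch, with word lists chosen only for illustration, of when a shorter word listed first undermatches and when backtracking rescues it:

```php
<?php
// Nothing follows the alternation here, so the first alternative that
// matches at a position is accepted.
$subject = 'onSelectStart';

preg_match('/onSelect|onSelectStart/', $subject, $shortFirst);
preg_match('/onSelectStart|onSelect/', $subject, $longFirst);

echo $shortFirst[0], "\n"; // onSelect (undermatch: "Start" is left behind)
echo $longFirst[0], "\n";  // onSelectStart

// With a required '=' after the group, the engine backtracks into the
// longer alternative even when the shorter one is listed first.
preg_match('/(?:onSelect|onSelectStart)=/', 'onSelectStart=', $anchored);
echo $anchored[0], "\n";   // onSelectStart=
```

Because the pattern in this thread always requires "=" after the word, the ordering risk mainly applies if the word list is reused in a pattern without such a trailing anchor.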

Thanks for the question and the points.