Regular expressions and word filtering in ColdFusion - really need some help!!
Posted on 2006-05-11
Hi! I am trying to create a language filter that uses regular expressions to detect whether any word in a paragraph is on our list of adult words, and if so, flag the paragraph. I've gotten most of it done, but I am struggling with one part and was really hoping someone might be able to help.
What I am trying to do is check for weird characters (up to 2 between each letter) that might be separating an "adult" word and be able to determine that it is in fact a word in the adult list and therefore flag the paragraph as adult.
For example, say one of the adult words is the word "adult", and a person typed it in as A**D**U^^L%%T - I am trying to write a regular expression that can test for any special characters within the word and see whether, without them, the word matches one on the adult list.
Here is what I have so far. It checks for spaces between letters to see if the word exists in the string, and it checks for basic variations of the word (-ing, -ed, -er, -r). I just need help with the other regular expression to complete it. I'm terrible with regular expressions, so any help or any improvements on what I've got so far would be welcome as well!
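<cffunction name="cleanString" returnType="string" output="false">
    <cfargument name="string" type="string" required="true">
    <cfargument name="badwords" type="string" required="false" default="adult">
    <cfset var word = "">
    <cfset var y = "">
    <cfset var newword = "">
    <cfset var count = 0>
    <cfloop index="word" list="#arguments.badwords#">
        <!--- build a pattern that allows any amount of whitespace after each letter --->
        <cfset newword = "">
        <cfloop index="y" from="1" to="#len(word)#">
            <cfset newword = newword & mid(word, y, 1) & "\s*">
        </cfloop>
        <!--- flag the word itself or a basic variation (-ing, -ed, -er, -r) --->
        <cfif (reFindNoCase("\b#newword#\b", arguments.string)) or (reFindNoCase("#newword#(ing|ed|er|r|\b)", arguments.string))>
            <cfset count = 1>
        </cfif>
    </cfloop>
    <!--- assumption: the end of the function is not shown in the post; returning the flag is a guess --->
    <cfreturn count>
</cffunction>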
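For the part I'm stuck on, this is roughly the shape I imagine the missing check taking. It is only a sketch: the function name containsObfuscatedWord, its arguments, and the choice of "[^a-zA-Z0-9]{0,2}" as the separator are my own assumptions, and it assumes the words on the list are plain letters with no regex metacharacters in them.

```cfml
<!--- Hypothetical helper: name and arguments are illustrative only --->
<cffunction name="containsObfuscatedWord" returnType="boolean" output="false">
    <cfargument name="text" type="string" required="true">
    <cfargument name="badwords" type="string" required="false" default="adult">
    <cfset var word = "">
    <cfset var i = 0>
    <cfset var pattern = "">
    <!--- assumption: a "weird character" is anything that is not a letter or digit; allow up to two --->
    <cfset var separator = "[^a-zA-Z0-9]{0,2}">
    <cfloop index="word" list="#arguments.badwords#">
        <cfset pattern = "">
        <cfloop index="i" from="1" to="#len(word)#">
            <cfset pattern = pattern & mid(word, i, 1)>
            <!--- put the separator between letters only, not after the last one --->
            <cfif i lt len(word)>
                <cfset pattern = pattern & separator>
            </cfif>
        </cfloop>
        <!--- reFindNoCase returns 0 when there is no match --->
        <cfif reFindNoCase(pattern, arguments.text)>
            <cfreturn true>
        </cfif>
    </cfloop>
    <cfreturn false>
</cffunction>
```

Called as containsObfuscatedWord("A**D**U^^L%%T"), the pattern built for "adult" should match, since each pair of letters is separated by at most two non-alphanumeric characters. Because a space is also non-alphanumeric, this would cover up to two spaces between letters as well, though not the unlimited whitespace that "\s*" allows. It also has no word boundaries, so it would match inside longer words.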
Of course, if anyone has another solution that would work better, I am totally open to suggestions! Also, if any part of this doesn't make sense, just let me know and I'll try to explain it more clearly! Thanks ahead of time for your help!