Solved

Swear word filter on input

Posted on 2004-08-18
Last Modified: 2010-08-05
Hi,

I've got a really hard question for someone to solve. Upon insert of a record into a table, I am using a trigger to check for certain data. One of the things I want to do is check for swear words; if there are any, I want to delete the record.

Does anyone know of a method of setting up perhaps an array of swear words that I can check the input against? One of my concerns is checking a string against potentially 100 to 200 swear words. I think this could put a fair amount of load on the server, so any suggestions for this too would be very much appreciated.

Thanks in advance guys,

Al
Question by:higgsy
13 Comments
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 11834166
>> Does anyone know of a method of setting up perhaps an array of swear words that i can check the input against? <<

Basically just create a table to hold them.  But this will still add quite a bit of overhead, especially if you have to check fairly long entries.  You could also try full-text indexing the column(s) and using lookups on that to remove offending row(s).
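
A minimal sketch of the lookup-table idea, assuming T-SQL on SQL Server. The table variables, column names, and sample rows are hypothetical, and 'Belgium' and 'Zarquon' are only stand-ins for real banned words. It shows one set-based pass that pairs each comment with every listed word it contains, which is also where the overhead mentioned above comes from: every row is compared against every word.

```sql
-- Hypothetical names throughout; placeholder words stand in for a real list.
DECLARE @BannedWords TABLE (Word varchar(50) NOT NULL PRIMARY KEY);
INSERT INTO @BannedWords (Word) VALUES ('Belgium');
INSERT INTO @BannedWords (Word) VALUES ('Zarquon');

DECLARE @Comments TABLE (
    CommentId int NOT NULL PRIMARY KEY,
    Body      varchar(8000) NOT NULL
);
INSERT INTO @Comments (CommentId, Body) VALUES (1, 'Lovely weather today');
INSERT INTO @Comments (CommentId, Body) VALUES (2, 'Oh, Belgium!');

-- One row per (comment, word) hit; comments with no hit do not appear.
SELECT c.CommentId, w.Word
FROM @Comments AS c
JOIN @BannedWords AS w
  ON CHARINDEX(w.Word, c.Body) > 0;
```

CHARINDEX does plain substring matching here, so this is the "fuzzy" end of the spectrum; whether matching is case-sensitive depends on the collation in use.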
 
LVL 15

Accepted Solution

by:
jdlambert1 earned 500 total points
ID: 11834398
You can also expect that you'll either have to do precise matches or fuzzy matches.
If you do fuzzy matches, you can expect a lot of "good" words to get hits.
If you do precise matches, you have to decide what to do with words that have more than one meaning. And if you get into evaluating context, that's an area where roomfuls of PhDs are working on automated grammatical analysis, with the bottom line that there's no easy way to do it. Plus, your precise control list is likely to always be missing some "bad" words.

Performance aside, this is a very difficult area...
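
To make the precise-versus-fuzzy trade-off concrete, here is a small sketch in T-SQL. The table variables and sample sentences are made up for illustration, and 'hell' is just a mild example word. It compares a substring test with a crude whole-word test that pads both sides with spaces.

```sql
-- Hypothetical word list and sample text, for illustration only.
DECLARE @Words TABLE (Word varchar(50) NOT NULL PRIMARY KEY);
INSERT INTO @Words (Word) VALUES ('hell');

DECLARE @Samples TABLE (Body varchar(200) NOT NULL);
INSERT INTO @Samples (Body) VALUES ('Say hello to Michelle');
INSERT INTO @Samples (Body) VALUES ('What the hell was that');

SELECT s.Body,
       -- Fuzzy: the word appears anywhere, even inside another word.
       MAX(CASE WHEN CHARINDEX(w.Word, s.Body) > 0
                THEN 1 ELSE 0 END) AS SubstringHit,
       -- Precise (roughly): the word appears with a space on both sides.
       MAX(CASE WHEN ' ' + s.Body + ' ' LIKE '% ' + w.Word + ' %'
                THEN 1 ELSE 0 END) AS WholeWordHit
FROM @Samples AS s
CROSS JOIN @Words AS w
GROUP BY s.Body;
```

The first sample is a substring hit but not a whole-word hit, which is the false-positive problem described above. The space-padding trick is deliberately simple: it misses words followed by punctuation, and any %, _ or [ characters in the word list would be treated as LIKE wildcards.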
 
LVL 18

Expert Comment

by:SjoerdVerweij
ID: 11834408
Let us know if you need more specifics (i.e., code).
 

Author Comment

by:higgsy
ID: 11835429
code would be great guys...

Thanks

Al
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 11835554
I urge you to carefully review jdlambert1's issues and resolve those before (prematurely) worrying about code.  You should do the "analysis" and "design" of this before you worry about code.  No offense to anyone, but coding is the easiest part of it once a proper design and clear goals are determined.
 
LVL 18

Expert Comment

by:SjoerdVerweij
ID: 11835822
True. To get into specifics, which of these should go through? (Note: read * as i)

sh*t

s h * t

smashit

sh1t

etc.
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 11835863
And if you try to go phonetically, like SOUNDEX or something similar,  what about:
shiitake mushrooms?
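
One way to judge how coarse phonetic matching is before relying on it is to look at the codes directly. This is only a sketch using SQL Server's built-in SOUNDEX and DIFFERENCE functions; the word pair is a placeholder, so substitute pairs from your own list and from ordinary text.

```sql
-- Placeholder words; try a banned word against innocent look-alikes.
SELECT SOUNDEX('Belgium')               AS BannedCode,     -- four-character phonetic code
       SOUNDEX('Belgian')               AS CandidateCode,
       DIFFERENCE('Belgium', 'Belgian') AS Similarity;     -- 0 (weakest) to 4 (strongest)
```

Because SOUNDEX keeps only the first letter plus three digits, many unrelated words can share a code, which is the risk the shiitake example points at.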
 
LVL 18

Expert Comment

by:SjoerdVerweij
ID: 11836485
Not to mention you'd have to handle the leetspeak ones, like pr0n.
 
LVL 9

Expert Comment

by:dancebert
ID: 11836556
To quote (ok, paraphrase) the great George Carlin:

'You can prick your finger but you can't finger your prick.'

A baseball announcer can say 'Roberto Clemente has two balls on him.' But he can't say, 'I think he hurt his balls on that play'

 
LVL 18

Expert Comment

by:SjoerdVerweij
ID: 11836579
Not to mention regional variances.

"I am going to smoke a fag"

Nicotine-related in the UK, homicide-related in the US.
 

Author Comment

by:higgsy
ID: 11879356
Hi guys,

You've all made valid points, and I've had some time to reflect on the design. Instead of deleting the record if a swear word is found, I am simply going to send an alert to the website administrator so we can go and check it. In the past, when people have left comments etc. on the website, I have used a flag in the database called IsAuthorised, so the record can't be viewed on the website until we check the content. This is not only very time-consuming but also annoys users, as they don't get to see their post straight away.

What do you guys recommend? If you agree with my solution, does anyone have code that will search a string for a series of swear words, almost like an array?

Thanks guys

Al
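
The flag-and-review plan described above could be sketched roughly as follows. Only the IsAuthorised flag comes from the thread; the table, column, and trigger names are hypothetical, and 'Belgium' is a stand-in for a real banned word. Clean posts stay visible immediately, and only posts containing a listed word are held back for an administrator to check.

```sql
-- Hypothetical schema; only the IsAuthorised flag is taken from the thread.
CREATE TABLE dbo.BannedWords (Word varchar(50) NOT NULL PRIMARY KEY);
CREATE TABLE dbo.Comments (
    CommentId    int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Body         varchar(8000) NOT NULL,
    IsAuthorised bit NOT NULL DEFAULT 1   -- visible unless the trigger holds it back
);
GO  -- batch separator for SSMS/sqlcmd; CREATE TRIGGER must start its own batch

CREATE TRIGGER dbo.trg_Comments_FlagBannedWords
ON dbo.Comments
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Hold back only the newly inserted rows that contain a listed word.
    UPDATE c
    SET IsAuthorised = 0
    FROM dbo.Comments AS c
    JOIN inserted AS i ON i.CommentId = c.CommentId
    WHERE EXISTS (SELECT 1
                  FROM dbo.BannedWords AS w
                  WHERE CHARINDEX(w.Word, c.Body) > 0);
END;
GO

INSERT INTO dbo.BannedWords (Word) VALUES ('Belgium');
INSERT INTO dbo.Comments (Body) VALUES ('Lovely weather today');
INSERT INTO dbo.Comments (Body) VALUES ('Oh, Belgium!');

-- Rows with IsAuthorised = 0 form the administrator's review queue.
SELECT CommentId, Body, IsAuthorised FROM dbo.Comments;
```

The trigger runs inside the inserting transaction, so a held-back post is never committed as visible. Sending the alert itself (e-mail or otherwise) is left out here; polling for IsAuthorised = 0 is one simple option.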
 
LVL 18

Expert Comment

by:SjoerdVerweij
ID: 11885046
Most blog comments now resort to some type of authorization mechanism.

To do a rough and ready:

create table NaughtyWords(Word VarChar(50))

insert into NaughtyWords(Word) Values('Belgium')
... etc ...

Then a @value would be ok if

Not Exists(Select * From NaughtyWords
  Where CharIndex(Word, @Value) > 0)

To deal with B`e`l`g`i`u`m etc., replace @Value in the above with Replace(@Value, '`', ''). You can string these together for other characters. Note that this does increase the likelihood of false positives.
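
Stringing the replacements together might look like the sketch below. The particular characters stripped ('`', '.', '-') are only an assumed starting set, and a table variable stands in for the NaughtyWords table so the batch runs on its own.

```sql
DECLARE @Value varchar(200);
SET @Value = 'B`e`l`g`i`u`m is lovely';

-- Strip a few separator characters before the lookup; extend the nesting as needed.
DECLARE @Cleaned varchar(200);
SET @Cleaned = REPLACE(REPLACE(REPLACE(@Value, '`', ''), '.', ''), '-', '');

-- Table variable standing in for the NaughtyWords table.
DECLARE @NaughtyWords TABLE (Word varchar(50) NOT NULL PRIMARY KEY);
INSERT INTO @NaughtyWords (Word) VALUES ('Belgium');

SELECT CASE WHEN EXISTS (SELECT 1
                         FROM @NaughtyWords
                         WHERE CHARINDEX(Word, @Cleaned) > 0)
            THEN 'flag for review'
            ELSE 'ok'
       END AS Verdict;
```

With the sample value the cleaned string contains the listed word, so the verdict is 'flag for review'. Stripping spaces or mapping digits to letters (for example '1' to 'i') can be added the same way, but each extra rule widens the net: removing spaces would make "smash it" look like a hit.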
 
LVL 9

Expert Comment

by:dancebert
ID: 11885350
1. Make a list of words that are unacceptable, no matter what the context. For example:  S*it, F*ck, C*unt, C*cksucker, m*therf*cker.  This will be a short list unless you're willing to include languages other than English.

2. Make a list of phrases that are unacceptable, no matter what the context.  These phrases will include words that have multiple meanings so they can't be included in the first category.  For example, Tit is a bird species, and so is the Great Tit.  "Great Tits" is a common phrase among bird watchers in certain parts of the world. However, 'Suck my t*ts' obviously has no redeeming social value.

3. Monitor what gets through and add new phrases to #2 as needed.

4. Tell the people who object to the things that get through to get a life.