How do I use REreplace to filter out abbreviations?

I am using Verity and need to expand the "stop words" list to filter out common terms used in company names, such as: company, incorporated, corporation, etc.

I need to filter out not only those terms but also their abbreviations, such as: com, inc, inc., corp, etc.

I initially used replaceList, which works well for full words but breaks when filtering abbreviations, because it replaces substrings anywhere in the text. For example, filtering "inc" with replaceList also changes "Lincoln" to "L oln" and "Communications" to " munications".

The solution, I believe, lies in passing the result from replaceList to a REReplace filter, but I am not sure how to write the regular expression. I want to filter each abbreviation both with and without a trailing period. Below is the code I have so far. I've shortened the list of terms I am replacing since it is quite long.
<cfset search_term = lcase(url.searchTerm)>
<cfset search_term_cleaned = replaceList(search_term, "associates,assoc,bank,companies,company,com,corp,holdings,incorporated,industries,trust,corporation"," , , , , , , , , , , , , ,")>
<cfset search_term_final = REreplace(search_term_cleaned, "REGEX here","")>

futr_vision Asked:
ddrudik Commented:
\binc\b would match "inc" only when it is bordered on each side by a \W character ([^A-Za-z0-9_]) or by the start/end of the string, so maybe that will help you.
futr_vision Author Commented:
I have a series of these abbreviations I need to filter. Is there a way to include them all in one REReplace statement?
ddrudik Commented:
\b(one|two|three)\b
or
\b(?:one|two|three)\b
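
As a rough sketch of how that alternation could be applied to the terms from the question: the search string below is hard-coded for illustration (the original reads it from url.searchTerm), the variable name stop_pattern is made up here, and the term list is only the shortened one from the question plus "inc". Note that REReplace replaces only the first match unless the scope argument is "all".

```cfm
<!--- Illustrative input; in the original this comes from url.searchTerm --->
<cfset search_term = lcase("Lincoln Communications Inc")>

<!--- \b on both sides restricts each alternative to a whole word,
      so "inc" inside "lincoln" and "com" inside "communications" are not touched --->
<cfset stop_pattern = "\b(?:associates|assoc|bank|companies|company|com|corporation|corp|holdings|incorporated|inc|industries|trust)\b">

<!--- The fourth argument "all" replaces every match, not just the first --->
<cfset search_term_final = trim(REReplace(search_term, stop_pattern, "", "all"))>

<cfoutput>#search_term_final#</cfoutput>
```

With whole-word matching in place, the earlier replaceList pass is probably unnecessary, since full words such as "company" can sit in the same alternation as the abbreviations.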
futr_vision Author Commented:
Great! And if they use a period after the abbreviation, as in "inc.", I need to escape the period using "\", correct?
ddrudik Commented:
Yes, but note that "." is in \W, so a \b placed after it only matches when a \w character follows.

Given
\binc\.\b

would match:
test inc.a

but not:
test inc.
test inc.,
ddrudik Commented:
Also, note that \binc\.\b would not match "test inc. something" given that "." and " " are both in \W.
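
One possible way around this, offered only as a sketch and not something proposed in the thread, is to keep the \b directly after the abbreviation and make the escaped period optional after it. The input string and the three-term list are illustrative.

```cfm
<!--- Illustrative input containing abbreviations with and without a period --->
<cfset search_term = lcase("Lincoln Inc. and Acme Corp, Smith Assoc.")>

<!--- \b closes the whole word first; \.? then consumes a trailing period if one is present --->
<cfset abbrev_pattern = "\b(?:inc|corp|assoc)\b\.?">

<cfset search_term_final = trim(REReplace(search_term, abbrev_pattern, "", "all"))>

<cfoutput>#search_term_final#</cfoutput>
```

Because the boundary is tested before the period rather than after it, the pattern no longer depends on what character follows the period.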
futr_vision Author Commented:
Looking at this, it is probably not necessary to account for the "." since "." will not return any results in a search. I'll go with your solution as-is. Thanks.
ddrudik Commented:
Thanks for the question and the points.