Qsorb asked:

Replace strings found in SQL database with replacement synonyms

<cfquery name="qNews" datasource="QNEWS">
  select top 1 story_body,storyID
  from story
  where StoryID = 4886
</cfquery>

<cfquery name="MatchInfo" datasource="FOOTER">
  select *
  from words
</cfquery>

<cfloop query="MatchInfo">
  <cfif FindNoCase("MatchInfo.root",  qNews.story_body)>
    <cfset TheStory = ReplaceNoCase(qNews.story_body, " #MatchInfo.root# ", " #qNews.story_body# ", "All")>
    <cfset TheStory = qNews.story_body>
  </cfif>
</cfloop>



I'm attempting to parse a page of text stored in a SQL database and replace certain strings with other matching words.

FOOTER is a database of a couple thousand words, each with a matching synonym. The SQL 2000 table contains two columns, ROOT and S1. ROOT is the common word and S1 is the synonym for that word.

I'm attempting to parse through the text in "qNews.story_body", but the data type is TEXT and I guess one cannot search and parse through that text?

What I need help with is making certain my query and loop would be correct if I was using VARCHAR data type, assuming I will be using a varchar data type instead of the problematic TEXT data type.

Once I get that code working correctly, I'll worry about the problem with the TEXT data type.

I can always copy the text elsewhere, temp table, hard drive, etc, and work with it. But obviously, I'd rather not. I will need some ideas where or how.

I'm using SQL 2000 and we will not upgrade so please don't ask. Other than upgrading or converting the data to VARCHAR (which I cannot because story_body is all too often larger than 8000 characters), I'll need to find a way to copy, manipulate, or otherwise get this concept to work. But let's not worry about this part until I have the code snippet I included written correctly or someone tells me my code should work as is, if I was using VARCHAR.
Ryan McCauley

Though you can't do a standard REPLACE on a TEXT field, you can use PATINDEX to find instances of your words, and then use the UPDATETEXT statement to replace values (per this example):
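
A minimal sketch of that PATINDEX/UPDATETEXT approach, wrapped in a cfquery so it matches the rest of the thread. It assumes the story table and StoryID from the question, uses "agree" and "concur" as stand-in words, and only replaces the first occurrence; replacing every occurrence would need a WHILE loop around the same two steps. It also matches substrings, not whole words.

```cfml
<!--- Sketch only: replaces the FIRST "agree" in a TEXT column with "concur".
      The words and the hard-coded StoryID are illustrative assumptions. --->
<cfquery datasource="QNEWS">
  DECLARE @ptr binary(16), @pos int

  -- TEXTPTR gives the pointer UPDATETEXT needs.
  -- PATINDEX is 1-based and returns 0 when nothing is found;
  -- UPDATETEXT offsets are 0-based, hence the "- 1".
  SELECT @ptr = TEXTPTR(story_body),
         @pos = PATINDEX('%agree%', story_body) - 1
  FROM story
  WHERE StoryID = 4886

  -- Delete 5 characters ("agree") at @pos and insert "concur" there
  IF @pos >= 0
    UPDATETEXT story.story_body @ptr @pos 5 'concur'
</cfquery>
```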

Would that meet your needs? A bit cumbersome, but if you're stuck on SQL 2000 (and so stuck with the less-than-ideal TEXT type), you don't have a lot of options.

I know you can't upgrade, but do you have access to a SQL 2005+ server on the network that could link to your SQL 2000 server? If you do, you could set up a linked server, pull the TEXT value in question into a VARCHAR(MAX) value, and then perform your manipulation on the newer server. Again, not sure if it's a possibility, but wanted to mention it.

Hmmmm... not sure what the issue is. You are setting a CF variable from a TEXT field and running the replace -on the CF variable-. What the DB can or can't do is irrelevant at this point. CF doesn't have an 8000 char limit, so there shouldn't be any problem.

from docs...
"Strings can be of any length, limited by the amount of available memory on the ColdFusion Server. There is, however, a 64K limit on the size of text data that can be read from and written to a ColdFusion database or HTML text area. The ColdFusion Administrator lets you increase the limit for database string transfers, but doing so can reduce server performance. To change the limit, select the Enable retrieval of long text option on the CF Settings page for the data source"

As far as I can tell, your code should work... If you want to save the data back to the source db, you'd just use an update query on the modified CF variable...
and just to clarify - CF has only ONE type of text variable and that's STRING. It's only when interacting with a DB that it matters...which is why we use cfqueryparam to bind datatypes
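
A rough sketch of how that could look end to end, assuming the ROOT and S1 columns described in the question and that long text retrieval is enabled for the datasource. Compared with the snippet in the question, it copies story_body into TheStory once before the loop, and it uses MatchInfo.s1 (not the story itself) as the replacement. The cf_sql_longvarchar type for the TEXT column is an assumption about the schema.

```cfml
<!--- Sketch only: table and column names are taken from the question --->
<cfquery name="qNews" datasource="QNEWS">
  select top 1 story_body, storyID
  from story
  where StoryID = <cfqueryparam value="4886" cfsqltype="cf_sql_integer">
</cfquery>

<cfquery name="MatchInfo" datasource="FOOTER">
  select root, s1
  from words
</cfquery>

<!--- Copy the TEXT value into a plain CF string once, before looping --->
<cfset TheStory = qNews.story_body>

<!--- Apply every root/synonym pair to the same working string --->
<cfloop query="MatchInfo">
  <cfset TheStory = ReplaceNoCase(TheStory, " #MatchInfo.root# ", " #MatchInfo.s1# ", "all")>
</cfloop>

<!--- Write the modified string back, binding it as long text --->
<cfquery datasource="QNEWS">
  update story
  set story_body = <cfqueryparam value="#TheStory#" cfsqltype="cf_sql_longvarchar">
  where StoryID = <cfqueryparam value="#qNews.storyID#" cfsqltype="cf_sql_integer">
</cfquery>
```

The space-padded match only catches words that have a space on both sides, so a word followed by a comma or period is skipped.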
_agx_

This solution is only available to members.
Qsorb


I tried TONS of ways but you made it look simple.
There's no way around it, I must be a SimpleTon.

> Keep in mind simple string replacements do not care about whole words.

So true!

With a word such as "agree", the word "agreed" might be replaced with the synonym "concur" but keep the lingering "d". It would end up looking like "concurd".
We could leave them for another question coming soon, if the \b does not work, unless you have a ready solution off the top of your head:

Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character? If I change ReplaceNoCase to Replace, then the capitalized word is missed.

First I'll experiment with your \b suggestion. It should work.

> We could leave them for another question coming soon, if the \b does not work

It'll work as long as the synonyms don't contain special characters. If they do, you'd have to escape them. Since they're words, hopefully there won't be many. But unfortunately some common characters have special meaning, like a period "."
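
If escaping does become necessary, one possible sketch is to backslash-escape the regex metacharacters in each word before it goes into the pattern. The variable name safeRoot is hypothetical, and the character list is an assumption about which characters matter; TheStory and MatchInfo are the names used elsewhere in this thread.

```cfml
<!--- Sketch only: put a backslash in front of each regex metacharacter.
      In the replacement string, "\\" is a literal backslash and "\1" is
      the matched character. --->
<cfset safeRoot = REReplace(MatchInfo.root, "([\\\[\]\^\$\.\|\?\*\+\(\)\{\}])", "\\\1", "all")>

<!--- Build the whole-word pattern from the escaped word --->
<cfset TheStory = REReplaceNoCase(TheStory, "\b" & safeRoot & "\b", MatchInfo.s1, "all")>
```

Note that \b only matches between a word character and a non-word character, so a word that begins or ends with punctuation may still not match as expected.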

> Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character?

That one's beyond my regex skills. A regex guru might know a trick, or at least whether it's possible. The only thing I can think of is an ugly hack. Do two searches. One to find the whole word after a sentence marker like a period "." or "!". Then do

      ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

to find and replace the others.  But I'd only do that as a last resort. Hopefully a regex guy might know of something more elegant than that.
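
For what it's worth, the two-pass hack described above might look something like this. It is only a sketch: capS1 is a hypothetical helper variable, the sentence markers are limited to ".", "!" and "?", and it assumes the words contain no regex special characters and that each synonym is at least one character long.

```cfml
<!--- Capitalized form of the synonym, e.g. "concur" becomes "Concur" --->
<cfset capS1 = UCase(Left(MatchInfo.s1, 1)) & Mid(MatchInfo.s1, 2, Len(MatchInfo.s1))>

<!--- Pass 1: the word at the very start of the text, or right after a
      sentence marker plus whitespace. \1 puts the marker and whitespace back. --->
<cfset TheStory = REReplaceNoCase(TheStory, "(^|[.!?]\s+)" & MatchInfo.root & "\b", "\1" & capS1, "all")>

<!--- Pass 2: every remaining whole-word occurrence gets the plain synonym --->
<cfset TheStory = REReplaceNoCase(TheStory, "\b" & MatchInfo.root & "\b", MatchInfo.s1, "all")>
```

Both lines would sit inside the existing loop over MatchInfo. Capitalized words in the middle of a sentence, such as names or headings, are still lowercased by pass 2.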
Qsorb


No special characters. It's working very well using

 ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

What a difference this will make. You saved me lots of headache. Thanks so much.

Great, glad I could help with this one!