Qsorb

Replace strings found in SQL database with replacement synonyms

<cfquery name="qNews" datasource="QNEWS">
  select top 1 story_body,storyID
  from story
  where StoryID = 4886
</cfquery>

<cfquery name="MatchInfo" datasource="FOOTER">
  select *
  from words
</cfquery>
  
<cfloop query="MatchInfo">

<cfif FindNoCase("MatchInfo.root",  qNews.story_body)>
 <cfset TheStory = ReplaceNoCase(qNews.story_body, " #MatchInfo.root# ", " #qNews.story_body# ", "All")>
 <cfelse>
  <cfset TheStory = qNews.story_body>
</cfif> 
 
</cfloop>

<cfoutput>
#ParagraphFormat(TheStory)#
</cfoutput>



I'm attempting to parse a page of text stored in a SQL database and replace certain strings with other matching words.

FOOTER is a database of a couple thousand words, each with a matching synonym. The SQL 2000 table contains two columns, ROOT and S1. ROOT is the common word and S1 is the synonym for that word.

I'm attempting to parse through the text in "qNews.story_body", but the data type is TEXT and I guess one cannot search and parse through that text?

What I need help with is making certain my query and loop would be correct if I was using VARCHAR data type, assuming I will be using a varchar data type instead of the problematic TEXT data type.

Once I get that code working correctly, I'll worry about the problem with the TEXT data type.

I can always copy the text elsewhere, temp table, hard drive, etc, and work with it. But obviously, I'd rather not. I will need some ideas where or how.

I'm using SQL 2000 and we will not upgrade so please don't ask. Other than upgrading or converting the data to VARCHAR (which I cannot because story_body is all too often larger than 8000 characters), I'll need to find a way to copy, manipulate, or otherwise get this concept to work. But let's not worry about this part until I have the code snippet I included written correctly or someone tells me my code should work as is, if I was using VARCHAR.
ColdFusion Language, Microsoft SQL Server

Last comment by _agx_, 8/22/2022 (Mon)
Ryan McCauley

Though you can't do a standard REPLACE on a TEXT field, you can use PATINDEX to find instances of your words, and then use the UPDATETEXT statement to replace values (per this example):

http://blogs.x2line.com/al/archive/2008/05/03/3417.aspx

Would that meet your needs? A bit cumbersome, but if you're stuck on SQL 2000 (and so stuck with the less-than-ideal TEXT type), you don't have a lot of options.
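A rough sketch of the PATINDEX plus UPDATETEXT idea, wrapped in a cfquery so it can be run from ColdFusion. The word pair 'agree' / 'concur' and the hard-coded length 5 are illustrative assumptions, and this replaces only the first occurrence; a real version would loop until PATINDEX returns 0.

```cfml
<cfquery datasource="QNEWS">
  -- @ptr and @pos are illustrative variable names
  DECLARE @ptr binary(16), @pos int

  -- PATINDEX is 1-based, UPDATETEXT offsets are 0-based, hence the - 1
  SELECT @ptr = TEXTPTR(story_body),
         @pos = PATINDEX('%agree%', story_body) - 1
  FROM story
  WHERE StoryID = 4886

  -- @pos is -1 when the word was not found, so nothing is changed
  IF @pos >= 0
    -- delete 5 characters ('agree') at the offset and insert 'concur'
    UPDATETEXT story.story_body @ptr @pos 5 'concur'
</cfquery>
```

Note that '%agree%' also matches inside longer words such as "agreed", so this sketch has the same whole-word limitation as a plain string replace.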

I know you can't upgrade, but do you have access to a SQL 2005+ server on the network that could link to your SQL 2000 server? If you do, you could set up a linked server, pull the TEXT value in question into a VARCHAR(MAX) value, and then perform your manipulation on the newer server. Again, not sure if it's a possibility, but wanted to mention it.
SidFishes

Hmmmm... not sure what the issue is. You are setting a CF variable from a TEXT field and running the replace on the CF variable. What the DB can or can't do is irrelevant at this point. CF doesn't have an 8000-char limit, so there shouldn't be any problem.

from docs...
"Strings can be of any length, limited by the amount of available memory on the ColdFusion Server. There is, however, a 64K limit on the size of text data that can be read from and written to a ColdFusion database or HTML text area. The ColdFusion Administrator lets you increase the limit for database string transfers, but doing so can reduce server performance. To change the limit, select the Enable retrieval of long text option on the CF Settings page for the data source"

As far as I can tell, your code should work... If you want to save the data back to the source db, you'd just use an update query on the modified CF variable...
SidFishes

and just to clarify - CF has only ONE type of text variable and that's STRING. It's only when interacting with a DB that it matters...which is why we use cfqueryparam to bind datatypes
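A minimal sketch of the update query mentioned above, assuming the modified text is held in the TheStory variable and written back to the same row the question reads from. The sample value of TheStory is made up, and cf_sql_longvarchar is assumed to be the appropriate binding for a TEXT column.

```cfml
<!--- Hypothetical modified text; in practice this comes out of the replace loop --->
<cfset TheStory = "We concur. They agreed to assist.">

<cfquery datasource="QNEWS">
  update story
  set story_body = <cfqueryparam value="#TheStory#" cfsqltype="cf_sql_longvarchar">
  where StoryID = <cfqueryparam value="4886" cfsqltype="cf_sql_integer">
</cfquery>
```

For long stories, the "Enable retrieval of long text" data source setting quoted above may also need to be turned on.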
ASKER CERTIFIED SOLUTION
_agx_

Qsorb

ASKER
I tried TONS of ways but you made it look simple.
There's no way around it, I must be a SimpleTon.

> Keep in mind simple string replacements do not care about whole words.

So true!

With a word such as "agree", "agreed" might be replaced with the synonym "concur" but keep the lingering "d". It would end up looking like "concurd".
We could leave this for another question coming soon, if the \b does not work, unless you have a ready solution off the top of your head:

Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character? If I change ReplaceNoCase to Replace, then the capitalized word is missed.

First I'll experiment with your \b suggestion. It should work.
_agx_

> We could leave them for another question coming soon, if the \b does not work

It'll work as long as the synonyms don't contain special characters. If they do, you'd have to escape them. Since they're words, hopefully there won't be many. But unfortunately some common characters have special meaning, like a period "."
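If special characters do turn up, one way to escape them before building the pattern is sketched below. The variable names rawWord and safeWord are hypothetical, and the list of escaped characters is an assumption that may need adjusting for your data.

```cfml
<!--- Hypothetical word containing a regex metacharacter --->
<cfset rawWord = "3.5">

<!--- Put a backslash in front of each regex metacharacter.
      In the replacement string, \\ is an escaped backslash and \1 is the matched character. --->
<cfset safeWord = ReReplace(rawWord, "([\\\.\[\]\(\)\{\}\*\+\?\^\$\|])", "\\\1", "All")>

<!--- Sample text to run the whole-word replace against --->
<cfset TheStory = "Version 3.5 is out. Build 305 is not.">
<cfset TheStory = ReReplaceNoCase(TheStory, "\b" & safeWord & "\b", "three-point-five", "All")>

<cfoutput>#TheStory#</cfoutput>
```

Without the escaping, the period would match any character, so "305" would be replaced as well. Keep in mind that \b only matches between a word character and a non-word character, so words that begin or end with punctuation may still not match as expected.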

> Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character?

That one's beyond my regex skills. A regex guru might know a trick, or at least whether it's possible. The only thing I can think of is an ugly hack. Do two searches. One to find the whole word after a sentence marker like a period "." or "!". Then do

      ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

to find and replace the others.  But I'd only do that as a last resort. Hopefully a regex guy might know of something more elegant than that.
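A sketch of a simplified variation on the two-search hack: instead of looking for sentence markers, the first pass does a case-sensitive replace of the capitalized form of the word wherever it appears. It assumes the ROOT and S1 values are stored in lower case; the local names root, s1, capRoot and capS1 are made up for the example.

```cfml
<!--- Sample values standing in for one row of MatchInfo (assumed lower case) --->
<cfset root = "agree">
<cfset s1 = "concur">
<cfset TheStory = "Agree with me. I agree with you.">

<!--- \u in a replacement string upper-cases the character that follows it --->
<cfset capRoot = ReReplace(root, "^(.)", "\u\1")>
<cfset capS1 = ReReplace(s1, "^(.)", "\u\1")>

<!--- Pass 1: case-sensitive, so only the capitalized form is replaced --->
<cfset TheStory = ReReplace(TheStory, "\b" & capRoot & "\b", capS1, "All")>

<!--- Pass 2: case-insensitive, picks up whatever is left --->
<cfset TheStory = ReReplaceNoCase(TheStory, "\b" & root & "\b", s1, "All")>

<cfoutput>#TheStory#</cfoutput>
```

Words in all capitals still fall through to the second pass and come out in lower case, so this covers the sentence-start case only.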
Qsorb

ASKER
No special characters. It's working very well using

 ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

What a difference this will make. You saved me lots of headache. Thanks so much.
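For reference, a sketch of how the whole loop might look with that call in place. The two queries are replaced by small in-memory stand-ins built with QueryNew (the sample rows are invented) so the snippet can run without the databases. The changes from the code in the question are that TheStory is initialized once before the loop and each replacement is applied to TheStory rather than to qNews.story_body.

```cfml
<!--- Hypothetical in-memory stand-ins for the qNews and MatchInfo queries --->
<cfset qNews = QueryNew("storyID,story_body")>
<cfset QueryAddRow(qNews)>
<cfset QuerySetCell(qNews, "storyID", 4886)>
<cfset QuerySetCell(qNews, "story_body", "We agree. They agreed to help.")>

<cfset MatchInfo = QueryNew("root,s1")>
<cfset QueryAddRow(MatchInfo)>
<cfset QuerySetCell(MatchInfo, "root", "agree")>
<cfset QuerySetCell(MatchInfo, "s1", "concur")>
<cfset QueryAddRow(MatchInfo)>
<cfset QuerySetCell(MatchInfo, "root", "help")>
<cfset QuerySetCell(MatchInfo, "s1", "assist")>

<!--- Start from the original text once, outside the loop --->
<cfset TheStory = qNews.story_body>

<cfloop query="MatchInfo">
  <!--- Each pass works on the result of the previous pass --->
  <cfset TheStory = ReReplaceNoCase(TheStory, "\b" & MatchInfo.root & "\b", MatchInfo.s1, "All")>
</cfloop>

<cfoutput>
#ParagraphFormat(TheStory)#
</cfoutput>
```

With these sample rows, "agree" and "help" are replaced while "agreed" is left alone, because \b requires a word boundary on both sides of the match.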
_agx_

Great, glad I could help with this one!