Solved

Replace strings found in SQL database with replacement synonyms

Posted on 2012-12-26
8
525 Views
Last Modified: 2012-12-27
<cfquery name="qNews" datasource="QNEWS">
  select top 1 story_body,storyID
  from story
  where StoryID = 4886
</cfquery>

<cfquery name="MatchInfo" datasource="FOOTER">
  select *
  from words
</cfquery>
  
<cfloop query="MatchInfo">

<cfif FindNoCase("MatchInfo.root",  qNews.story_body)>
 <cfset TheStory = ReplaceNoCase(qNews.story_body, " #MatchInfo.root# ", " #qNews.story_body# ", "All")>
 <cfelse>
  <cfset TheStory = qNews.story_body>
</cfif> 
 
</cfloop>

<cfoutput>
#ParagraphFormat(TheStory)#
</cfoutput>

Open in new window


I'm attempting to parse a page of text stored in an sql database and replace certain strings with other matching words.

FOOTER is a database of a couple thousand words, each with a matching synonym. The sql 2000 database contains two rows, ROOT, and S1. ROOT is the common word and S1 is the synonym for that word.

I'm attempting to parse through the text in "qNews.story_body" but the data type is TEXT and I guess one cannot search and parse though that text?

What I need help with is making certain my query and loop would be correct if I was using VARCHAR data type, assuming I will be using a varchar data type instead of the problematic TEXT data type.

Once I get that code working correctly, I'll worry about the problem with the TEXT data type.

I can always copy the text elsewhere, temp table, hard drive, etc, and work with it. But obviously, I'd rather not. I will need some ideas where or how.

I'm using SQL 2000 and we will not upgrade so please don't ask. Other than upgrading or converting the data to VARCHAR (which I cannot because story_body is all too often larger than 8000 characters), I'll need to find a way to copy, manipulate, or otherwise get this concept to work. But let's not worry about this part until I have the code snippet I included written correctly or someone tells me my code should work as is, if I was using VARCHAR.
0
Comment
Question by:Qsorb
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 28

Expert Comment

by:Ryan McCauley
ID: 38724347
Though you can't do a standard REPLACE on a TEXT field, you can use PATINDEX to find instances of your words, and then use the UPDATETEXT function to replace values (the per this example):

http://blogs.x2line.com/al/archive/2008/05/03/3417.aspx

Would that meet your needs? I bit cumbersome, but if you're stuck on SQL 2000 (and so stuck with the less-than-ideal TEXT type), you don't have a lot of options.

I know you can't upgrade, but do you have access to a SQL 2005+ server on the network that could link to your SQL 2000 server? If you do, you could set up a linked server, pull the TEXT value in question into a VARCHAR(MAX) value, and then perform your manipulation on the newer server. Again, not sure if it's a possibility, but wanted to mention it.
0
 
LVL 36

Expert Comment

by:SidFishes
ID: 38724878
hmmmm.... not sure what the issue is. You are setting a cf variable from a TEXT field and running the replace -on the CF variable- What the DB can or can't do is irrelevant at this point. CF doesn't have an 8000 char limit so there shouldn't be no problem.

from docs...
"Strings can be of any length, limited by the amount of available memory on the ColdFusion Server. There is, however, a 64K limit on the size of text data that can be read from and written to a ColdFusion database or HTML text area. The ColdFusion Administrator lets you increase the limit for database string transfers, but doing so can reduce server performance. To change the limit, select the Enable retrieval of long text option on the CF Settings page for the data source"

As far as I can tell, your code should work... If you want to save the data back to the source db, you'd just use an update query on the modified CF variable...
0
 
LVL 36

Expert Comment

by:SidFishes
ID: 38724915
and just to clarify - CF has only ONE type of text variable and that's STRING. It's only when interacting with a DB that it matters...which is why we use cfqueryparam to bind datatypes
0
Business Impact of IT Communications

What are the business impacts of how well businesses communicate during an IT incident? Targeting, speed, and transparency all matter. Find out more in this infographic.

 
LVL 52

Accepted Solution

by:
_agx_ earned 500 total points
ID: 38725528
The sql 2000 database contains two rows, ROOT, and S1. ROOT is the common word and S1 is the synonym for that word.

You mean two columns, right?  Since you're doing multiple replacements, you need to perform replace on #TheStory# not #qNews.story_body#. Otherwise, the previous replace results are discarding each time you loop.  Here's a working example:

<!--- initialize variable _before_ the loop --->
<cfset TheStory = qNews.story_body>
<cfloop query="MatchInfo">
      <cfif FindNoCase(MatchInfo.root, TheStory)>
                 <!--- replace "root" value with "s1" value --->
             <cfset TheStory = ReplaceNoCase(TheStory, MatchInfo.root, MatchInfo.s1, "All")>
      </cfif>
</cfloop>

<cfoutput>
     #ParagraphFormat(TheStory)#
</cfoutput>

Keep in mind simple string replacements do not care about whole words. So it can match more than you want it to. A replace of "fuse" will also pick up words that contain those letters like "confused" and "defused". If you only want to replace whole words you'll need a regex. I'm *not* great with those... but IF your synonyms don't contain any special characters you could use "\b" (whole word boundary). This *should*  replace whole words only:

     ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")
0
 

Author Closing Comment

by:Qsorb
ID: 38725808
I tied TONS of ways but you made it look simple.
There's no way around it, I must be a SimpleTon.

> Keep in mind simple string replacements do not care about whole words.

So true!

Using a word such a "agree" then "agreed" might be replaced with the synonym "concur" but with the lingering "d". It would end up looking like, "concurd".
We could leave them for another question coming soon, if the /B does not work, unless you have a ready solution off the top of your head:

Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character? If I change ReplaceNoCase to Replace, then the capitalized word is missed.

First I'll experiment with your \b suggestion. It should work.
0
 
LVL 52

Expert Comment

by:_agx_
ID: 38725957
We could leave them for another question coming soon, if the /B does not work

It'll work as long as the synonyms don't contain special characters. If they do, you'd have to escape them. Since they're words, hopefully there won't be many. But unfortunately some common characters have special meaning, like a period "."

Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character?

That one's beyond my regex skills.  A regex guru might know a trick - or at least if it's possible. The only thing I can think is an ugly hack. Do two searches. One to find the whole word after a sentence marker like a period "." or "!". Then do

      ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

to find and replace the others.  But I'd only do that as a last resort. Hopefully a regex guy might know of something more elegant than that.
0
 

Author Comment

by:Qsorb
ID: 38726075
No special characters. It's working very well using

 ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

What a difference this will make. You saved me lots of headache. Thanks so much.
0
 
LVL 52

Expert Comment

by:_agx_
ID: 38726095
Great, glad I could help with this one!
0

Featured Post

Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Sometimes databases have MILLIONS of records and we need a way to quickly query that table to return the results me need. Sure you could use CFQUERY but it takes too long when there are millions of records. That is why SOLR was invented. Please …
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question