Solved

Replace strings found in SQL database with replacement synonyms

Posted on 2012-12-26
Last Modified: 2012-12-27
<cfquery name="qNews" datasource="QNEWS">
  select top 1 story_body,storyID
  from story
  where StoryID = 4886
</cfquery>

<cfquery name="MatchInfo" datasource="FOOTER">
  select *
  from words
</cfquery>
  
<cfloop query="MatchInfo">

<cfif FindNoCase("MatchInfo.root",  qNews.story_body)>
 <cfset TheStory = ReplaceNoCase(qNews.story_body, " #MatchInfo.root# ", " #qNews.story_body# ", "All")>
 <cfelse>
  <cfset TheStory = qNews.story_body>
</cfif> 
 
</cfloop>

<cfoutput>
#ParagraphFormat(TheStory)#
</cfoutput>



I'm attempting to parse a page of text stored in an sql database and replace certain strings with other matching words.

FOOTER is a database of a couple thousand words, each with a matching synonym. The sql 2000 database contains two rows, ROOT, and S1. ROOT is the common word and S1 is the synonym for that word.

I'm attempting to parse through the text in "qNews.story_body", but the data type is TEXT and I guess one cannot search and parse through that text?

What I need help with is making certain my query and loop would be correct if I was using VARCHAR data type, assuming I will be using a varchar data type instead of the problematic TEXT data type.

Once I get that code working correctly, I'll worry about the problem with the TEXT data type.

I can always copy the text elsewhere, temp table, hard drive, etc, and work with it. But obviously, I'd rather not. I will need some ideas where or how.

I'm using SQL 2000 and we will not upgrade so please don't ask. Other than upgrading or converting the data to VARCHAR (which I cannot because story_body is all too often larger than 8000 characters), I'll need to find a way to copy, manipulate, or otherwise get this concept to work. But let's not worry about this part until I have the code snippet I included written correctly or someone tells me my code should work as is, if I was using VARCHAR.
Question by:Qsorb
8 Comments
 
LVL 28

Expert Comment

by:Ryan McCauley
ID: 38724347
Though you can't do a standard REPLACE on a TEXT field, you can use PATINDEX to find instances of your words, and then use the UPDATETEXT statement to replace values (as per this example):

http://blogs.x2line.com/al/archive/2008/05/03/3417.aspx

Would that meet your needs? A bit cumbersome, but if you're stuck on SQL 2000 (and so stuck with the less-than-ideal TEXT type), you don't have a lot of options.

I know you can't upgrade, but do you have access to a SQL 2005+ server on the network that could link to your SQL 2000 server? If you do, you could set up a linked server, pull the TEXT value in question into a VARCHAR(MAX) value, and then perform your manipulation on the newer server. Again, not sure if it's a possibility, but wanted to mention it.
 
LVL 36

Expert Comment

by:SidFishes
ID: 38724878
hmmmm.... not sure what the issue is. You are setting a CF variable from a TEXT field and running the replace -on the CF variable-. What the DB can or can't do is irrelevant at this point. CF doesn't have an 8000 char limit, so there shouldn't be any problem.

from docs...
"Strings can be of any length, limited by the amount of available memory on the ColdFusion Server. There is, however, a 64K limit on the size of text data that can be read from and written to a ColdFusion database or HTML text area. The ColdFusion Administrator lets you increase the limit for database string transfers, but doing so can reduce server performance. To change the limit, select the Enable retrieval of long text option on the CF Settings page for the data source"

As far as I can tell, your code should work... If you want to save the data back to the source db, you'd just use an update query on the modified CF variable...
 
LVL 36

Expert Comment

by:SidFishes
ID: 38724915
and just to clarify - CF has only ONE type of text variable and that's STRING. It's only when interacting with a DB that it matters...which is why we use cfqueryparam to bind datatypes
 
LVL 52

Accepted Solution

by:
_agx_ earned 500 total points
ID: 38725528
> The sql 2000 database contains two rows, ROOT, and S1. ROOT is the common word and S1 is the synonym for that word.

You mean two columns, right?  Since you're doing multiple replacements, you need to perform the replace on #TheStory#, not #qNews.story_body#. Otherwise, the previous replace results are discarded each time you loop.  Here's a working example:

<!--- initialize variable _before_ the loop --->
<cfset TheStory = qNews.story_body>
<cfloop query="MatchInfo">
      <cfif FindNoCase(MatchInfo.root, TheStory)>
                 <!--- replace "root" value with "s1" value --->
             <cfset TheStory = ReplaceNoCase(TheStory, MatchInfo.root, MatchInfo.s1, "All")>
      </cfif>
</cfloop>

<cfoutput>
     #ParagraphFormat(TheStory)#
</cfoutput>

Keep in mind simple string replacements do not care about whole words. So it can match more than you want it to. A replace of "fuse" will also pick up words that contain those letters like "confused" and "defused". If you only want to replace whole words you'll need a regex. I'm *not* great with those... but IF your synonyms don't contain any special characters you could use "\b" (whole word boundary). This *should*  replace whole words only:

     ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")
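
For anyone who wants to try the whole-word idea outside ColdFusion, here is a minimal sketch in Python (not CFML). It assumes, like the line above, that the root words contain no regex special characters. The function name and the sample word pairs are made up for illustration and simply stand in for the ROOT and S1 columns:

```python
import re

def replace_whole_words(story, synonyms):
    """Replace each root with its synonym, whole words only, ignoring case."""
    for root, synonym in synonyms:
        # \b anchors the match at word boundaries, so "fuse" is not
        # matched inside "confused" or "defuse".
        pattern = r"\b" + root + r"\b"
        # Feed the result of each pass into the next one, so that
        # earlier replacements are kept.
        story = re.sub(pattern, synonym, story, flags=re.IGNORECASE)
    return story

# Hypothetical sample data standing in for the ROOT / S1 columns.
synonyms = [("fuse", "merge"), ("agree", "concur")]
text = "We agree the fuse confused them; they agreed to defuse it."
print(replace_whole_words(text, synonyms))
# We concur the merge confused them; they agreed to defuse it.
```

Note that "agreed" is left alone rather than turned into "concurd", which is the behaviour the \b boundaries are meant to give.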
 

Author Closing Comment

by:Qsorb
ID: 38725808
I tried TONS of ways but you made it look simple.
There's no way around it, I must be a SimpleTon.

> Keep in mind simple string replacements do not care about whole words.

So true!

Using a word such as "agree", then "agreed" might be replaced with the synonym "concur" but with the lingering "d". It would end up looking like "concurd".
We could leave that for another question coming soon, if the \b does not work, unless you have a ready solution off the top of your head:

Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character? If I change ReplaceNoCase to Replace, then the capitalized word is missed.

First I'll experiment with your \b suggestion. It should work.
 
LVL 52

Expert Comment

by:_agx_
ID: 38725957
> We could leave that for another question coming soon, if the \b does not work

It'll work as long as the synonyms don't contain special characters. If they do, you'd have to escape them. Since they're words, hopefully there won't be many. But unfortunately some common characters have special meaning, like a period "."
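
To make the escaping point concrete, here is a small sketch in Python (not CFML), using Python's own `re.escape`. Since the pattern is built from the root word, that is the value that needs escaping. The helper name and the sample text are invented for illustration:

```python
import re

def whole_word_pattern(root):
    # re.escape neutralises regex metacharacters such as "." so that
    # they match literally instead of acting as wildcards.
    return re.compile(r"\b" + re.escape(root) + r"\b", re.IGNORECASE)

text = "version 3.5 and build 325"

# Unescaped: "." matches any character, so "325" is hit as well.
unescaped = re.sub(r"\b3.5\b", "X", text)

# Escaped: only the literal "3.5" is replaced.
escaped = whole_word_pattern("3.5").sub("X", text)

print(unescaped)  # version X and build X
print(escaped)    # version X and build 325
```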

> Do you have a method to preserve the first word at the beginning of each sentence, that is, preserve its capitalized first character?

That one's beyond my regex skills.  A regex guru might know a trick - or at least if it's possible. The only thing I can think is an ugly hack. Do two searches. One to find the whole word after a sentence marker like a period "." or "!". Then do

      ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

to find and replace the others.  But I'd only do that as a last resort. Hopefully a regex guy might know of something more elegant than that.
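
One possible alternative to the two-pass hack, sketched in Python (not CFML) and only as an illustration of the idea: use a replacement callback that looks at the matched word and capitalises the synonym whenever the match itself began with a capital. The function name is hypothetical, and whether the same approach ports cleanly to ColdFusion's regex functions is not something this sketch shows:

```python
import re

def replace_preserving_capital(story, root, synonym):
    """Whole-word, case-insensitive replace that keeps a leading capital."""
    pattern = re.compile(r"\b" + re.escape(root) + r"\b", re.IGNORECASE)

    def swap(match):
        found = match.group(0)
        # If the matched word started with a capital letter,
        # capitalise the first letter of the synonym as well.
        if found[:1].isupper():
            return synonym[:1].upper() + synonym[1:]
        return synonym

    return pattern.sub(swap, story)

print(replace_preserving_capital("Agree with me. I agree.", "agree", "concur"))
# Concur with me. I concur.
```

This keys off the capitalisation of the matched word rather than trying to detect sentence boundaries, so a capitalised match in the middle of a sentence would also keep its capital.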
 

Author Comment

by:Qsorb
ID: 38726075
No special characters. It's working very well using

 ReReplaceNoCase(TheStory, "(\b"& MatchInfo.root &"\b)", MatchInfo.s1, "All")

What a difference this will make. You saved me lots of headache. Thanks so much.
 
LVL 52

Expert Comment

by:_agx_
ID: 38726095
Great, glad I could help with this one!