Solved

Using Soundex in SQL Server to match ONE word within string of MANY words.

Posted on 2009-04-10
6
1,508 Views
Last Modified: 2012-05-06
I need to build a query that will return records from a table where a text field, which often consists of a string of several words, contains within it a single word that is similar to some other supplied word.  I know that Soundex is limited, but it is probably good enough for my purposes.  Take for instance the following string:

"This string has several words"

The set of Soundex values for these five words looks like this:

"T200 S365 H200 S164 W632"

...but using the entire text string as the argument for the Soundex function returns only the value of the first word, T200, so I don't have a ready-made function to actually produce the above list of all five Soundex codes for searching.

Again, my goal is to return only those records where one or more words in the entire string contains the same Soundex value as the single word supplied for the search.  In this sample case, if supplied with the word "Haze", which has a Soundex value of H200, I would have a match because the third word, "has", is the same Soundex value.

I am working in SQL Server 2005.
0
Comment
Question by:AutomateMyOffice
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3
6 Comments
 
LVL 143

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 500 total points
ID: 24116094
please find the function dbo.ParmsToList on EE, which I use all the time.

for your needs, it could be used like this:


select value,  soundex(value)
  from dbo.parmstolist('This string has several words', ' ') l


in a function, to reconcat the values:

create function dbo.TextToSoundEx(@text varchar(max)) returns varchar(max)
as 
begin 
declare @res varchar(max)
select @res = coalesce(@res + ' ', '') + soundex(value) 
  from dbo.parmstolist('This string has several words', ' ') l 
return @res
end 

or in a "simple" query:

Open in new window

0
 

Author Comment

by:AutomateMyOffice
ID: 24117160
Thanks, angelIII, for your quick response.  I am admittedly a non-expert in SQL Server (much more proficient in Access and working my way up the SQL learning curve) so I have had to scratch my head over your answer a bit, especially since the code snippet seems to end prematurely with a colon.  

The first hurdle was in finding a version of ParmsToList that contained the second argument (delimiter).  There is another version floating around that only has a single argument, and that, of course, did not work very well.

I have been able to run ParmsToList, supplying a string of words as the argument, and getting a table in return, but this does not get me where I want to go.  I will need a little more handholding to get the actual results I am after.  Specifically, I need to start with a table that contains two columns, RowID and RowText, as shown in the code snippet (to retain spacing).  

I need to know if it is possible to construct a query (view), such that if I know I am looking for Soundex matches for the word "Haze", the query will return three rows, each having only the RowID field.  The values returned would be A, B and D, since those rows contain words (has, hose and he's) whose Soundex code (H200) matches that of Haze.

RowID    RowText
-----    -------
A        This string has several words
B        The garden hose is too short
C        This will not match
D        I think he's right

Open in new window

0
 

Author Comment

by:AutomateMyOffice
ID: 24119177
OK.  After a few hours of head scratching I did finally figure out how to do this using your two functions.  Your TextToSoundEx function needed to be modified such that the @text parameter took the place of the string 'This string has several words' as the first argument of ParmsToList, and with that in place, I was able to get the desired output using:

SELECT    RowID
FROM         MyTable
WHERE     (dbo.TextToSoundEx(RowText) LIKE '%' + SOUNDEX('haze') + '%')

It is pretty slow, taking about 15 seconds to return 42 records out of about 14,000, but it works.  Any help on streamlining this would be appreciated, but I am glad to at least have a working system.  Thanks!
0
Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24119309
glad I could help.

for the efficiency part, I would create an additional column in the table, and on insert/update of the record, update the soundex value immediately ...
0
 

Author Comment

by:AutomateMyOffice
ID: 24120817
So you are suggesting a calculated third column in MyTable (say, RowSoundex?) that would contain the Soundex values for the words in RowText, storing "T200 S365 H200 S164 W632" in the same record as "This string has several words"?  I suppose I would add these values now for all existing records, just using an update query, and then create a trigger to manage the new and updated records in the future...  That will be my first trigger!  Any pointers on how to do that?  I'm sure it's covered in BOL, but maybe you have a link to a good beginner's tutorial or something?  Thanks again.
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24121704
yes, exactly.

trigger is simply this:
create trigger trg_update_rowsoundindex
 on yourtable
 for insert, update
as
  update t  
    set RowSoundex = dbo.TextToSoundEx(i.RowText) 
   from yourtable t
   join inserted i
     on i.rowid = t.rowid

Open in new window

0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

691 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question