Searching two database columns for similarities

Hi Experts,

Quite simply, I'm looking for a way to compare a particular column in Table 1 with a particular column in Table 2.

No, I'm not crazy: here's a better way of explaining it:

Table 1 contains knowledgebase articles. The column I want to use as my search parameters for searching Table 2 is called 'Description'.
Table 2 contains a list of 'resolutions', or solutions to a problem.

My problem is that it's easy to use LIKE, SOUNDEX or DIFFERENCE on any one term, but if the description is a long string, how can I query Table 2 for the closest matching solution?

For example, a typical case might be:

TABLE 1 DESCRIPTION:
 Today, my printer faield. In fact, it failed twice. I've checked the toner and power cable and both are OK. Can you help?

TABLE 2's RECORDS:
  Printers need regular topping up with Toner. If yours has run out, contact IT Support to get some.
  Has your printer stopped working? Check the printer cables are plugged in!
  If your mouse sticks, give it a clean. Sometimes dirt causes it to function improperly.


SO....................

Can we use each word individually in TABLE 1 DESCRIPTION to search through all the descriptions in TABLE 2, ranking by the most relevant first?

Cheers!
OnError_FixAsked:
Who is Participating?
 
Eugene ZCommented:
check Red-Gate SQL Data Compare
 utility /Download a 14-day, fully functional, free trial./
http://www.red-gate.com/sql/SQL_Data_Compare_facts.htm
---------------------
Another way -> bcp in text files each column and compare files -
google search for compare text files:
http://www.google.com/search?as_q=&num=10&hl=en&btnG=Google+Search&as_epq=compare+text+files&as_oq=&as_eq=&lr=&as_ft=i&as_filetype=&as_qdr=all&as_occt=any&as_dt=i&as_sitesearch=&safe=images
0
 
rafranciscoCommented:
You may look into Full Text Seach for this.  I have not worked with it but looks like a good candidate.

The hard part with this is having "each word individually search through all the descriptions".  First you have to have a function or stored procedure that will parse Table 1 Description and split it into individual words.  Then from this list, you have to get rid of common words such as "a", "the","it" and so forth.  I suggest creating a table that will list words that you don't want included in your search.

Once you have a list of words to search for in Table 2, then you have to process each word and count the number of times it appeared in the Description of Table 2.  This can easily be done by a user-defined function.
0
 
OnError_FixAuthor Commented:
Hello,

Thanks for your help. Not sure full-text is the route I want to go down, to be honest. However, I'd be interested in a sample of the user-defined function you mentioned?

Thanks
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
rafranciscoCommented:
Here's a user-defined function that will return the number of times a search string appears in the given input:

CREATE FUNCTION [dbo].[ufn_CountString] ( @pInput VARCHAR(8000), @pSearchString VARCHAR(100) )
RETURNS INT
BEGIN

RETURN (LEN(@pInput) - LEN(REPLACE(@pInput, @pSearchString, ''))) / LEN(@pSearchString)

END
GO

Here's an example on how to use it:

DECLARE @vInput           VARCHAR(8000)
DECLARE @vSearchString    VARCHAR(50)

SET @vInput = 'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG'
SET @vSearchString = 'THE'

SELECT dbo.ufn_CountString ( @vInput, @vSearchstring)

This returns 2 because the word "THE" occurs 2 times in the input.
0
 
OnError_FixAuthor Commented:
Fantastic function,

The problem I have now is how to count each word in the given Update ID, to return a set of results from Table 2 but with their rank caluclated in a single SUM column.

I.E.

@SearchKeywords = ('Quick', 'Fox')

Table 1                             Table 2
QUICK BROWN FOX...         1 + 1 = 2
JONNY FOX                        1 = 1

Any ideas how I can accomplish this?

I have written a vb.net function to filter out all the 'common' words, and generate a string of keywords for the remaining parts. Now all I gotta do is put SQL to work!

Thanks
0
 
rafranciscoCommented:
I suggest looping through your search key words and loading the results into a temporary table then you can do a SUM on the temporary table.  It is something like this:

For each word in your word list
BEGIN
    INSERT INTO YourTempTable (Table2ID, SearchWord, WordCount)
    SELECT ID, @SearchWord, dbo.ufn_CountString ( Description, @SearchWord )
    FROM Table2

    Get Next Word
END

SELECT Table2ID, SUM(WordCount) AS WordCount
FROM YourTempTable
GROUP BY Table2ID
ORDER BY SUM(WordCount) DESC
0
 
rafranciscoCommented:
OnError_Fix, were you able to solve this problem in any way?
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.