Solved

How to get all hits with apostrophe and colon? Example: Andersons, Anderson's, Anderson:s, Andersons'

Posted on 2008-10-23
8
428 Views
Last Modified: 2012-06-27
In Sweden we have no special genitive characters i.e. Anderson's cat is just written Andersons cat (or Andersons katt in Swedish). But some companies write their names with apostrophe ' and sometimes even colons i.e. Anderson's or Anderson:s (or Mc Donald's for example).

In our company name SQL query we use a combination of ordinary search and full text search with the "contains("formsof thesaurus")". When searching for andersons, no hits for anderson's or anderson:s are returned and this is our problem. Likewise when search for example anderson's, no hits for andersons (or anderson:s) are returned.

I have tried to set this in the thesaurus file but it seems to just consider whole words and it is not working for example to set <sub>s</sub>      <sub>'s</sub>       <sub>:s</sub>.

One way would be to check the incoming search word(s) programatically for apostrophes or colons and do a search for each. That is; if @searchWord= Anderson's then search for Anderson's then replace the apostrophe to a colon and search for Anderson:s and then remove the colon to search for Andersons but this doesn't seem to be a good solution performance wise...?

Another way would be to put <sub>Andersons</sub>        <sub>Anderson's</sub>         <subAnderson:s></sub>    in the full text search thesaurus file but this is not a realistic solution since there are so many variants; Svenssons, Peters, Harrys etc. etc. to be set in that file.

How do I solve this problem?
0
Comment
Question by:ClaburP
  • 4
  • 2
8 Comments
 
LVL 59

Accepted Solution

by:
Kevin Cross earned 500 total points
Comment Utility
Something like this may work.  You can tweak to your needs.  The point is you can do replace on both down to Andersons without any puntuation and then compare.

If the replacements are more complex, you can create a User Defined Function that does the replacement and returns clean string that you can use in the comparisons.

Good luck.
SELECT *

FROM TableName

WHERE REPLACE(REPLACE(CompanyName, ':', ''), '''', '') LIKE '%' + REPLACE(REPLACE(@SearchWord, ':', ''), '''', '') + '%'

Open in new window

0
 

Author Comment

by:ClaburP
Comment Utility
Thanks! That really works exactly as I want. But I also would like to incorporate this into a full text search.
The query now is like below:

set @tempVadvar='FORMSOF(THESAURUS, " ' + @searchWord + ' ")'
SELECT * FROM CompanyTable WHERE CONTAINS (CompanyName, @tempVadVar)

I have tried to apply your solution and still be able to use the full text search index, but quite not succeeded with that.

It would be much appreciated if you have any input on how to do that.

Thankyou

0
 
LVL 59

Assisted Solution

by:Kevin Cross
Kevin Cross earned 500 total points
Comment Utility
I think the problem is that the REPLACE method is comparing the data after the replace has happened for both pieces of data whereas in the CONTAINS I think it will ultimately look at the original data which may be the one that has ' or : so REPLACE on the @searchWord isn't as effective and CompanyName in the CONTAINS statement must be a column name and not a formula.

{I don't recall ever trying this on a view, so forgive me if doesn't work as I don't have SQL to test right now.}

+Create a VIEW of CompanyTable which shows all the same data as CompanyTable but with CompanyName cleaned.  (e.g. SELECT *, REPLACE(REPLACE(CompanyName, ':', ''), '''', '') AS CompanyNameClean FROM CompanyTable)
+Alter your query to first clean the @searchWord before it is used and also use the VIEW for full text search:

SET @searchWord = REPLACE(REPLACE(@searchWord, ':', ''), '''', '')
SET @tempVadvar = 'FORMSOF(THESAURUS, " ' + @searchWord + ' ")'
SELECT * FROM CompanyView WHERE CONTAINS (CompanyNameClean, @tempVadVar)

0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:ClaburP
Comment Utility
Ok, thanks again! Well there are some problems with that solution. Firstly, our query is within a stored procedure so I can't use a view. But of course I could create a table variable or temporary table instead. But then the new problem is that I can not use the full text search with FORMSOF(THESAURUS) on the @table variable since it is not full text indexed.

One solution would be to create a new column "StrippedCompanyName" in the static CompanyTable with the stripped company names in it, and then do the full text search on that column.
But I was hoping there is better and smoother solution than that... :-)
0
 

Author Comment

by:ClaburP
Comment Utility
Hello. This topic is not abandoned but I think I only have got a partial solution in the answers. That is why I have doubts in marking it accepted solution. Is there a button "Partially solved"...?
0
 

Author Closing Comment

by:ClaburP
Comment Utility
Only partially solved. Not solved when using with Fulltext search.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
>Is there a button "Partially solved"...?
no.

from my experience, the only "solution" is to have 2 columns:
1 with the original value
1 with the value with the ":" and the "'" removed (possibly in a dedicated table...)

and using the fulltext on that second column (and not the original one)
it will use twice the space, but help to solve this and similar problems...
0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

Let's review the features of new SQL Server 2012 (Denali CTP3). It listed as below: PERCENT_RANK(): PERCENT_RANK() function will returns the percentage value of rank of the values among its group. PERCENT_RANK() function value always in be…
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now