[2 days left] What’s wrong with your cloud strategy? Learn why multicloud solutions matter with Nimble Storage.Register Now


Newbie Full Text Query Question - Strange Ranking Results?

Posted on 2004-07-31
Medium Priority
Last Modified: 2008-02-01

I am trying to speed up our very antiquated search facility on our site using FTS. I just need a few pointers.

I am struggling to produce accurate results on some multiple word seach criteria.

For speed i have a lookup table that is produced from about 10 tables that conatin all text that i want to search on. I use a simple FREETEXTTABLE(SearchTable, *, N'TextStringToSearch')  search on this, which then allows me to use the identities returned together with the rank to get the correct records ranked according to the FREETEXTTABLE results. This is fine and extremly fast. The problem is when using multiple words, for example 'New Zealand'. (I have a country name field) The result will bring back a dozen or so 'New Zealand' reults, then bring back 'New' Then 'Zealand' then continue to giving a low ranking score to 'New Zealand' again. Obviously i cannot tell when a user wants grouping and when they dont, this gets worse when you saerch for 'New Zealand Beaches' For example, is comes back with beaches in other countries first!

Is this normal behaviour? Is there a word proximity thing i can use? Can I weight the columns being searched? Am i completly off track? I dont need google accuracy, but it sure needs to be relevant and quick. Any pointers?

Question by:gravity100
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 4
LVL 75

Accepted Solution

Anthony Perkins earned 500 total points
ID: 11684973
Since you cannot search for phrases (I know New Zealand is not a phrase, but just bare with me ...) with FREETEXTTABLE, consider using instead CONTAINSTABLE which does support it.  You can then do something like this:
CONTAINSTABLE(SearchTable, *, N'"New Zealand"')  
LVL 75

Expert Comment

by:Anthony Perkins
ID: 11684981
Also, just a friendly reminder, of how it works here:
What are my choices?

Author Comment

ID: 11685184
Yes that is usefull, but as i see it that will exclude 'new' and 'zealand' something we dont want to do. I guess i could combine both FREETEXTTABLE  and CONTAINSTABLE, that would give a more accurate result, but it seems overly expensive and wouldnt solve the problem very well. How does the NEAR term work, it doesnt seem to regard text in different fields as near, how near is near? I could concatanate all the search into one field if you think that would work. I currently have the following

SELECT SL.Country_Code, SL.Hotel_Name, SL.Hotel_Code,SL.Currency_Rate,SL.Rate_Value, A.Rank
FROM   SearchLookup AS SL
INNER JOIN CONTAINSTABLE(SearchLookup, *, '(new) OR (zealand) OR (new NEAR zealand) OR ("new zealand")')  AS A ON SL.ID = A.[KEY]
GROUP BY  SL.Country_Code, SL.Hotel_Name, SL.Hotel_Code,SL.Currency_Rate,SL.Rate_Value, A.Rank
ORDER BY A.Rank Desc

..this sql brings up New Delhi as the highest rank...

Am i missing something?
Survive A High-Traffic Event with Percona

Your application or website rely on your database to deliver information about products and services to your customers. You can’t afford to have your database lose performance, lose availability or become unresponsive – even for just a few minutes.

LVL 75

Expert Comment

by:Anthony Perkins
ID: 11685288
>>but as i see it that will exclude 'new' and 'zealand' something we dont want to do.<<
Than I guess I am not following you.  If your search has "new" on its own, and New Delhi is the highest than I can only assume that it must be in your database more times than others.  In other words the way, that I would expect the following search string to work:
'(new) OR (zealand) OR (new NEAR zealand) OR ("new zealand")'

Is to give equal importance to each of the phrases delimited by OR.  So it would seem quite plausible that New Delhi could show up as the highest rank.

As far as the NEAR is concerned, this is from BOL:
NEAR | ~
Indicates that the word or phrase on the left side of the NEAR or ~ operator should be approximately close to the word or phrase on the right side of the NEAR or ~ operator. Multiple proximity terms can be chained, for example:

This means that word or phrase a should be near word or phrase b, which should be near word or phrase c.

Microsoft® SQL Server™ ranks the distance between the left and right word or phrase. A low rank value (for example, 0) indicates a large distance between the two. If the specified words or phrases are far apart from each other, the query is considered to be satisfied; however, the query has a very low (0) rank value. However, if <contains_search_condition> consists of only one or more NEAR proximity terms, SQL Server does not return rows with a rank value of 0. For more information about ranking, see CONTAINSTABLE.

Author Comment

ID: 11685325
Yeah exactly, 'far apart' , whats that supposed to mean, no help MS. I will try concatanating all the searched fields into one field and se how that fairs...

" If the specified words or phrases are far apart from each other, the query is considered to be satisfied" - Not across fields by the looks of it..

Author Comment

ID: 11685638
I only want to use NEARs, i dont want any ORs in the search, that condition can never bring back accurate results.The OR in my example was just iterating the point. I want all results, but 'New Zealand' has to come before 'New' or 'Zealand'.

I have just concatinated all the search fields into one field, I am using a CONTAINSTABLE and a NEAR term on this table, much better results are being returned thanks, although that still does not solve the initial problem, 'New' and 'zealand' results are not being returned.

"if <contains_search_condition> consists of only one or more NEAR proximity terms, SQL Server does not return rows with a rank value of 0"

I guess i'll have to ask the user to broaden there search criteria if this happens.


Author Comment

ID: 11686860

Just to close this thread with the implemented solution:

1. Parse search phrase into words (removing noise words - still to do)
2. Build string from words (ord1 NEAR word2 NEAR word3)
3. Search using: CONTAINSTABLE and NEAR (word1 NEAR word2 NEAR word3 etc..) on whole lookup table
4. Return in rank order

This produces by far the best results

the FREETEXTTABLE does not appear to be very useful for accurate results on free form search text.

" If the specified words or phrases are far apart from each other, the query is considered to be satisfied" - Not across fields by the looks of it.. "
This is not true sorry, so much easier in the light of day...

Thanks all
LVL 75

Expert Comment

by:Anthony Perkins
ID: 11688447
You may want to re-read the EE Guidelines regarding grading standards:
What's the right grade to give?

And specifically this section:
C: Because Experts' reliability are often judged by their grading records, many Experts would like the opportunity to clarify if you have questions about their solutions. If you have given the Expert(s) ample time to respond to your clarification posts and you have responded to each of their posts providing requested information; or if the answers, after clarification, lack finality or do not completely address the issue presented, then a "C" grade is an option. You also have the option here of just asking Community Support to delete the question.
Remember, the Expert helping you today is probably going to be helping you next time you post a question. Give them a fair chance to earn an 'Excellent!' grade and they'll provide you with some amazing support. It's also true that a "C" is the lowest grade you can give, and the Experts know that -- so use it judiciously.

I wish you the best of luck.

Featured Post

Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how the fundamental information of how to create a table.

656 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question