?
Solved

MYSQL FullTextSearch Doesnt Return for all matches

Posted on 2013-01-04
7
Medium Priority
?
471 Views
Last Modified: 2013-01-07
Hi Guys,

I have this query.

SELECT ItemId,MenuId,SubMenuId,Text, PageTitle,
MATCH(Content) AGAINST ('software') AS SearchScore
FROM SubMenuItems WHERE MATCH(Content) AGAINST ('software')
ORDER BY SearchScore DESC

now running it agains a fulltext enabled myisam table.

This is in the content column.

We are located in Central New Jersey. Our support team is also located in Central New Jersey. You will always get a perfect english speaking person on the phone when you call us. We will be able to help you remotely using the software TeamViewer to access your computer and help you with your needs. TeamViewer is extremely safe, and no one can connect to your computer without your permission. You have to initiate a session. Each session has a different random generated password and id, to insure that no one that had access to your computer once, will be able to access it again without new information from you the customer.

The word software is in the text, but is not returned by my query.


if i do this query

SELECT ItemId,MenuId,SubMenuId,Text, PageTitle,
MATCH(Content) AGAINST ('person') AS SearchScore
FROM SubMenuItems WHERE MATCH(Content) AGAINST ('person')
ORDER BY SearchScore DESC

it does return the content. WHY? am i totally missing something here?
0
Comment
Question by:tim_carter
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3
7 Comments
 
LVL 24

Accepted Solution

by:
johanntagle earned 2000 total points
ID: 38748221
I'm assuming this is a very small table?  Please see the latter paragraphs at http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html.  To quote from there:

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Consequently, a word that is present in many documents has a lower weight (and may even have a zero weight), because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row.

Such a technique works best with large collections (in fact, it was carefully tuned this way). For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results.

If you want a simple keyword search use boolean mode see http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html.  Basically your match clause will be something like:

 match(content) against ('+software' in boolean mode);
0
 
LVL 16

Expert Comment

by:theo kouwenhoven
ID: 38750250
Try

WHERE Content LIKE '%software%')
0
 
LVL 1

Author Closing Comment

by:tim_carter
ID: 38751604
I ended up doing like this

SELECT ItemId,MenuId,SubMenuId,Text, PageTitle,
                MATCH(Content, Code, Text, PageTitle) AGAINST ('+".$rSearchFor."' in boolean mode) AS SearchScore
            FROM SubMenuItems
            WHERE
                MATCH(Content, Code, Text, PageTitle) AGAINST ('+".$rSearchFor."' in boolean mode)
                    OR
                MATCH(Content, Code, Text, PageTitle) AGAINST ('".$rSearchFor."')    
                    OR
                Content LIKE '%".$rSearchFor."%'
                    OR
                Text LIKE '%".$rSearchFor."%'
                    OR
                PageTitle LIKE '%".$rSearchFor."%'
            ORDER BY
                SearchScore DESC

I will still be working on soundex though. thanks
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
LVL 24

Expert Comment

by:johanntagle
ID: 38752216
I wonder why you still need to use the LIKE '%keyword%' pattern.  That basically kills the benefits of using a fulltext index as MySQL will now need to do a full table scan every time.  Is it because of the minimum number of chars a word must have to be indexed by fulltext and/or certain reserved words and numbers it won't index?  I think you should finetune fulltext and know how to work around the limitations instead.  For example - fulltext won't allow you to search for "c++".  The workaround is to create another column with the same content but with "c++" replaced by "cplusplus", index that column instead of the original, but display the original when showing results.  Of course that also requires you to replace the user-entered search params with the same.  Last time I checked using a fulltext index will cause MySQL to ignore any index in another column and won't index pure numbers, so one-time when I also needed to add a "where company_id=12345" in my select statement, I added a "companyid12345" in my extra column instead and adjusted my match clause to include it in the boolean search.

Alternatively, you can employ a 3rd party engine like sphinx (http://sphinxsearch.com) instead.
0
 
LVL 1

Author Comment

by:tim_carter
ID: 38753092
well i was just doing a simple search like 'each'. That did not get returned by fulltext.
0
 
LVL 24

Expert Comment

by:johanntagle
ID: 38753231
It's because 'each' is a stopword.  See http://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html.  Also see http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html for how to edit/override the stopword list and other fine-tuning methods.
0
 
LVL 1

Author Comment

by:tim_carter
ID: 38753249
haha ok, thanks so muc
0

Featured Post

Does Your Cloud Backup Use Blockchain Technology?

Blockchain technology has already revolutionized finance thanks to Bitcoin. Now it's disrupting other areas, including the realm of data protection. Learn how blockchain is now being used to authenticate backup files and keep them safe from hackers.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this blog post, we’ll look at how using thread_statistics can cause high memory usage.
In this article, I’ll talk about multi-threaded slave statistics printed in MySQL error log file.
This is a high-level webinar that covers the history of enterprise open source database use. It addresses both the advantages companies see in using open source database technologies, as well as the fears and reservations they might have. In this…
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …

801 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question