?
Solved

mySQL FULLTEXT search with custom weighting

Posted on 2011-09-29
8
Medium Priority
?
556 Views
Last Modified: 2012-05-12
Hi all,

I've been playing around with FULLTEXT to apply my own weighting.

Here's the query:

SELECT link_id, link_title,
	((MATCH(link_title) AGAINST ('word search' IN BOOLEAN MODE)* 5) +
	(MATCH(link_tags) AGAINST ('word search' IN BOOLEAN MODE)) +
	(MATCH(link_category) AGAINST ('word search' IN BOOLEAN MODE)* 3)) AS score
FROM textdump
WHERE
	MATCH(link_title, link_tags, link_category) AGAINST ('word search' IN BOOLEAN MODE)
ORDER BY
	score DESC
LIMIT
	0,16;

Open in new window


There's 4 FULLTEXT indexes:

link_title,
link_tags,
link_category,
link_title, link_tags, link_category

On 1.6 million rows

This takes 8 seconds and I'm wondering if there's a way to apply a weighting within the MATCH() function or something so I don't need to use the 3 SELECT MATCH()'S to speed it up - or just a much better way!

EXPLAIN doesn't say much:

id  select_type  table     type      possible_keys                 key                           key_len  ref  rows  Extra                        
1   SIMPLE       textdump  fulltext  fulltext_title_tags_category  fulltext_title_tags_category  0             1     Using where; Using filesort  

Any pointers much appreciated!


0
Comment
Question by:dolythgoe
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
8 Comments
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 2000 total points
ID: 36819023
I do not believe there is a better way to get the score other than I do not believe you want the IN BOOLEAN MODE there. The IN BOOLEAN MODE is for the WHERE condition where as you want the SELECT to return a "score" which you can then weight based on which column it was. Because of the nature of FULLTEXT, I am not sure what more you could do unless you have a more restrictive filter you can put on the data first.
0
 

Author Comment

by:dolythgoe
ID: 36819039
hmm you say more restrictive filter - perhaps a subselect might work better in this case? get a quick match first and then do a series fo smaller matches with scores within those? Just speculating but a theory...
0
 

Author Comment

by:dolythgoe
ID: 36819045
..actually that's what the WHERE is doing anyway, forget that!
0
Optimize your web performance

What's in the eBook?
- Full list of reasons for poor performance
- Ultimate measures to speed things up
- Primary web monitoring types
- KPIs you should be monitoring in order to increase your ROI

 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36819055
Yes, I was trying to think of a good way to show it, but not knowing the data. I also saw in your other question you are using PHP and trying to see if there is something you can do with that. One thought is that you start with just:

SELECT link_id, link_title, (MATCH(link_title) AGAINST ('word search') score
FROM textdump
WHERE MATCH(link_title) AGAINST ('word search' IN BOOLEAN MODE)
ORDER BY score DESC
LIMIT 0,16;

If you do not get 16 rows, then you can move on to grab the next rows from link_category matches.

Not sure if that works, but just a thought.
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36819070
Yes, you are sub selecting based on what matches, but then you have to get scores for that sub select, order by the whole row set, then limit to 16 rows. If this is a much more than 16 then the order by process can be your biggest headache. If the weighting of the scores always pushes a out 16 matches on link_title my suggestion may work out.
0
 

Author Comment

by:dolythgoe
ID: 36819083
Yes, this one is actually for a search itself but you know what, I'm kind of glad this has turned out not to be the best way as I developed a full set of indexing scripts to create many-to-many id tables for word_id 2 link_id which is scalable and scans pretty quick (aslong as you assign enough temp_table_size memory) which also scores in the way I want. I was getting scared that fulltext was faster and I guess it is when using it's own scoring and index but without being able to influence the weighting used in its algorithm you're stuck with MySQL's interpretation.

A few upgrades to fulltext would be handy, a way of indexing across multiple columns and add a multiplier to the score for those columns would be awesome!

Thanks for your help again on this subject...never an easy one!
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36819103
Very nice. Glad you got what you needed.
Best regards and happy coding,

Kevin
0
 

Author Closing Comment

by:dolythgoe
ID: 36819135
More of a confirmation than a solution - fulltext is great for some things but lacks flexibility in places. Might send this idea to MySQL..

Thanks Kevin
0

Featured Post

Get MongoDB database support online, now!

At Percona’s web store you can order your MongoDB database support needs in minutes. No hassles, no fuss, just pick and click. Pay online with a credit card. Handle your MongoDB database support now!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

As a database administrator, you may need to audit your table(s) to determine whether the data types are optimal for your real-world data needs.  This Article is intended to be a resource for such a task. Preface The other day, I was involved …
When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question