I'm trying to work through a particular challenge with SQL to get to an end result which will score proximity of word_ids. I've got this table:
CREATE TABLE IF NOT EXISTS `linkindex` (
`link_id` int(10) unsigned NOT NULL,
`word_id` mediumint(10) unsigned NOT NULL,
`cat_id` tinyint(3) unsigned NOT NULL,
`pos` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`word_id`,`link_id`,`cat_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
And I have this search query:
SELECT link_id, SUM(cat_id) AS score
WHERE word_id IN ('.$word_id_list.')
GROUP BY link_id
ORDER BY score DESC
PHP feeds a series or word_ids into the IN (1,2,3,4,5,6) section.
As you can see,at the mo this query orders by the score on the cat_id column but I want to use the pos column to score proximity.
So a run through example - words 'one' 'two' are converted to id numbers 1 and 2.
These are then placed in the query to find matches on the word_id column in linkindex. The pos is the position of these words in the sentance when the index was first built up.
What I'm looking to do is apply some maths on the positions to find the 'distance' between the two matches with a view to building an overall relevancy score.
Apologies if I sound confusing! Any pointers on what I have here would be a massive help and I'm sure a good challenge for an SQL guru!