SQL excluding using WHERE or HAVING.

Hello,

After some help from here (thankyou to those guys) I've got this query:

$query2 =	'SELECT sq.link_id
			 , sq.score
			  FROM ( SELECT li.link_id, SUM( li.cat_id ) AS score
					   FROM linkindex li
					  WHERE li.word_id IN ('.$word_id_list.') '.$cat_filter.'
					  GROUP BY li.link_id
				   ) sq
			  JOIN links_date_duration ldd 
				ON sq.link_id = ldd.link_id
			WHERE ldd.link_duration '.$dur_range.'
			AND ldd.link_date '.$date_range.'
			ORDER BY sq.score DESC
			LIMIT '.$start.','.$limit;

Open in new window


PHP feeds in various values and everything is working fine.

My question is to further extend this query to NOT INCLUDE certain word_id's - At the mo it returns values with fed in word_ids:
WHERE li.word_id IN ('.$word_id_list.')

Open in new window


What I now want to do is to return these results but EXCLUDE results that contain specific word_ids.

At first I thought:
WHERE li.word_id IN ('.$word_id_list.') AND word_id != '.$exclude_id.' 

Open in new window

OR a multiple variant
WHERE li.word_id IN ('.$word_id_list.') AND word_id NOT IN ('.$exclude_id_list.')

Open in new window


But that did nothing - and then I thought adding a HAVING in there might work but to be honest I don't understand it enough quite yet so thought someone here could easily point me in the right directinn :)

Again, huge thanks for your input.
dolythgoeAsked:
Who is Participating?
 
johanntagleConnect With a Mentor Commented:
Based on http://dev.mysql.com/doc/refman/5.1/en/rewriting-subqueries.html, try this:

SELECT li.link_id, SUM( li.cat_id ) AS score
FROM linkindex li left outer join (select link_id from linkindex where word_id = '.$exclude_id.'  ) li2 on li.link_id=li2.link_id
WHERE li.word_id IN ('.$word_id_list.') '.$cat_filter.'
AND li2.link_id is null
GROUP BY li.link_id

Again, untested (not sure if MySQL supports left outer join with a subselect cast as a table)
0
 
johanntagleCommented:
HAVING is used with GROUP BY and is not the correct solution for your problem.  Try something like this in your subselect:

SELECT li.link_id, SUM( li.cat_id ) AS score
FROM linkindex li
WHERE li.word_id IN ('.$word_id_list.') '.$cat_filter.'
AND li.link_id not in (select link_id from linkindex where word_id = '.$exclude_id.'  )
GROUP BY li.link_id

NOTE: untested.
0
 
woepwobinCommented:
You can keep the query as it is, and have php remove items from $word_id_list that are in $exclude_id_list:

$word_id_list = implode(',', array_diff(explode(',', $word_id_list), explode(',', $exclude_id)));
$query2 =       'SELECT sq.link_id
...

Open in new window

0
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

 
dolythgoeAuthor Commented:
Hey thanks for both your input.

Johan - The sub-select has a bit of a performance hit so if there's a faster way to achieve the same thing would be most interested :)

Woe - no can do unfortunately, the word_id, link_id combination is primary so there are many occurences of the same link_id's with different word_ids - if you just look for the inclusion list it will bring back link_id's that may also have excluded word_id's.

Consider:

word_id | link_id
5 | 1
6 | 1
7 | 1
8 | 1
9 | 1
 
If I include 5 and 7 it will bring back 1 - If I include then just 5 it will bring back 1 - so an exclude part of the query is important.

Cheers
David
0
 
johanntagleCommented:
What indexes do you have on the table?  You might need one for link_id.  Can you run EXPLAIN on the subquery I made?
0
 
dolythgoeAuthor Commented:
Sure,

Here's the table structure:

CREATE TABLE IF NOT EXISTS `linkindex` (
  `link_id` int(10) unsigned NOT NULL,
  `word_id` int(10) unsigned NOT NULL,
  `cat_id` tinyint(3) unsigned NOT NULL,
  `pos` tinyint(3) unsigned NOT NULL,
  PRIMARY KEY (`word_id`,`link_id`,`cat_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Open in new window


And the explain:

id       select_type       table       type       possible_keys       key       key_len       ref       rows       Extra
1       PRIMARY       linkindex       range       PRIMARY       PRIMARY       3       NULL      600719       Using where; Using index; Using temporary; Using filesort
2       DEPENDENT SUBQUERY       linkindex       ref       PRIMARY       PRIMARY       7       const,func       92562       Using index

Speed of query: 3.1416 sec
0
 
dolythgoeAuthor Commented:
Hello,

Sorry to not get back in a while.

Unfortunately, that query did not work - it didn't error but just hung there forever!

Cheers
David
0
 
johanntagleCommented:
See if an index on link_id column only will help.
0
 
johanntagleCommented:
I mean, add an index for link_id column, (don't remove existing indexes)
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
>it didn't error but just hung there forever!
missing index on some fields?
what's the EXPLAIN plan for the query?

0
 
dolythgoeAuthor Commented:
Sry, should have closed this off earlier.
0
All Courses

From novice to tech pro — start learning today.