• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 274
  • Last Modified:

SQL query on a simple table to return realted items

Hi all,

I'm trying to do a query on the following table:

CREATE TABLE `filter_index` (
  `fid` SMALLINT(5) UNSIGNED NOT NULL,
  `link_id` INT(10) UNSIGNED NOT NULL,
  PRIMARY KEY (`fid`,`link_id`)
) ENGINE=MYISAM DEFAULT CHARSET=latin1

Open in new window


It's a many 2 many table - about 6 million rows - fid range is about 650.

Now, the query I'm hoping to do is to lookup an fid (say 20) and return all fid's of which had link_id's associated with the fid 20.

Think of it as 'Ok, now you've picked filter option 20, here's the rest of the filter options available to you from the remaining pool'

Experimental, if there's a smarter way to do this, I'm all ears :)



0
dolythgoe
Asked:
dolythgoe
  • 3
  • 3
  • 3
2 Solutions
 
Kevin CrossChief Technology OfficerCommented:
Cheers, David!

Is this what you are looking for?
It uses the initial fid to find link_id's that you then use in the outer query.

SELECT `fid`, `link_id`
FROM `filter_index` fi1
WHERE EXISTS (
   SELECT 1
   FROM `filter_index` fi2
   WHERE fi2.`fid` = 20
   AND fi2.`link_id` = fi1.`link_id`
)
;

Open in new window

0
 
dolythgoeAuthor Commented:
Hi Kevin!

Cool thanks for that - I did an explain:

id  select_type         table  type    possible_keys  key      key_len  ref                             rows     Extra                    
1   PRIMARY             fi1    index                  PRIMARY  6                                        4040187  Using where; Using index  
2   DEPENDENT SUBQUERY  fi2    eq_ref  PRIMARY        PRIMARY  6        const,rampants_dbp.fi1.link_id  1        Using index              

So it's doing a full table scan, do you think it's possible to improve on that with forcing an index?

Thanks again for your help :)
0
 
johanntagleCommented:
Try with a join instead of a subquery:

select fi1.fid, fi1.link_id
from `filter_index` fi1 join `filter_index` fi2 on (fi1.link_id=fi2.link_id)
where fi2.fid=20

I think you will still need a separate index on link_id though.
0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
dolythgoeAuthor Commented:
Ok, so that query was really fast but now I have the annoyance of the GROUP BY or DISTINCT command to make sure it only gives me the unique fid's. Don't need the link_id's

I tried a few things:

SELECT SQL_NO_CACHE fi1.fid
FROM `filter_index` fi1 JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
WHERE fi2.fid=20
GROUP BY fi1.fid

Open in new window


8 seconds (without GROUP BY = 0.05)

Which the query profiler in SQL yog says most of the time is spent 'Sending data' <- I'm still trying to understand what that actually means!

EXPLAIN

Now, adding an index to link_id...

Re-run the query with GROUP BY

:D

0.045 seconds! MUCH BETTER!

EXPLAIN SELECT SQL_NO_CACHE fi1.fid
FROM `filter_index` fi1 JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
WHERE fi2.fid=20
GROUP BY fi1.fid

Open in new window


EXPLAIN:

id  select_type  table  type  possible_keys      key        key_len  ref                       rows   Extra                                        
1   SIMPLE       fi2    ref   PRIMARY,NewIndex1  PRIMARY    2        const                     13322  Using index; Using temporary; Using filesort  
1   SIMPLE       fi1    ref   NewIndex1          NewIndex1  4        rampants_dbp.fi2.link_id  3                                                    

Much better - uses filesort and temporary but that's cool as it's in memory.

From here I want to entertain the small matter of 'AND' with different fids.

So this is for an AJAX filter 'grey out' function - so I can ask mysql to give me all appliceable filter options that are still available if a number of filters are picked. That doesn't stop there though, the fid's are in groups so (1,2,3,4,5,6) are colours for example and (7,8,9,10,11,12) are sizes <- like ecommerce product selector.

I'll give you the points but if you could share some brainpower on this I would be most grateful!

SO there are 20 groups, 650 filter_ids and 1-2 million link_id's

Each group has a number of filter_id's which have a number of link_ids.

If filter_id's are picked within a group GROUP 1 (2, 4, 5) selected and GROUP 2 (8,9,10) selected, I want to be able to query so it looks at like:

SHOW ME ALL AVALIABLE fids FROM links subset WHEN fid IS (1 OR 2 OR 3) AND (3 OR 4 OR 5)

An example that uses this type of thing is asos.com - have a play with their product filter and you'll see what I mean - it greys out other options as and when you pick filter options. Great functionality but at how much computing cost I wonder..

I can't thank you enough for helping me out!











0
 
Kevin CrossChief Technology OfficerCommented:
Very good catch on the JOIN.
0
 
dolythgoeAuthor Commented:
Ahha so I've put together some answers from previous posts and came up with this (think Johann picked up on this one):

SELECT fi1.fid
FROM `filter_index` fi1 
JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
JOIN `filter_index` fi3 ON (fi1.link_id=fi3.link_id)
JOIN `filter_index` fi4 ON (fi1.link_id=fi4.link_id)
JOIN `filter_index` fi5 ON (fi1.link_id=fi5.link_id)
WHERE fi2.fid IN (50, 25, 30)
AND fi3.fid IN (38)
AND fi4.fid IN (5)
AND fi5.fid IN (4)
GROUP BY fi1.fid

Open in new window


Where I'm joining for every group and can use IN() to do the ORs and the JOINs to do the ANDs..

This seems to work pretty fast - is that the holy grail I seek do you think?
0
 
Kevin CrossChief Technology OfficerCommented:
Seems so given your requirement and millions of rows. Very nice work, johanntagle!
0
 
johanntagleCommented:
Thanks mwvisa1!

dolythgoe - yes most of the time joins do better than subqueries provided the join columns are also indexed.
0
 
johanntagleCommented:
oh, and glad I could help! =)
0

Featured Post

Restore individual SQL databases with ease

Veeam Explorer for Microsoft SQL Server delivers an easy-to-use, wizard-driven interface for restoring your databases from a backup. No expert SQL background required. Web interface provides a complete view of all available SQL databases to simplify the recovery of lost database

  • 3
  • 3
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now