SQL query on a simple table to return realted items

Hi all,

I'm trying to do a query on the following table:

CREATE TABLE `filter_index` (
  `link_id` INT(10) UNSIGNED NOT NULL,
  PRIMARY KEY (`fid`,`link_id`)

Open in new window

It's a many 2 many table - about 6 million rows - fid range is about 650.

Now, the query I'm hoping to do is to lookup an fid (say 20) and return all fid's of which had link_id's associated with the fid 20.

Think of it as 'Ok, now you've picked filter option 20, here's the rest of the filter options available to you from the remaining pool'

Experimental, if there's a smarter way to do this, I'm all ears :)

Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Kevin CrossChief Technology OfficerCommented:
Cheers, David!

Is this what you are looking for?
It uses the initial fid to find link_id's that you then use in the outer query.

SELECT `fid`, `link_id`
FROM `filter_index` fi1
   FROM `filter_index` fi2
   WHERE fi2.`fid` = 20
   AND fi2.`link_id` = fi1.`link_id`

Open in new window

dolythgoeAuthor Commented:
Hi Kevin!

Cool thanks for that - I did an explain:

id  select_type         table  type    possible_keys  key      key_len  ref                             rows     Extra                    
1   PRIMARY             fi1    index                  PRIMARY  6                                        4040187  Using where; Using index  
2   DEPENDENT SUBQUERY  fi2    eq_ref  PRIMARY        PRIMARY  6        const,rampants_dbp.fi1.link_id  1        Using index              

So it's doing a full table scan, do you think it's possible to improve on that with forcing an index?

Thanks again for your help :)
Try with a join instead of a subquery:

select fi1.fid, fi1.link_id
from `filter_index` fi1 join `filter_index` fi2 on (fi1.link_id=fi2.link_id)
where fi2.fid=20

I think you will still need a separate index on link_id though.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Cloud Class® Course: CompTIA Cloud+

The CompTIA Cloud+ Basic training course will teach you about cloud concepts and models, data storage, networking, and network infrastructure.

dolythgoeAuthor Commented:
Ok, so that query was really fast but now I have the annoyance of the GROUP BY or DISTINCT command to make sure it only gives me the unique fid's. Don't need the link_id's

I tried a few things:

FROM `filter_index` fi1 JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
WHERE fi2.fid=20
GROUP BY fi1.fid

Open in new window

8 seconds (without GROUP BY = 0.05)

Which the query profiler in SQL yog says most of the time is spent 'Sending data' <- I'm still trying to understand what that actually means!


Now, adding an index to link_id...

Re-run the query with GROUP BY


0.045 seconds! MUCH BETTER!

FROM `filter_index` fi1 JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
WHERE fi2.fid=20
GROUP BY fi1.fid

Open in new window


id  select_type  table  type  possible_keys      key        key_len  ref                       rows   Extra                                        
1   SIMPLE       fi2    ref   PRIMARY,NewIndex1  PRIMARY    2        const                     13322  Using index; Using temporary; Using filesort  
1   SIMPLE       fi1    ref   NewIndex1          NewIndex1  4        rampants_dbp.fi2.link_id  3                                                    

Much better - uses filesort and temporary but that's cool as it's in memory.

From here I want to entertain the small matter of 'AND' with different fids.

So this is for an AJAX filter 'grey out' function - so I can ask mysql to give me all appliceable filter options that are still available if a number of filters are picked. That doesn't stop there though, the fid's are in groups so (1,2,3,4,5,6) are colours for example and (7,8,9,10,11,12) are sizes <- like ecommerce product selector.

I'll give you the points but if you could share some brainpower on this I would be most grateful!

SO there are 20 groups, 650 filter_ids and 1-2 million link_id's

Each group has a number of filter_id's which have a number of link_ids.

If filter_id's are picked within a group GROUP 1 (2, 4, 5) selected and GROUP 2 (8,9,10) selected, I want to be able to query so it looks at like:

SHOW ME ALL AVALIABLE fids FROM links subset WHEN fid IS (1 OR 2 OR 3) AND (3 OR 4 OR 5)

An example that uses this type of thing is asos.com - have a play with their product filter and you'll see what I mean - it greys out other options as and when you pick filter options. Great functionality but at how much computing cost I wonder..

I can't thank you enough for helping me out!

Kevin CrossChief Technology OfficerCommented:
Very good catch on the JOIN.
dolythgoeAuthor Commented:
Ahha so I've put together some answers from previous posts and came up with this (think Johann picked up on this one):

SELECT fi1.fid
FROM `filter_index` fi1 
JOIN `filter_index` fi2 ON (fi1.link_id=fi2.link_id)
JOIN `filter_index` fi3 ON (fi1.link_id=fi3.link_id)
JOIN `filter_index` fi4 ON (fi1.link_id=fi4.link_id)
JOIN `filter_index` fi5 ON (fi1.link_id=fi5.link_id)
WHERE fi2.fid IN (50, 25, 30)
AND fi3.fid IN (38)
AND fi4.fid IN (5)
AND fi5.fid IN (4)
GROUP BY fi1.fid

Open in new window

Where I'm joining for every group and can use IN() to do the ORs and the JOINs to do the ANDs..

This seems to work pretty fast - is that the holy grail I seek do you think?
Kevin CrossChief Technology OfficerCommented:
Seems so given your requirement and millions of rows. Very nice work, johanntagle!
Thanks mwvisa1!

dolythgoe - yes most of the time joins do better than subqueries provided the join columns are also indexed.
oh, and glad I could help! =)
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
MySQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.