Link to home
Start Free TrialLog in
Avatar of trippy1976
trippy1976Flag for United States of America

asked on

MYSQL Query with subquery is horribly slow

In my database I have three tables in play for this scenario:

"things" contains about 13,000 rows of things including these columns:
- thingid
- ownerid (reference to who owns that thing in the users table)
- name (name of thing)
- created (datetime for when the thing was made)

"users" contains about 65,000 rows for users including:
- id (id of user)
- screenname (nickname for user)

"collections" contains about 840,000 rows for "things" that have been collected
- ownerid (who owns this collected thing - DIFFERENT than who owns the thing)
- thingid (reference to entry in things table)

"wishlist" contains about 9,000 rows for wishlist items.  These are lists of things people want but do not have.
- thingid

This may get convoluted, but let's take a use case:

- You sign up on the site as user #1 with username "joe"
- You create a "thing" which gets one single entry in the thing table with thingid T1
- You start giving these things away (which you have many of)

- I am user #2
- You gave me one of your things
- I added an entry for your thing to my collection
- This created a row in the "collections" table with ownerid = 2 and thingid = T1

- User #3 wants the thing made by user #1 but doesn't have one
- User #3 creates a wishlist entry
- This creates a row in the wishlist table for thingid T1

Now, as user #2 - I want to view my collection.  I also want to see if someone had one of my items on their wishlist.  

That is what the query below is intended to do.  It selects all the items from my collection (remember, I'm user #2) and grabs the thing name, thing created, and user screen name of the thing's owner (that's you, "joe")

I also want to know for each item in user #2's collection, is there at least one entry for it in the wishlist table.  So I'm doing a subquery.

WITH the subquery, it's taking 30+ seconds to return this.

If I take out the subquery from the select, there is no measurable difference.  

If I take out the subquery from the "where" statement, it runs about 16 seconds

If I take out both subqueries it runs in a fraction of a second.

How can I accomplish what I am trying to do most effectively?  Is there some kind of join that would work better?  I tried doing something like "where InAWishlist > 0" in the where clause, attempting to use the results of the first subquery like a column test but I get an unknown column error.

SELECT 
	c.*, t.ownerid, t.name, t.created, u.screenname,
	IF((SELECT COUNT(*) FROM wishlist AS W WHERE W.thingid=c.thingid)>0,1,0) AS InAWishlist 
FROM 
	users u, collections c, things t 
WHERE 
	c.ownerid = '2' AND 
	(SELECT COUNT(*) FROM wishlist AS W WHERE W.thingid=c.thingid) > 0 AND 
	t.id = c.thingid AND 
	u.id = t.ownerid 

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of DerZauberer
DerZauberer

Dunno if that's same for mysql, but in other sql it's better to use something like

EXISTS (SELECT 1 as dummy FROM wl AS W WHERE W.id=c.id)

Open in new window


instead of
(SELECT COUNT(*) ... )>0

Open in new window

as it doesn't have to count all rows, but
stops on first finding.
One other thing you can do is eliminate the If statement in your query because it will always evaluate to a 1 because of the select count(*).... condition you put on the query in the where clause.

And, as Ray said, create indexes on all of the id columns you are using in your where clause.

I had a query one time that "never" completed (I couldn't wait long enough) that completed in less than a second once I applied the proper indexes.
Avatar of trippy1976

ASKER

Ray, you seem to post everywhere and always have great info.

At first I was like "c'mon guys, give me some credit, of COURSE I have indexes on all my tables."

Hrm.  Well, I thought I did.  I added indexes for all the ID stuff and holy camoly.  That certainly addresses the performance issue to my full and complete satisfaction.

I'll study the other ideas here too, I'm sure they would help but the query is now under one second.  That's crazy.

Many thanks to all
Yeah, I've missed an index before, so I recognize the symptoms!  I'm not really everywhere - I only look at questions in the PHP Zone and a few others when I get a Neglected Question Alert.

Thanks for the points and for your kind words, ~Ray