trippy1976
asked on
MYSQL Query with subquery is horribly slow
In my database I have three tables in play for this scenario:
"things" contains about 13,000 rows of things including these columns:
- thingid
- ownerid (reference to who owns that thing in the users table)
- name (name of thing)
- created (datetime for when the thing was made)
"users" contains about 65,000 rows for users including:
- id (id of user)
- screenname (nickname for user)
"collections" contains about 840,000 rows for "things" that have been collected
- ownerid (who owns this collected thing - DIFFERENT than who owns the thing)
- thingid (reference to entry in things table)
"wishlist" contains about 9,000 rows for wishlist items. These are lists of things people want but do not have.
- thingid
This may get convoluted, but let's take a use case:
- You sign up on the site as user #1 with username "joe"
- You create a "thing" which gets one single entry in the thing table with thingid T1
- You start giving these things away (which you have many of)
- I am user #2
- You gave me one of your things
- I added an entry for your thing to my collection
- This created a row in the "collections" table with ownerid = 2 and thingid = T1
- User #3 wants the thing made by user #1 but doesn't have one
- User #3 creates a wishlist entry
- This creates a row in the wishlist table for thingid T1
Now, as user #2 - I want to view my collection. I also want to see if someone had one of my items on their wishlist.
That is what the query below is intended to do. It selects all the items from my collection (remember, I'm user #2) and grabs the thing name, thing created, and user screen name of the thing's owner (that's you, "joe")
I also want to know for each item in user #2's collection, is there at least one entry for it in the wishlist table. So I'm doing a subquery.
WITH the subquery, it's taking 30+ seconds to return this.
If I take out the subquery from the select, there is no measurable difference.
If I take out the subquery from the "where" statement, it runs about 16 seconds
If I take out both subqueries it runs in a fraction of a second.
How can I accomplish what I am trying to do most effectively? Is there some kind of join that would work better? I tried doing something like "where InAWishlist > 0" in the where clause, attempting to use the results of the first subquery like a column test but I get an unknown column error.
"things" contains about 13,000 rows of things including these columns:
- thingid
- ownerid (reference to who owns that thing in the users table)
- name (name of thing)
- created (datetime for when the thing was made)
"users" contains about 65,000 rows for users including:
- id (id of user)
- screenname (nickname for user)
"collections" contains about 840,000 rows for "things" that have been collected
- ownerid (who owns this collected thing - DIFFERENT than who owns the thing)
- thingid (reference to entry in things table)
"wishlist" contains about 9,000 rows for wishlist items. These are lists of things people want but do not have.
- thingid
This may get convoluted, but let's take a use case:
- You sign up on the site as user #1 with username "joe"
- You create a "thing" which gets one single entry in the thing table with thingid T1
- You start giving these things away (which you have many of)
- I am user #2
- You gave me one of your things
- I added an entry for your thing to my collection
- This created a row in the "collections" table with ownerid = 2 and thingid = T1
- User #3 wants the thing made by user #1 but doesn't have one
- User #3 creates a wishlist entry
- This creates a row in the wishlist table for thingid T1
Now, as user #2 - I want to view my collection. I also want to see if someone had one of my items on their wishlist.
That is what the query below is intended to do. It selects all the items from my collection (remember, I'm user #2) and grabs the thing name, thing created, and user screen name of the thing's owner (that's you, "joe")
I also want to know for each item in user #2's collection, is there at least one entry for it in the wishlist table. So I'm doing a subquery.
WITH the subquery, it's taking 30+ seconds to return this.
If I take out the subquery from the select, there is no measurable difference.
If I take out the subquery from the "where" statement, it runs about 16 seconds
If I take out both subqueries it runs in a fraction of a second.
How can I accomplish what I am trying to do most effectively? Is there some kind of join that would work better? I tried doing something like "where InAWishlist > 0" in the where clause, attempting to use the results of the first subquery like a column test but I get an unknown column error.
SELECT
c.*, t.ownerid, t.name, t.created, u.screenname,
IF((SELECT COUNT(*) FROM wishlist AS W WHERE W.thingid=c.thingid)>0,1,0) AS InAWishlist
FROM
users u, collections c, things t
WHERE
c.ownerid = '2' AND
(SELECT COUNT(*) FROM wishlist AS W WHERE W.thingid=c.thingid) > 0 AND
t.id = c.thingid AND
u.id = t.ownerid
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
One other thing you can do is eliminate the If statement in your query because it will always evaluate to a 1 because of the select count(*).... condition you put on the query in the where clause.
And, as Ray said, create indexes on all of the id columns you are using in your where clause.
I had a query one time that "never" completed (I couldn't wait long enough) that completed in less than a second once I applied the proper indexes.
And, as Ray said, create indexes on all of the id columns you are using in your where clause.
I had a query one time that "never" completed (I couldn't wait long enough) that completed in less than a second once I applied the proper indexes.
ASKER
Ray, you seem to post everywhere and always have great info.
At first I was like "c'mon guys, give me some credit, of COURSE I have indexes on all my tables."
Hrm. Well, I thought I did. I added indexes for all the ID stuff and holy camoly. That certainly addresses the performance issue to my full and complete satisfaction.
I'll study the other ideas here too, I'm sure they would help but the query is now under one second. That's crazy.
Many thanks to all
At first I was like "c'mon guys, give me some credit, of COURSE I have indexes on all my tables."
Hrm. Well, I thought I did. I added indexes for all the ID stuff and holy camoly. That certainly addresses the performance issue to my full and complete satisfaction.
I'll study the other ideas here too, I'm sure they would help but the query is now under one second. That's crazy.
Many thanks to all
Yeah, I've missed an index before, so I recognize the symptoms! I'm not really everywhere - I only look at questions in the PHP Zone and a few others when I get a Neglected Question Alert.
Thanks for the points and for your kind words, ~Ray
Thanks for the points and for your kind words, ~Ray
Open in new window
instead of
Open in new window
as it doesn't have to count all rows, butstops on first finding.