bitt3n
asked on
How do I employ user variables to select x records for each value of a given column?
I have a table: products (product_id, shop_id, date)
For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates
For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. Each set of 10 rows contains the most recent dates for that shop_id, listed in descending order. This query works fine.
In addition to the date and shop_id, I would also like to select the product_id for each row. However this query doesn't work (MySQL hangs indefinitely):
I believe my tenuous grasp of user variables is to blame, but I cannot figure out what I am doing wrong.
For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates
set @num := 0, @shop_id := 0;
select shop_id, date
from (
select shop_id, date,
@num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
@shop_id := shop_id as dummy
from products
order by shop_id, date DESC
) as x where x.row_number <= 10;
For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. Each set of 10 rows contains the most recent dates for that shop_id, listed in descending order. This query works fine.
In addition to the date and shop_id, I would also like to select the product_id for each row. However this query doesn't work (MySQL hangs indefinitely):
set @num := 0, @shop_id := NULL;
select product_id, shop_id, date
from (
select product_id, shop_id, date,
@num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
@shop_id := shop_id as dummy
from products
order by shop_id, date DESC
) as x where row_number <= 10;
I believe my tenuous grasp of user variables is to blame, but I cannot figure out what I am doing wrong.
ASKER
interesting, that works fine, and is just as fast.
what makes the syntax of that query work, but not the syntax of the one I tried?
what makes the syntax of that query work, but not the syntax of the one I tried?
I used the derived table query just to find the row number and then joined that with your table again to get other data.
ASKER
yes I get how your query works, I'm just curious why that method is necessary to get the correct result, versus just using a WHERE as my non-functional query does. It seems like they should both work the same way.
ASKER
I've been looking at this a bit more, and discovered that using a JOIN actually does not provide the expected result when a given shop_id/date combination appears in multiple rows.
For example, given two rows of (shop_id,product_id,date): (1,1,1/1/2011), (1,2,1/1/2011), the JOIN (matching on shop_id,date) will pair product_id 1 with both rows, and will pair product_id 2 with both rows. Thus for these 2 rows, the query returns 4 results.
It seems like it is necessary somehow to retrieve the product_ids from the subquery to make the query work as desired.
For example, given two rows of (shop_id,product_id,date):
It seems like it is necessary somehow to retrieve the product_ids from the subquery to make the query work as desired.
ASKER
I have tried the problematic query
on a very small table (10 rows) and it seems to execute properly, which narrows the question down to why this query causes a problem with a moderately large table (80k rows), while the original query does not.
set @num := 0, @shop_id := NULL;
select product_id, shop_id, date
from (
select product_id, shop_id, date,
@num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
@shop_id := shop_id as dummy
from products
order by shop_id, date DESC
) as x where row_number <= 10;
on a very small table (10 rows) and it seems to execute properly, which narrows the question down to why this query causes a problem with a moderately large table (80k rows), while the original query does not.
ASKER
update: the problematic query works as expected (and not noticeably slower than the original query without product_id) up to about 55K rows, but it fails to complete at 65K rows.
I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin (which I have been using to test these examples), but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.
if I can figure out why this is happening and correct the problem, I will test in PHP. (I noticed if I remove the user variable initialization I don't get the error, but then the query doesn't run properly.)
I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin (which I have been using to test these examples), but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.
Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10;
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2
if I can figure out why this is happening and correct the problem, I will test in PHP. (I noticed if I remove the user variable initialization I don't get the error, but then the query doesn't run properly.)
ASKER
ok I separated the SET and SELECT queries in separate calls to mysql_query() to solve the syntax problem, and discovered the query works fine when executed through PHP's mysql_query() with 80k rows.
Thus, unless I misunderstand something, the problem appears to be some quirk of phpMyAdmin that stops it from displaying results when the table contains more than ~65K rows.
Thus, unless I misunderstand something, the problem appears to be some quirk of phpMyAdmin that stops it from displaying results when the table contains more than ~65K rows.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Open in new window