Link to home
Start Free TrialLog in
Avatar of bitt3n
bitt3n

asked on

How do I employ user variables to select x records for each value of a given column?

I have a table: products (product_id, shop_id, date)

For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates
 
set @num := 0, @shop_id := 0;

select shop_id, date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where x.row_number <= 10;

Open in new window


For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. Each set of 10 rows contains the most recent dates for that shop_id, listed in descending order. This query works fine.

In addition to the date and shop_id, I would also like to select the product_id for each row. However this query doesn't work (MySQL hangs indefinitely):
 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


I believe my tenuous grasp of user variables is to blame, but I cannot figure out what I am doing wrong.
Avatar of Sharath S
Sharath S
Flag of United States of America image

Not sure why your query is not working. Can you try like this?
set @num := 0, @shop_id := 0;

select p.product_id,x.shop_id, x.date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x 
join products p on x.shop_id = p.shop_id and s.date = p.date where x.row_number <= 10;

Open in new window

Avatar of bitt3n
bitt3n

ASKER

interesting, that works fine, and is just as fast.

what makes the syntax of that query work, but not the syntax of the one I tried?
I used the derived table query just to find the row number and then joined that with your table again to get other data.
Avatar of bitt3n

ASKER

yes I get how your query works, I'm just curious why that method is necessary to get the correct result, versus just using a WHERE as my non-functional query does. It seems like they should both work the same way.
Avatar of bitt3n

ASKER

I've been looking at this a bit more, and discovered that using a JOIN actually does not provide the expected result when a given shop_id/date combination appears in multiple rows.

For example, given two rows of (shop_id,product_id,date): (1,1,1/1/2011), (1,2,1/1/2011), the JOIN (matching on shop_id,date) will pair product_id 1 with both rows, and will pair product_id 2 with both rows. Thus for these 2 rows, the query returns 4 results.

It seems like it is necessary somehow to retrieve the product_ids from the subquery to make the query work as desired.
Avatar of bitt3n

ASKER

I have tried the problematic query

 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


on a very small table (10 rows) and it seems to execute properly, which narrows the question down to why this query causes a problem with a moderately large table (80k rows), while the original query does not.
Avatar of bitt3n

ASKER

update: the problematic query works as expected (and not noticeably slower than the original query without product_id) up to about 55K rows, but it fails to complete at 65K rows.

I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin (which I have been using to test these examples), but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.

 
Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10; 
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2

Open in new window


if I can figure out why this is happening and correct the problem, I will test in PHP. (I noticed if I remove the user variable initialization I don't get the error, but then the query doesn't run properly.)
Avatar of bitt3n

ASKER

ok I separated the SET and SELECT queries in separate calls to mysql_query() to solve the syntax problem, and discovered the query works fine when executed through PHP's mysql_query() with 80k rows.

Thus, unless I misunderstand something, the problem appears to be some quirk of phpMyAdmin that stops it from displaying results when the table contains more than ~65K rows.
ASKER CERTIFIED SOLUTION
Avatar of ee_auto
ee_auto

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial