Solved

How do I employ user variables to select x records for each value of a given column?

Posted on 2011-02-26
10
329 Views
Last Modified: 2012-05-11
I have a table: products (product_id, shop_id, date)

For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates
 
set @num := 0, @shop_id := 0;

select shop_id, date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where x.row_number <= 10;

Open in new window


For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. Each set of 10 rows contains the most recent dates for that shop_id, listed in descending order. This query works fine.

In addition to the date and shop_id, I would also like to select the product_id for each row. However this query doesn't work (MySQL hangs indefinitely):
 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


I believe my tenuous grasp of user variables is to blame, but I cannot figure out what I am doing wrong.
0
Comment
Question by:bitt3n
  • 6
  • 2
10 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 34989451
Not sure why your query is not working. Can you try like this?
set @num := 0, @shop_id := 0;

select p.product_id,x.shop_id, x.date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x 
join products p on x.shop_id = p.shop_id and s.date = p.date where x.row_number <= 10;

Open in new window

0
 

Author Comment

by:bitt3n
ID: 34989484
interesting, that works fine, and is just as fast.

what makes the syntax of that query work, but not the syntax of the one I tried?
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34989496
I used the derived table query just to find the row number and then joined that with your table again to get other data.
0
Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

 

Author Comment

by:bitt3n
ID: 34989517
yes I get how your query works, I'm just curious why that method is necessary to get the correct result, versus just using a WHERE as my non-functional query does. It seems like they should both work the same way.
0
 

Author Comment

by:bitt3n
ID: 34992120
I've been looking at this a bit more, and discovered that using a JOIN actually does not provide the expected result when a given shop_id/date combination appears in multiple rows.

For example, given two rows of (shop_id,product_id,date): (1,1,1/1/2011), (1,2,1/1/2011), the JOIN (matching on shop_id,date) will pair product_id 1 with both rows, and will pair product_id 2 with both rows. Thus for these 2 rows, the query returns 4 results.

It seems like it is necessary somehow to retrieve the product_ids from the subquery to make the query work as desired.
0
 

Author Comment

by:bitt3n
ID: 34992248
I have tried the problematic query

 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


on a very small table (10 rows) and it seems to execute properly, which narrows the question down to why this query causes a problem with a moderately large table (80k rows), while the original query does not.
0
 

Author Comment

by:bitt3n
ID: 34992738
update: the problematic query works as expected (and not noticeably slower than the original query without product_id) up to about 55K rows, but it fails to complete at 65K rows.

I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin (which I have been using to test these examples), but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.

 
Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10; 
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2

Open in new window


if I can figure out why this is happening and correct the problem, I will test in PHP. (I noticed if I remove the user variable initialization I don't get the error, but then the query doesn't run properly.)
0
 

Author Comment

by:bitt3n
ID: 34993145
ok I separated the SET and SELECT queries in separate calls to mysql_query() to solve the syntax problem, and discovered the query works fine when executed through PHP's mysql_query() with 80k rows.

Thus, unless I misunderstand something, the problem appears to be some quirk of phpMyAdmin that stops it from displaying results when the table contains more than ~65K rows.
0
 

Accepted Solution

by:
ee_auto earned 0 total points
ID: 37313427
Question PAQ'd and stored in the solution database.
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Foreword In the years since this article was written, numerous hacking attacks have targeted password-protected web sites.  The storage of client passwords has become a subject of much discussion, some of it useful and some of it misguided.  Of cou…
A lot of articles have been written on splitting mysqldump and grabbing the required tables. A long while back, when Shlomi (http://code.openark.org/blog/mysql/on-restoring-a-single-table-from-mysqldump) had suggested a “sed” way, I actually shell …
This Micro Tutorial will teach you how to censor certain areas of your screen. The example in this video will show a little boy's face being blurred. This will be demonstrated using Adobe Premiere Pro CS6.
Along with being a a promotional video for my three-day Annielytics Dashboard Seminor, this Micro Tutorial is an intro to Google Analytics API data.

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question