Solved

How do I employ user variables to select x records for each value of a given column?

Posted on 2011-02-26
10
327 Views
Last Modified: 2012-05-11
I have a table: products (product_id, shop_id, date)

For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates
 
set @num := 0, @shop_id := 0;

select shop_id, date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where x.row_number <= 10;

Open in new window


For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. Each set of 10 rows contains the most recent dates for that shop_id, listed in descending order. This query works fine.

In addition to the date and shop_id, I would also like to select the product_id for each row. However this query doesn't work (MySQL hangs indefinitely):
 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


I believe my tenuous grasp of user variables is to blame, but I cannot figure out what I am doing wrong.
0
Comment
Question by:bitt3n
  • 6
  • 2
10 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 34989451
Not sure why your query is not working. Can you try like this?
set @num := 0, @shop_id := 0;

select p.product_id,x.shop_id, x.date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x 
join products p on x.shop_id = p.shop_id and s.date = p.date where x.row_number <= 10;

Open in new window

0
 

Author Comment

by:bitt3n
ID: 34989484
interesting, that works fine, and is just as fast.

what makes the syntax of that query work, but not the syntax of the one I tried?
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34989496
I used the derived table query just to find the row number and then joined that with your table again to get other data.
0
 

Author Comment

by:bitt3n
ID: 34989517
yes I get how your query works, I'm just curious why that method is necessary to get the correct result, versus just using a WHERE as my non-functional query does. It seems like they should both work the same way.
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 

Author Comment

by:bitt3n
ID: 34992120
I've been looking at this a bit more, and discovered that using a JOIN actually does not provide the expected result when a given shop_id/date combination appears in multiple rows.

For example, given two rows of (shop_id,product_id,date): (1,1,1/1/2011), (1,2,1/1/2011), the JOIN (matching on shop_id,date) will pair product_id 1 with both rows, and will pair product_id 2 with both rows. Thus for these 2 rows, the query returns 4 results.

It seems like it is necessary somehow to retrieve the product_ids from the subquery to make the query work as desired.
0
 

Author Comment

by:bitt3n
ID: 34992248
I have tried the problematic query

 
set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

Open in new window


on a very small table (10 rows) and it seems to execute properly, which narrows the question down to why this query causes a problem with a moderately large table (80k rows), while the original query does not.
0
 

Author Comment

by:bitt3n
ID: 34992738
update: the problematic query works as expected (and not noticeably slower than the original query without product_id) up to about 55K rows, but it fails to complete at 65K rows.

I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin (which I have been using to test these examples), but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.

 
Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10; 
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2

Open in new window


if I can figure out why this is happening and correct the problem, I will test in PHP. (I noticed if I remove the user variable initialization I don't get the error, but then the query doesn't run properly.)
0
 

Author Comment

by:bitt3n
ID: 34993145
ok I separated the SET and SELECT queries in separate calls to mysql_query() to solve the syntax problem, and discovered the query works fine when executed through PHP's mysql_query() with 80k rows.

Thus, unless I misunderstand something, the problem appears to be some quirk of phpMyAdmin that stops it from displaying results when the table contains more than ~65K rows.
0
 

Accepted Solution

by:
ee_auto earned 0 total points
ID: 37313427
Question PAQ'd and stored in the solution database.
0

Featured Post

Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

Join & Write a Comment

Loading csv or delimited data files to MySQL database is a very common task frequently questioned about and almost every time LOAD DATA INFILE comes to the rescue. Here we will try to understand some of the very common scenarios for loading data …
Foreword This is an old article.  Instead of using the MySQL extension that was used in the original code examples, please choose one of the currently supported database extensions instead.  More information is available here: MySQLi / PDO (http://…
When you create an app prototype with Adobe XD, you can insert system screens -- sharing or Control Center, for example -- with just a few clicks. This video shows you how. You can take the full course on Experts Exchange at http://bit.ly/XDcourse.
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now