Solved

controlling output from a group by clause

Posted on 2011-02-15
18
857 Views
Last Modified: 2012-05-11
Hi

If i have a table with columns A,B,C,D and run this query

select (*) from table group by A,B

I believe mysql gives random values for the columns C and D from any member of the group. Is there anyway I can control that. For example if i have

select max(C) , D from table group by A,B

can you get the value for column D anyhow from the row (or any of the rows) in the group that have the maximum value for column C?

thanks
0
Comment
Question by:andieje
  • 11
  • 7
18 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 34902296
In such case, you need to try joining like this.

select t1.*
  from table1 as t1
  join (select A,B,MAX(C) as C from table1 group by A,B) as t2
    on t1.A = t2.A and t1.B = t2.B and t1.C = t2.C

Open in new window

0
 

Author Comment

by:andieje
ID: 34902539
if i concatenated one of the fields in the subquery, say column E, how would i get that back in the query results?
0
 

Author Comment

by:andieje
ID: 34902565
also this could get me multiple rows per group if multiple rows in the group had the max value. If i only wanted one value per group would i have to group the outer query as well?
0
Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

 
LVL 40

Expert Comment

by:Sharath
ID: 34902621
Can you post some sample data on what are you trying to accomplish?
0
 

Author Comment

by:andieje
ID: 34903060
this is made up data as i can't post real data but I'm sure you'll get the point

table

   A      B       C       D        E
   1       1       10      d1      e1
   1       1        20      d2      e2
   2       2        20    d3      e3
   2       2        30     d4     e4
   2        2        30     d5    e5

If we group by col A and B we get 2 groups containing rows (1,2) and rows(3,4,5)
In the first group the max value for C is 20 and only one row has this value
In the second group the max value for C is 30 and two rows have this value

For group 1 i want to retrieve [in the format column(value) ]

A(1)  B(1)   C(20) D(d2) E(e2)  <===for cols D and E i want the values for the row with max(C) which is row 2

For the second group i only want to retrive one row and it doesnt matter which row out of row 4 or 5 the values for D or E come from as both rows have the max value for C
A(2)   B(2)    C(30)   D(d4 or d5)  E(e4 or e5)
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34903771
You can try like this.
select A,B,max(C) as C,D,E from table1 group by A,B;

Open in new window

0
 
LVL 40

Accepted Solution

by:
Sharath earned 500 total points
ID: 34903782
ignore previous post. try this.
select * 
  from table1 as t1 
 where C = (select max(C) from table1 as t2 where t1.A = t2.A and t1.B = t2.B)
 group by A,B;

Open in new window

0
 

Author Comment

by:andieje
ID: 34906043
that doesn't give me the right answer. I get less groups than i should. Isn't that because in the outer query the value for C is selected randomly from the rows in the group, so you might fail the equality test
0
 

Author Comment

by:andieje
ID: 34906060
there are 90,000 groups when i group by A and B but that query only returns 60,000
0
 

Author Comment

by:andieje
ID: 34906173
this is very strange:

select count(*) from table t1 where C = (select max(C) from table t2 where t1.A = t2.A and t1.B = t2.B)

The results is 70,000. How can there be less rows than the number of groups? At least one item in the group has to have the minimum value for it to be calculated?
0
 

Author Comment

by:andieje
ID: 34906368
Please ignore the last comment as I must have done something wrong. The query in the last comment does return more rows than the number of groups but this query returns less rows than the number of groups
 
select * 
  from table1 as t1 
 where C = (select max(C) from table1 as t2 where t1.A = t2.A and t1.B = t2.B)
 group by A,B;

Open in new window


This query also gives less than the number of groups

select * from table t1 inner join
(select A, B, max(C) as C from table 1 
group by A,B) as t2 on
t1.A = t2.A and
t1.B = t2.B and 
t1.C = t2.C

Open in new window


Both queries return the same figure
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34912559
Can you run these two queries and post the row count.
select count(*) as cnt
  from table1 as t1 
 where C = (select max(C) from table1 as t2 where t1.A = t2.A and t1.B = t2.B)
 group by A,B;

Open in new window

select count(*) as cnt
  from table1 as t1 
 group by A,B;

Open in new window

0
 

Author Comment

by:andieje
ID: 34947378
The first query returns one row per group (with varying counts per group) but the total number of rows is 62744. Both queries in my comment ID: 34906368 return this figure

The second query returns one row per group (with varying counts per group) but the total number of rows returned is 90428

Sorry for the delay in replying.
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34947459
If these two queries are giving you the same count, then my proposed query would be the correct one.
Query1:
 SELECT COUNT(* ) 
   FROM (SELECT DISTINCT A,B 
           FROM table1) AS t1; 

Open in new window

Query2:
 SELECT COUNT(* ) 
   FROM (  SELECT DISTINCT A,B 
             FROM table1 AS t1 
            WHERE C = (SELECT MAX(C) 
                         FROM table1 AS t2 
                        WHERE t1.A = t2.A 
                              AND t1.B = t2.B) 
        GROUP BY A,B) AS t1;

Open in new window

0
 

Author Comment

by:andieje
ID: 34947551
the first query returns 90428 records as you would expect

the second query returns 62744 still
0
 

Author Comment

by:andieje
ID: 34947581
i think i have found the problem - sometimes B is null

I only want 62744 groups (when B is not null) so you were right from the beginning!
0
 

Author Closing Comment

by:andieje
ID: 34947582
thanks
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34947690
Didn't realise that you have NULLs as I do not have control on your data. Finally you got it. Glad I could help.
0

Featured Post

Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
MySQL ERROR 1045 (28000) 2 91
updating table data with inner join 9 45
MySQL Memory Keeps Increasing 4 47
How do uses indexes to maximize MySQL Searches 14 55
Fore-Foreword Today (2016) Maxmind has a new approach to the distribution of its data sets.  This article may be obsolete.  Instead of using the examples here, have a look at the MaxMind API (https://www.maxmind.com/en/geolite2-developer-package). …
More Fun with XML and MySQL – Parsing Delimited String with a Single SQL Statement Are you ready for another of my SQL tidbits?  Hopefully so, as in this adventure, I will be covering a topic that comes up a lot which is parsing a comma (or other…
A short tutorial showing how to set up an email signature in Outlook on the Web (previously known as OWA). For free email signatures designs, visit https://www.mail-signatures.com/articles/signature-templates/?sts=6651 If you want to manage em…
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

808 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question