Solved

Get the Max value of each group in SQL

Posted on 2012-12-26
Last Modified: 2012-12-27
In SQL, how do I get the maximum value of each group?

In my query, I have a column named FileCount, which counts the number of records for each FileName.

My output could look like this:

FileName     FileCount

ABC          1
ABC          2
ABC          3
DEF          1

So, expected results are

FileName     FileCount

ABC          3
DEF          1

Below is my current query:


WITH CTE AS
(
    SELECT
        [FileName],
        [Trade Name],
        [Invoice Number],
        [Invoice Date],
        ROW_NUMBER() OVER (PARTITION BY [FileName] ORDER BY [FileName] ASC) AS FileCount
    FROM PlacedOrderDetails
    WHERE CreatedDate >= '12/21/2012'
)
SELECT
    [FileName],
    [Trade Name],
    [Invoice Number],
    [Invoice Date]
FROM CTE
Question by:chokka
19 Comments
 
LVL 12

Expert Comment

by:Jared_S
ID: 38722026
It looks like you're trying to count the file names. You can get there with this:

SELECT [FileName], count(*) as [FileCount]
FROM PlacedOrderDetails
WHERE CreatedDate >= '12/21/2012'
GROUP BY [FileName]
 

Author Comment

by:chokka
ID: 38722042
I need the max FileCount, grouped by FileName.
 

Author Comment

by:chokka
ID: 38722045
FileName     FileCount

ABC          1
ABC          2
ABC          3
DEF          1
GHI          1
GHI          2

So, expected results are

FileName     FileCount

ABC          3
DEF          1
GHI          2
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 38722047
 
LVL 22

Expert Comment

by:Steve Wales
ID: 38722052
To find the max value, this should do the trick

SELECT [FileName], Max(FileCount) as [FileCount]
FROM PlacedOrderDetails
WHERE CreatedDate >= '12/21/2012'
GROUP BY [FileName]
 

Author Comment

by:chokka
ID: 38722087
@sjwales

I tried your query on my very first attempt, before posting here.

I have around 8 to 9 columns.

GROUP BY expects all of those columns to be listed.

When I list them all, I no longer get the MAX(FileCount).
 
LVL 12

Expert Comment

by:Jared_S
ID: 38722099
chokka, did you try the code I posted?
It will work without the cte, and should give you the max file count (by way of counting all the files with the same name).

If I've misunderstood your problem, my apologies.
 
LVL 22

Expert Comment

by:Steve Wales
ID: 38722174
Is FileCount unique for each occurrence of FileName? If so, this might work:

Select b.othercol1, b.othercol2, b.othercol3, a.FileName, a.FileCount
from
(
SELECT [FileName], Max(FileCount) as [FileCount]
FROM PlacedOrderDetails
WHERE CreatedDate >= '12/21/2012'
GROUP BY [FileName]
) as a
join PlacedOrderDetails b on a.FileName = b.FileName and a.FileCount = b.FileCount

Even if it's not unique, you could change

Select b.othercol1, b.othercol2, b.othercol3, a.FileName, a.FileCount

to

Select DISTINCT b.othercol1, b.othercol2, b.othercol3, a.FileName, a.FileCount

Would that work?
 
LVL 69

Expert Comment

by:Qlemo
ID: 38722344
You are correct: you have to list every column that is not part of an aggregate (min, max, sum, avg, count) in GROUP BY.
That makes sense, because the DBMS cannot decide which single value to show for a column that is neither aggregated nor included in the GROUP BY. You can, for example, decide to use the minimum of each other column:
select FileName, min([Trade Name]), min([Invoice Number]), min([Invoice Date]), max(FileCount)
from PlacedOrderDetails
where CreatedDate >= '12/21/2012'
group by FileName

Or you may want the record with the highest FileCount for each FileName, and then the CTE would come into play.

So, you will need to define exactly what result you want to get, including the other columns.
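
To make the two choices concrete, here is a minimal sketch. It assumes a hypothetical table variable @orders that already has a FileCount column (in the original query FileCount is computed by ROW_NUMBER() inside the CTE, so @orders stands in for the CTE output). The sample rows are made up for illustration.

```sql
-- Hypothetical sample data; @orders stands in for the CTE output.
DECLARE @orders TABLE (
    [FileName]   varchar(50),
    [Trade Name] varchar(25),
    FileCount    int
);

INSERT INTO @orders ([FileName], [Trade Name], FileCount)
VALUES ('ABC', 'ENALAPRIL',  1),
       ('ABC', 'BENICAR',    2),
       ('ABC', 'SENNA PLUS', 3),
       ('DEF', 'ATRIPLA',    1);

-- Option 1: one row per FileName, every other column wrapped in an aggregate.
-- min([Trade Name]) need not come from the row that holds max(FileCount).
SELECT [FileName], min([Trade Name]) AS [Trade Name], max(FileCount) AS FileCount
FROM @orders
GROUP BY [FileName];

-- Option 2: the complete row that carries the highest FileCount per FileName.
WITH ranked AS (
    SELECT [FileName], [Trade Name], FileCount,
           ROW_NUMBER() OVER (PARTITION BY [FileName] ORDER BY FileCount DESC) AS rn
    FROM @orders
)
SELECT [FileName], [Trade Name], FileCount
FROM ranked
WHERE rn = 1;
```

For file ABC, option 1 pairs BENICAR with a FileCount of 3 even though those values come from different rows, while option 2 returns the SENNA PLUS row, which is the one that actually has FileCount 3.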
 

Author Comment

by:chokka
ID: 38723762
To all the experts, regarding your query suggestions:

GROUP BY expects every column that I mention in the SELECT list.

When I provide all the columns in the GROUP BY, I am not able to get the expected value!
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 38723820
This will do (as per my article), based on your code:

With CTE
As
(
select
    [FileName],
    [Trade Name],
    [Invoice Number],
    [Invoice Date],
    ROW_NUMBER() OVER(PARTITION BY [FileName] ORDER BY [Invoice Date] ASC) AS FileCount,
    ROW_NUMBER() OVER(PARTITION BY [FileName] ORDER BY [Invoice Date] DESC) AS rn
from PlacedOrderDetails
where CreatedDate >= '12/21/2012'
)
select
    [FileName],
    [Trade Name],
    [Invoice Number],
    [Invoice Date]
From CTE
WHERE rn = 1
 
LVL 32

Expert Comment

by:awking00
ID: 38723849
Can you provide some sample data and the expected output? I suspect the solution will utilize a window function such as row_number() as you have shown in your cte expression, but such functions also include count() and max() which might better provide a solution.
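
As a rough sketch of that idea, a windowed count() returns the group total on every row without a GROUP BY, so the other columns stay available. The table variable @files and its rows are hypothetical sample data, not the asker's table.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @files TABLE (
    [FileName]   varchar(50),
    [Trade Name] varchar(25)
);

INSERT INTO @files ([FileName], [Trade Name])
VALUES ('ABC', 'ENALAPRIL'),
       ('ABC', 'BENICAR'),
       ('ABC', 'SENNA PLUS'),
       ('DEF', 'ATRIPLA');

-- count(*) OVER (PARTITION BY ...) repeats the per-file total on each row;
-- no GROUP BY is needed, so [Trade Name] can be selected as-is.
SELECT [FileName],
       [Trade Name],
       count(*) OVER (PARTITION BY [FileName]) AS FileCount
FROM @files;
```

Every ABC row comes back with FileCount 3 and the DEF row with FileCount 1. Reducing that to one row per file is then a separate step, for example with a row_number() filter.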
 

Author Comment

by:chokka
ID: 38723913
@angelIII
 

select
                        [FileName],
                        [Trade Name],
                        [Invoice Number],
                        [Invoice Date]
                       
From                  CTE


FileCount is missing from this select.
When I included it, FileCount came back as 1.


We need MAX(FileCount).
 
LVL 12

Expert Comment

by:Jared_S
ID: 38724084
I don't believe what you're trying to do is uncommon or complicated. There seems to be some communication trouble.

Please try this and see if it is any closer to your desired results.

SELECT
[FileName],
[Trade Name],
[Invoice Number],
[Invoice Date],
(SELECT count(*) FROM PlacedOrderDetails O WHERE P.[FileName] = O.[FileName]) as [FileCount]                      
FROM  PlacedOrderDetails P
WHERE  CreatedDate >= '12/21/2012'
 

Author Comment

by:chokka
ID: 38724162
Thank you for helping.

It sounds simple, but it is complicated.

I have attached a spreadsheet with sample data and the expected output.

I hope this helps the experts come up with new query logic!
ExpectedOutput.xls
 
LVL 12

Assisted Solution

by:Jared_S
Jared_S earned 250 total points
ID: 38724327
Using your sample data to create a table variable called @temp:

declare @temp as table (FileName varchar(50), [Trade Name] varchar(25), [Invoice Number] varchar(10), [Invoice Date] datetime, FileCount int)
insert into @temp values ('A176998  12-11-2012.csv', 'ENALAPRIL', '176998', '11/01/2012', 1)
insert into @temp values ('A176998  12-11-2012.csv', 'SENNA PLUS', '176998', '11/01/2012', 2)
insert into @temp values ('B176999  12-11-2012.csv', 'SENNA PLUS', '176999', '11/15/2012', 1)
insert into @temp values ('C5043379 11-29-2012.csv', 'ADVAIR DISKUS', '5043379', '12/02/2012', 1)
insert into @temp values ('C5043379 11-29-2012.csv', 'ASMANEX 30INHL', '5043379', '12/02/2012', 2)
insert into @temp values ('C5043379 11-29-2012.csv', 'ATRIPLA', '5043379', '12/02/2012', 3)
insert into @temp values ('C5043379 11-29-2012.csv', 'BENICAR', '5043379', '12/02/2012', 4)


Either of these queries works:
SELECT
DISTINCT FileName, [Invoice Number], [Invoice Date], 
(SELECT max([FileCount]) FROM @temp b WHERE b.FileName = a.FileName) as FileCount
FROM @temp a

SELECT FileName, [Invoice Number], [Invoice Date], max([FileCount]) as FileCount
FROM @temp
GROUP BY FileName, [Invoice Number], [Invoice Date]


You can substitute your CTE name and it should work.
Unless there is some reason to use a CTE that isn't discussed here, you can accomplish your goal more efficiently without it by just querying the PlacedOrderDetails table directly.

I believe that syntax would be
SELECT
DISTINCT [FileName],
[Trade Name],
[Invoice Number],
[Invoice Date],
(SELECT count(*) FROM PlacedOrderDetails O WHERE P.[FileName] = O.[FileName]) as [FileCount]                       
FROM  PlacedOrderDetails P
WHERE  CreatedDate >= '12/21/2012'

 
LVL 32

Accepted Solution

by:
awking00 earned 250 total points
ID: 38724368
with cte as
(SELECT
 [FileName],
 [Trade Name],
 [Invoice Number],
 [Invoice Date],
 count([FileName]) over (partition by [FileName] order by [FileName]) as [FileCount],
 row_number() over (partition by [FileName], [Invoice Date] order by [Trade Name] desc) as rn
 FROM PlacedOrderDetails)
SELECT
 [FileName],
 [Trade Name],
 [Invoice Number],
 [Invoice Date],
 [FileCount]
from cte
where rn = 1;
 
LVL 32

Expert Comment

by:awking00
ID: 38724379
I omitted the CreatedDate since your example did not show it and none of the invoice dates were greater than or equal to 12/21/2012, but it can easily be included in the common table expression.
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 38724561
I am sure my code works. Inside the CTE, I have two ROW_NUMBER() calls, one with ORDER BY [Invoice Date] ASC and one with DESC.
Filtering on rn = 1 returns the row where FileCount is at its maximum.
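
One caveat worth checking: when several rows in a file share the same [Invoice Date], the two ROW_NUMBER() orderings are not guaranteed to mirror each other, so the rn = 1 row may not carry the highest FileCount. A sketch of one way around that, assuming a unique tiebreaker column is available (here [Trade Name], on hypothetical sample data in a table variable @invoices):

```sql
-- Hypothetical sample data for illustration only.
DECLARE @invoices TABLE (
    [FileName]     varchar(50),
    [Trade Name]   varchar(25),
    [Invoice Date] datetime
);

-- Both rows share one invoice date, so the date alone cannot order them.
INSERT INTO @invoices ([FileName], [Trade Name], [Invoice Date])
VALUES ('A176998', 'ENALAPRIL',  '20121101'),
       ('A176998', 'SENNA PLUS', '20121101');

WITH CTE AS (
    SELECT [FileName], [Trade Name], [Invoice Date],
           -- [Trade Name] breaks ties, so the ASC and DESC numberings are exact mirrors.
           ROW_NUMBER() OVER (PARTITION BY [FileName]
                              ORDER BY [Invoice Date] ASC,  [Trade Name] ASC)  AS FileCount,
           ROW_NUMBER() OVER (PARTITION BY [FileName]
                              ORDER BY [Invoice Date] DESC, [Trade Name] DESC) AS rn
    FROM @invoices
)
SELECT [FileName], [Trade Name], [Invoice Date], FileCount
FROM CTE
WHERE rn = 1;
```

With the tiebreaker in place, the single row returned for A176998 is SENNA PLUS with FileCount 2.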
Question has a verified solution.
