Solved

Help on understanding grouping in windowing functions

Posted on 2012-08-13
Last Modified: 2012-08-14
Hi Experts,

I'm new to windowing functions and I'm having trouble "grouping" variables the way I need them.

For example, with the query below, I want to see who the top vendors billed by region are this year.

The problem is that many vendors repeat across rows instead of aggregating, so for Region A I'll see:

Vendor X shows up billed $139, then a few rows down $1599. Is there a way to force Vendor X to summarize?

VENDOR  REGION  TOTAL_PAY_AMT  SalesCnt  SalesTtl     SalesAvg  SalesPct
x       1       139            123281    53309440.78  432.42    0.01
x       1       1599           123281    53309440.78  432.42    0.01


DECLARE @YR VARCHAR(4)
SET @YR = '2012'

SELECT DISTINCT
  VENDOR,
  REGION,
  TOTAL_PAY_AMT,
  COUNT(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesCnt,
  SUM(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesTtl,
  AVG(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesAvg,
  TOTAL_PAY_AMT / SUM(TOTAL_PAY_AMT)
    OVER(PARTITION BY REGION) AS SalesPct
FROM
  MASTER_CLAIM
WHERE
  (REGION IS NOT NULL) AND (DATEPART(YEAR, INV_DT) = @YR)
   AND (TOTAL_PAY_AMT > 0) ;
Question by:britpopfan74
13 Comments
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38288154
You can make your query a subquery:



DECLARE @YR VARCHAR(4)
SET @YR = '2012'


select
  VENDOR,
  REGION,
  sum(TOTAL_PAY_AMT) as total


from (


SELECT DISTINCT
  VENDOR,
  REGION,
  TOTAL_PAY_AMT,
  COUNT(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesCnt,
  SUM(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesTtl,
  AVG(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesAvg,
  TOTAL_PAY_AMT / SUM(TOTAL_PAY_AMT)
    OVER(PARTITION BY REGION) AS SalesPct
FROM
  MASTER_CLAIM
WHERE
  (REGION IS NOT NULL) AND (DATEPART(YEAR, INV_DT) = @YR)
   AND (TOTAL_PAY_AMT > 0) ;


)

GROUP BY VENDOR,
  REGION
 

Author Comment

by:britpopfan74
ID: 38288192
Great, but the semicolon is causing an error where it is; I'm trying to figure out where to put it.
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38288271
Didn't notice that.

Remove it altogether; it's SQL, not C++.
 

Author Comment

by:britpopfan74
ID: 38288434
I keep getting: Incorrect syntax near the keyword 'GROUP'
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38288765
OK, try this:

select   VENDOR,  REGION,  sum(TOTAL_PAY_AMT) as total
from (           --Your old code

SELECT DISTINCT VENDOR, REGION, TOTAL_PAY_AMT,
  COUNT(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesCnt,
  SUM(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesTtl,
  AVG(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesAvg,
  TOTAL_PAY_AMT / SUM(TOTAL_PAY_AMT)
    OVER(PARTITION BY REGION) AS SalesPct
FROM  MASTER_CLAIM
WHERE (REGION IS NOT NULL) AND (DATEPART(YEAR, INV_DT) = @YR) AND (TOTAL_PAY_AMT > 0) )
GROUP BY VENDOR, REGION
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38288766
If that doesn't work, can you send a screenshot of the error?
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38288780
Better yet, maybe it's the variable. Let's substitute the current year for it:


select   VENDOR,  REGION,  sum(TOTAL_PAY_AMT) as total
from (           --Your old code

SELECT DISTINCT VENDOR, REGION, TOTAL_PAY_AMT,
  COUNT(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesCnt,
  SUM(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesTtl,
  AVG(TOTAL_PAY_AMT) OVER(PARTITION BY REGION) AS SalesAvg,
  TOTAL_PAY_AMT / SUM(TOTAL_PAY_AMT)
    OVER(PARTITION BY REGION) AS SalesPct
FROM  MASTER_CLAIM
WHERE (REGION IS NOT NULL) AND (DATEPART(YEAR, INV_DT) = year(getdate())) --###Current year
AND (TOTAL_PAY_AMT > 0) )
GROUP BY VENDOR, REGION
 

Author Comment

by:britpopfan74
ID: 38289298
Argh... it first gives "Incorrect syntax near ')'." so I removed one of the "(", but then it complains: "Column 'MASTER_CLAIM.TOTAL_PAY_AMT' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."

I will keep working on it. Thanks for your input.
 
LVL 15

Expert Comment

by:Ess Kay
ID: 38289348
The parentheses look fine.

What if you strip it down and change DATEPART to YEAR()?

Try this:



select   VENDOR,  REGION,  sum(TOTAL_PAY_AMT) as total_pay
from (        

SELECT DISTINCT VENDOR, REGION, TOTAL_PAY_AMT
FROM  MASTER_CLAIM
WHERE (REGION IS NOT NULL) AND
YEAR(INV_DT) = year(getdate())
AND (TOTAL_PAY_AMT > 0)

 )
GROUP BY VENDOR, REGION
 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 800 total points
ID: 38290301
This is T-SQL, not MS Access: you need to alias all derived tables, as in (no points please):

select   VENDOR,  REGION,  sum(TOTAL_PAY_AMT) as total_pay
from (        

SELECT DISTINCT VENDOR, REGION, TOTAL_PAY_AMT
FROM  MASTER_CLAIM
WHERE (REGION IS NOT NULL) AND
YEAR(INV_DT) = year(getdate())
AND (TOTAL_PAY_AMT > 0)

 ) a
GROUP BY VENDOR, REGION
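
The aliased query above returns only the per-vendor totals. As a rough sketch, the regional window columns from the original question could be layered on top by aggregating per vendor first and then applying the window functions to those totals. The table and column names come from the question; the alias v, the VendorTotal column, and the ORDER BY are assumptions made for illustration. The sketch also drops DISTINCT from the inner query, because DISTINCT would collapse two claims with the same vendor, region, and amount into a single row before summing.

```sql
-- Sketch only: assumes TOTAL_PAY_AMT is a decimal or money column,
-- so the division below is not integer division.
SELECT
  v.VENDOR,
  v.REGION,
  v.VendorTotal,                                                   -- one row per vendor and region
  SUM(v.VendorTotal) OVER (PARTITION BY v.REGION) AS SalesTtl,     -- total for the whole region
  v.VendorTotal
    / SUM(v.VendorTotal) OVER (PARTITION BY v.REGION) AS SalesPct  -- vendor share of the region
FROM (
  -- Aggregate first, so each vendor appears once per region
  SELECT VENDOR, REGION, SUM(TOTAL_PAY_AMT) AS VendorTotal
  FROM MASTER_CLAIM
  WHERE REGION IS NOT NULL
    AND YEAR(INV_DT) = YEAR(GETDATE())
    AND TOTAL_PAY_AMT > 0
  GROUP BY VENDOR, REGION
) AS v                                                             -- derived tables need an alias in T-SQL
ORDER BY v.REGION, v.VendorTotal DESC;
```

Because the windows are computed over the already-grouped rows, Vendor X shows up once per region with its summed amount, and the largest vendors sort to the top of each region.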
 
LVL 15

Accepted Solution

by:
Ess Kay earned 1200 total points
ID: 38290307
Forgot that 'a' :) Good job.
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38290308
Ess Kay wrote: "remove it altogether its sql not C++"

Incidentally, semicolons are part of the ANSI SQL standard, and while SQL Server has not enforced them in the past, more and more statements now require one (CTEs come to mind: the statement before a WITH must be terminated with a semicolon). In fact, if you read Books Online (BOL), most of the T-SQL examples already use them.
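
As a small illustration of the CTE case, here is a sketch that reuses the table from the question. The CTE name VendorTotals, the VendorTotal column, and declaring @YR as INT rather than VARCHAR(4) are assumptions, not part of the original query. The point is only that the SET statement has to end with a semicolon, otherwise SQL Server rejects the WITH that follows it.

```sql
DECLARE @YR INT;
SET @YR = 2012;   -- terminated, so the WITH below is parsed as the start of a CTE

WITH VendorTotals AS (
  -- One row per vendor and region for the chosen year
  SELECT VENDOR, REGION, SUM(TOTAL_PAY_AMT) AS VendorTotal
  FROM MASTER_CLAIM
  WHERE REGION IS NOT NULL
    AND YEAR(INV_DT) = @YR
    AND TOTAL_PAY_AMT > 0
  GROUP BY VENDOR, REGION
)
SELECT VENDOR, REGION, VendorTotal
FROM VendorTotals
ORDER BY REGION, VendorTotal DESC;
```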
 

Author Closing Comment

by:britpopfan74
ID: 38292049
Thanks to you both. I will read up on the syntax as suggested.