?
Solved

Help with an Aggregate and Sum Query

Posted on 2011-10-12
19
Medium Priority
?
224 Views
Last Modified: 2012-05-12
The goal is to display any dictatorid with more than one last name assigned to it along with the total of the billing lines for each of the multiple last names.

This does not quite work:
SELECT   distinct dictatorid, dictatorlastname,reportperiod, cnt,sumx
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 reportperiod,
                 COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
                 SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
            FROM ichart.transactions
           WHERE reportperiod = '201109')          
   WHERE cnt > 1 and reportperiod = '201109'
ORDER BY dictatorid ASC;
 
Gives me:
 
DICTATORID DICTATORLASTNAME                         REPORTPERIOD CNT                    SUMX                  
---------- ---------------------------------------- ------------ ---------------------- ----------------------
1893NOC    Felder, K                                201109       2                      6340                  
1893NOC    Zavatsky, J                              201109       2                      7985                  
 
The actual sum of billing lines for 1893NOC in reportperiod '201109' is 6669.  Not sure what I am doing wrong.  The goal is to display any dictatorid with more than one last name assigned along with the total of the billing lines for each of the multiple last names.
 
Can some one help me tweak this?

Thanks in advance!

0
Comment
Question by:codefinger
  • 10
  • 5
  • 4
19 Comments
 
LVL 61

Expert Comment

by:HainKurt
ID: 36957165
can you give a sample data and result that you expect...

maybe there are other ways to write what you want, just a simpler way...
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957214
if you want to SUM by ID instead of name you need to change the partition clause

 SUM(BILLINGLINES) OVER (PARTITION BY DICTATORID ) sumx

but then you won't get values for each name.

maybe I'm not understanding what you're really looking for
0
 
LVL 61

Expert Comment

by:HainKurt
ID: 36957356
i could not get either :) thats why I asked for a sample data and result he is after...
0
Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

 

Author Comment

by:codefinger
ID: 36957491
Here is the requested sample and expected results requested (formatting doesn't hold together well, but hopefully you can still read  it :


SAMPLE:			
DICTATORID	DICTATORLASTNAME	BILLINGLINES	REPORTPERIOD
----------	----------------------------------------	----------------------	------------
5774BRC	Kindred, J	20	201109
5774BRC	Monaco, B	47	201109
5774BRC	Monaco, B	38	201109
3972BRC	Sledge, E	26	201109
5774BRC	Kindred, J	47	201109
5774BRC	Kindred, J	38	201109
3972BRC	Sledge, E	26	201109


EXPECTED RESULT OF QUERY			
		sumx	cnt
5774BRC	Monaco, B	85	2
5774BRC	Kindred, J	105	3

Open in new window

0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957522
the query above in your question  returns these values
I don't see a problem

the column order is different  and you took out the period but the other content is the same
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957542
oops, nevermind  I had an error  I see the count differs
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957567
I don't understand why Sledge, E   isn't in the expected results though  did you filter on DICTATORID?
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957600
assuming you meant to exclude dictator's that only had one distinct billinglines value then try this


SELECT   DISTINCT dictatorid, dictatorlastname, sumx, cnt
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 reportperiod,
                 COUNT(DISTINCT billinglines) OVER (PARTITION BY dictatorid, dictatorlastname) cnt,
                 SUM(billinglines) OVER (PARTITION BY dictatorlastname) sumx
            FROM transactions
           WHERE reportperiod = '201109')
   WHERE cnt > 1
ORDER BY dictatorid ASC, cnt;
0
 
LVL 61

Expert Comment

by:HainKurt
ID: 36957769
is this what you want?

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1

or is it too simple?

0
 
LVL 61

Expert Comment

by:HainKurt
ID: 36957771
oops, forgot table name :)

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
  from  ichart.transactions
 group by DICTATORID, DICTATORLASTNAME
having count(1) > 1
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957788
that will return Sledge, E
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957796
you could however do something like this...


Select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(distinct billinglines) cnt
  from  ichart.transactions
 group by DICTATORID, DICTATORLASTNAME
having count(distinct billinglines) > 1
0
 

Author Comment

by:codefinger
ID: 36957834
No, that brings back way too many.rows.

I want to exclude dictatorid's that have only a one-to-one match with a dictatorlastname.

The goal is to present dictatorid's that are used with more than one lastname (this is incorrect data that will have to be corrected.), along with the count of how many times the combination is used and a sum of the billing lines for each combination.

If a dictatorid appears at all, it should appear at least twice, accompanied by different last names, with a count of at least 2, with a sum of the billing lines for each of the last names for the over-used dictatorid.



0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957879
>>> No, that brings back way too many.rows.


what is "that"  ?  There are 4 queries above since you lasted posted.


 2 of the queries produce what you asked for given the data.

If they aren't correct, Can you expand your sample a little so we can see the pattern?
0
 
LVL 61

Expert Comment

by:HainKurt
ID: 36957882
according to your data the one I posted (also the one posted by sdstuber - which is almost same) should give the desired result... or come up with a different sample data... the one you posted has all same lastname, or I dont see the difference...
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 36957891
>>> according to your data the one I posted


not quite, it returns extra row for Sledge which, according to the sample, isn't correct

my followup to that one does match the sample, but possibly the sample is full enough to reveal problems in either of the queries I posted.
0
 

Author Comment

by:codefinger
ID: 36958137
sdstuber,

If it helps, this question is related to one I posted a few weeks ago, for which YOU got the points:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_27344186.html

I tried to tweak it to get the sum of billing lines for each name:

SELECT   distinct dictatorid, dictatorlastname,sumx
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
                 SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
            FROM ichart.transactions
           WHERE reportperiod = '201109')
   WHERE cnt > 1
ORDER BY dictatorid ASC;

...and got back something like this:

DICTATORID DICTATORLASTNAME                         SUMX                  
---------- ---------------------------------------- ----------------------
1893NOC    Felder, K                                6340                  
1893NOC    Zavatsky, J                              7985                  
30164NOC   Gunn, S                                  4239                  
30164NOC   Venturini, A                             604                    
3682BRC    Alleman, L                               767                    
3682BRC    Lindsey, S                               21245            

But when I double check the first dictatorid with an independent query:
Select Sum(billinglines) from ichart.transactions where reportperiod='201109' and dictatorid = '1893NOC';
I get 6669, NOT 14325 (6340 + 7985).
so since the simpler query returns different results, I have to assume there is something wrong with the more complicated one...

I hope this gives enough information. If not, I will keep trying.
(Its not always easy to provide enough information on the first or even subsequent tries.
I appreciate your patience.)



     






0
 
LVL 74

Accepted Solution

by:
sdstuber earned 2000 total points
ID: 36958184
yes, I remember the question

you need to partition/group by both ID and NAME


when you do something like this...
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname)
it's reporting the sum of all lines by name, regardless of which id that name might be associated with.

So, if Felder, K  is on ID 1, ID 2 and ID 3,  you're summing all of those.

Is there a problem with the queries in http:#36957600  or http:#36957796   ?

they both return the requested results for the sample data provided.

0
 

Author Comment

by:codefinger
ID: 36958711
sdstuber:

DOH! 15 posts, wasting your time and mine (and HainKurt's), waiting most of the day for that simple solution because I did not ask the question properly in the first place....lesson learned....

In answer to your other question:
The expected output from the actual data is 52 rows (that is what I got when I did not try to sum anything).

Both of the queries you reference returned over 300 rows (even after adding the reportperiod criteria to the one where it was left out).  So I will take the blame for that as well and assume there was something wrong with the sample data I provided as compared to the actual data.

At any rate, despite the length of the journey, I have the solution I came for, and that is what is most important.  Once again, THANK YOU!



0

Featured Post

 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Suggested Courses

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question